ClickHouse/docs/en/query_language/select.md

# SELECT Queries Syntax

`SELECT` performs data retrieval.

``` sql
SELECT [DISTINCT] expr_list
    [FROM [db.]table | (subquery) | table_function] [FINAL]
    [SAMPLE sample_coeff]
    [ARRAY JOIN ...]
    [GLOBAL] [ANY|ALL] [INNER|LEFT|RIGHT|FULL|CROSS] [OUTER] JOIN (subquery)|table USING columns_list
    [PREWHERE expr]
    [WHERE expr]
    [GROUP BY expr_list] [WITH TOTALS]
    [HAVING expr]
    [ORDER BY expr_list]
    [LIMIT [n, ]m]
    [UNION ALL ...]
    [INTO OUTFILE filename]
    [FORMAT format]
    [LIMIT [offset_value, ]n BY columns]
```

All the clauses are optional, except for the required list of expressions immediately after SELECT.
The clauses below are described in almost the same order as in the query execution conveyor.

If the query omits the `DISTINCT`, `GROUP BY` and `ORDER BY` clauses and the `IN` and `JOIN` subqueries, the query will be completely stream processed, using O(1) amount of RAM.
Otherwise, the query might consume a lot of RAM if the appropriate restrictions are not specified: `max_memory_usage`, `max_rows_to_group_by`, `max_rows_to_sort`, `max_rows_in_distinct`, `max_bytes_in_distinct`, `max_rows_in_set`, `max_bytes_in_set`, `max_rows_in_join`, `max_bytes_in_join`, `max_bytes_before_external_sort`, `max_bytes_before_external_group_by`. For more information, see the section "Settings". It is possible to use external sorting (saving temporary tables to a disk) and external aggregation. `The system does not have "merge join"`.

### FROM Clause

If the FROM clause is omitted, data will be read from the `system.one` table.
The 'system.one' table contains exactly one row (this table fulfills the same purpose as the DUAL table found in other DBMSs).

The FROM clause specifies the table to read data from, or a subquery, or a table function; ARRAY JOIN and the regular JOIN may also be included (see below).

Instead of a table, the SELECT subquery may be specified in brackets.
In this case, the subquery processing pipeline will be built into the processing pipeline of an external query.
In contrast to standard SQL, a synonym does not need to be specified after a subquery. For compatibility, it is possible to write 'AS name' after a subquery, but the specified name isn't used anywhere.

A table function may be specified instead of a table. For more information, see the section "Table functions".

To execute a query, all the columns listed in the query are extracted from the appropriate table. Any columns not needed for the external query are thrown out of the subqueries.
If a query does not list any columns (for example, SELECT count() FROM t), some column is extracted from the table anyway (the smallest one is preferred), in order to calculate the number of rows.

The FINAL modifier can be used only for a SELECT from a CollapsingMergeTree table. When you specify FINAL, data is selected fully "collapsed". Keep in mind that using FINAL leads to a selection that includes columns related to the primary key, in addition to the columns specified in the SELECT. Additionally, the query will be executed in a single stream, and data will be merged during query execution. This means that when using FINAL, the query is processed more slowly. In most cases, you should avoid using FINAL. For more information, see the section "CollapsingMergeTree engine".

### SAMPLE Clause {#select-sample-clause}

The `SAMPLE` clause allows for approximated query processing.

When data sampling is enabled, the query is not performed on all the data, but only on a certain fraction of data (sample). For example, if you need to calculate statistics for all the visits, it is enough to execute the query on the 1/10 fraction of all the visits and then multiply the result by 10.

Approximated query processing can be useful in the following cases:

- When you have strict timing requirements (like <100ms) but you can't justify the cost of additional hardware resources to meet them.
- When your raw data is not accurate, so approximation doesn't noticeably degrade the quality.
- Business requirements target approximate results (for cost-effectiveness, or in order to market exact results to premium users).

!!! note
    You can only use sampling with the tables in the [MergeTree](../operations/table_engines/mergetree.md) family, and only if the sampling expression was specified during table creation (see [MergeTree engine](../operations/table_engines/mergetree.md#table_engine-mergetree-creating-a-table)).

The features of data sampling are listed below:

- Data sampling is a deterministic mechanism. The result of the same `SELECT .. SAMPLE` query is always the same.
- Sampling works consistently for different tables. For tables with a single sampling key, a sample with the same coefficient always selects the same subset of possible data. For example, a sample of user IDs takes rows with the same subset of all the possible user IDs from different tables. This means that you can use the sample in subqueries in the [IN](#select-in-operators) clause. Also, you can join samples using the [JOIN](#select-join) clause.
- Sampling allows reading less data from a disk. Note that you must specify the sampling key correctly. For more information, see [Creating a MergeTree Table](../operations/table_engines/mergetree.md#table_engine-mergetree-creating-a-table).

For the `SAMPLE` clause the following syntax is supported:

| SAMPLE&#160;Clause&#160;Syntax | Description |
| ---------------- | --------- |
| `SAMPLE k` | Here `k` is the number from 0 to 1.</br>The query is executed on `k` fraction of data. For example, `SAMPLE 0.1` runs the query on 10% of data. [Read more](#select-sample-k)|
| `SAMPLE n` | Here `n` is a sufficiently large integer.</br>The query is executed on a sample of at least `n` rows (but not significantly more than this). For example, `SAMPLE 10000000` runs the query on a minimum of 10,000,000 rows. [Read more](#select-sample-n) |
| `SAMPLE k OFFSET m` | Here `k` and `m` are the numbers from 0 to 1.</br>The query is executed on a sample of `k` fraction of the data. The data used for the sample is offset by `m` fraction. [Read more](#select-sample-offset) |


#### SAMPLE k {#select-sample-k}

Here `k` is the number from 0 to 1 (both fractional and decimal notations are supported). For example, `SAMPLE 1/2` or `SAMPLE 0.5`.

In a `SAMPLE k` clause, the sample is taken from the `k` fraction of data. The example is shown below:

``` sql
SELECT
    Title,
    count() * 10 AS PageViews
FROM hits_distributed
SAMPLE 0.1
WHERE
    CounterID = 34
GROUP BY Title
ORDER BY PageViews DESC LIMIT 1000
```

In this example, the query is executed on a sample from 0.1 (10%) of data. Values of aggregate functions are not corrected automatically, so to get an approximate result, the value `count()` is manually multiplied by 10.

#### SAMPLE n {#select-sample-n}

Here `n` is a sufficiently large integer. For example, `SAMPLE 10000000`.

In this case, the query is executed on a sample of at least `n` rows (but not significantly more than this). For example, `SAMPLE 10000000` runs the query on a minimum of 10,000,000 rows.

Since the minimum unit for data reading is one granule (its size is set by the `index_granularity` setting), it makes sense to set a sample that is much larger than the size of the granule.

When using the `SAMPLE n` clause, you don't know which relative percent of data was processed. So you don't know the coefficient the aggregate functions should be multiplied by. Use the `_sample_factor` virtual column to get the approximate result.

The `_sample_factor` column contains relative coefficients that are calculated dynamically. This column is created automatically when you [create](../operations/table_engines/mergetree.md#table_engine-mergetree-creating-a-table) a table with the specified sampling key. The usage examples of the `_sample_factor` column are shown below.

Let's consider the table `visits`, which contains the statistics about site visits. The first example shows how to calculate the number of page views:

``` sql
SELECT sum(PageViews * _sample_factor)
FROM visits
SAMPLE 10000000
```   

The next example shows how to calculate the total number of visits:

``` sql
SELECT sum(_sample_factor)
FROM visits
SAMPLE 10000000
```  

The example below shows how to calculate the average session duration. Note that you don't need to use the relative coefficient to calculate the average values.

``` sql
SELECT avg(Duration)
FROM visits
SAMPLE 10000000
```  

#### SAMPLE k OFFSET m {#select-sample-offset}

Here `k` and `m` are numbers from 0 to 1. Examples are shown below.

**Example 1**

``` sql
SAMPLE 1/10
```

In this example, the sample is 1/10th of all data:

`[++------------------]`

**Example 2**

``` sql
SAMPLE 1/10 OFFSET 1/2
```

Here, a sample of 10% is taken from the second half of the data.

`[----------++--------]`

### ARRAY JOIN Clause {#select-array-join-clause}

Allows executing `JOIN` with an array or nested data structure. The intent is similar to the [arrayJoin](functions/array_join.md#functions_arrayjoin) function, but its functionality is broader.

``` sql
SELECT <expr_list>
FROM <left_subquery>
[LEFT] ARRAY JOIN <array>
[WHERE|PREWHERE <expr>]
...
```

You can specify only a single `ARRAY JOIN` clause in a query.

The query execution order is optimized when running `ARRAY JOIN`. Although `ARRAY JOIN` must always be specified before the `WHERE/PREWHERE` clause, it can be performed either before `WHERE/PREWHERE` (if the result is needed in this clause), or after completing it (to reduce the volume of calculations). The processing order is controlled by the query optimizer.

Supported types of `ARRAY JOIN` are listed below:

- `ARRAY JOIN` - In this case, empty arrays are not included in the result of `JOIN`.
- `LEFT ARRAY JOIN` - The result of `JOIN` contains rows with empty arrays. The value for an empty array is set to the default value for the array element type (usually 0, empty string or NULL).

The examples below demonstrate the usage of the `ARRAY JOIN` and `LEFT ARRAY JOIN` clauses. Let's create a table with an [Array](../data_types/array.md) type column and insert values into it:

``` sql
CREATE TABLE arrays_test
(
    s String,
    arr Array(UInt8)
) ENGINE = Memory;

INSERT INTO arrays_test
VALUES ('Hello', [1,2]), ('World', [3,4,5]), ('Goodbye', []);
```
```
┌─s───────────┬─arr─────┐
│ Hello       │ [1,2]   │
│ World       │ [3,4,5] │
│ Goodbye     │ []      │
└─────────────┴─────────┘
```

The example below uses the `ARRAY JOIN` clause:

``` sql
SELECT s, arr
FROM arrays_test
ARRAY JOIN arr;
```
```
┌─s─────┬─arr─┐
│ Hello │   1 │
│ Hello │   2 │
│ World │   3 │
│ World │   4 │
│ World │   5 │
└───────┴─────┘
```

The next example uses the `LEFT ARRAY JOIN` clause:

``` sql
SELECT s, arr
FROM arrays_test
LEFT ARRAY JOIN arr;
```
```
┌─s───────────┬─arr─┐
│ Hello       │   1 │
│ Hello       │   2 │
│ World       │   3 │
│ World       │   4 │
│ World       │   5 │
│ Goodbye     │   0 │
└─────────────┴─────┘
```

#### Using Aliases

An alias can be specified for an array in the `ARRAY JOIN` clause. In this case, an array item can be accessed by this alias, but the array itself is accessed by the original name. Example:

``` sql
SELECT s, arr, a
FROM arrays_test
ARRAY JOIN arr AS a;
```

```
┌─s─────┬─arr─────┬─a─┐
│ Hello │ [1,2]   │ 1 │
│ Hello │ [1,2]   │ 2 │
│ World │ [3,4,5] │ 3 │
│ World │ [3,4,5] │ 4 │
│ World │ [3,4,5] │ 5 │
└───────┴─────────┴───┘
```

Using aliases, you can perform `ARRAY JOIN` with an external array. For example:

``` sql
SELECT s, arr_external
FROM arrays_test
ARRAY JOIN [1, 2, 3] AS arr_external;
```

```
┌─s───────────┬─arr_external─┐
│ Hello       │            1 │
│ Hello       │            2 │
│ Hello       │            3 │
│ World       │            1 │
│ World       │            2 │
│ World       │            3 │
│ Goodbye     │            1 │
│ Goodbye     │            2 │
│ Goodbye     │            3 │
└─────────────┴──────────────┘
```

Multiple arrays can be comma-separated in the `ARRAY JOIN` clause. In this case, `JOIN` is performed with them simultaneously (the direct sum, not the cartesian product). Note that all the arrays must have the same size. Example:

``` sql
SELECT s, arr, a, num, mapped
FROM arrays_test
ARRAY JOIN arr AS a, arrayEnumerate(arr) AS num, arrayMap(x -> x + 1, arr) AS mapped;
```

```
┌─s─────┬─arr─────┬─a─┬─num─┬─mapped─┐
│ Hello │ [1,2]   │ 1 │   1 │      2 │
│ Hello │ [1,2]   │ 2 │   2 │      3 │
│ World │ [3,4,5] │ 3 │   1 │      4 │
│ World │ [3,4,5] │ 4 │   2 │      5 │
│ World │ [3,4,5] │ 5 │   3 │      6 │
└───────┴─────────┴───┴─────┴────────┘
```

The example below uses the [arrayEnumerate](functions/array_functions.md#array_functions-arrayenumerate) function:

``` sql
SELECT s, arr, a, num, arrayEnumerate(arr)
FROM arrays_test
ARRAY JOIN arr AS a, arrayEnumerate(arr) AS num;
```

```
┌─s─────┬─arr─────┬─a─┬─num─┬─arrayEnumerate(arr)─┐
│ Hello │ [1,2]   │ 1 │   1 │ [1,2]               │
│ Hello │ [1,2]   │ 2 │   2 │ [1,2]               │
│ World │ [3,4,5] │ 3 │   1 │ [1,2,3]             │
│ World │ [3,4,5] │ 4 │   2 │ [1,2,3]             │
│ World │ [3,4,5] │ 5 │   3 │ [1,2,3]             │
└───────┴─────────┴───┴─────┴─────────────────────┘
```

#### ARRAY JOIN With Nested Data Structure

`ARRAY `JOIN`` also works with [nested data structures](../data_types/nested_data_structures/nested.md). Example:

``` sql
CREATE TABLE nested_test
(
    s String,
    nest Nested(
    x UInt8,
    y UInt32)
) ENGINE = Memory;

INSERT INTO nested_test
VALUES ('Hello', [1,2], [10,20]), ('World', [3,4,5], [30,40,50]), ('Goodbye', [], []);
```

```
┌─s───────┬─nest.x──┬─nest.y─────┐
│ Hello   │ [1,2]   │ [10,20]    │
│ World   │ [3,4,5] │ [30,40,50] │
│ Goodbye │ []      │ []         │
└─────────┴─────────┴────────────┘
```

``` sql
SELECT s, `nest.x`, `nest.y`
FROM nested_test
ARRAY JOIN nest;
```

```
┌─s─────┬─nest.x─┬─nest.y─┐
│ Hello │      1 │     10 │
│ Hello │      2 │     20 │
│ World │      3 │     30 │
│ World │      4 │     40 │
│ World │      5 │     50 │
└───────┴────────┴────────┘
```

When specifying names of nested data structures in `ARRAY JOIN`, the meaning is the same as `ARRAY JOIN` with all the array elements that it consists of. Examples are listed below:

``` sql
SELECT s, `nest.x`, `nest.y`
FROM nested_test
ARRAY JOIN `nest.x`, `nest.y`;
```

```
┌─s─────┬─nest.x─┬─nest.y─┐
│ Hello │      1 │     10 │
│ Hello │      2 │     20 │
│ World │      3 │     30 │
│ World │      4 │     40 │
│ World │      5 │     50 │
└───────┴────────┴────────┘
```

This variation also makes sense:

``` sql
SELECT s, `nest.x`, `nest.y`
FROM nested_test
ARRAY JOIN `nest.x`;
```

```
┌─s─────┬─nest.x─┬─nest.y─────┐
│ Hello │      1 │ [10,20]    │
│ Hello │      2 │ [10,20]    │
│ World │      3 │ [30,40,50] │
│ World │      4 │ [30,40,50] │
│ World │      5 │ [30,40,50] │
└───────┴────────┴────────────┘
```

An alias may be used for a nested data structure, in order to select either the `JOIN` result or the source array. Example:

``` sql
SELECT s, `n.x`, `n.y`, `nest.x`, `nest.y`
FROM nested_test
ARRAY JOIN nest AS n;
```

```
┌─s─────┬─n.x─┬─n.y─┬─nest.x──┬─nest.y─────┐
│ Hello │   1 │  10 │ [1,2]   │ [10,20]    │
│ Hello │   2 │  20 │ [1,2]   │ [10,20]    │
│ World │   3 │  30 │ [3,4,5] │ [30,40,50] │
│ World │   4 │  40 │ [3,4,5] │ [30,40,50] │
│ World │   5 │  50 │ [3,4,5] │ [30,40,50] │
└───────┴─────┴─────┴─────────┴────────────┘
```

Example of using the [arrayEnumerate](functions/array_functions.md#array_functions-arrayenumerate) function:

``` sql
SELECT s, `n.x`, `n.y`, `nest.x`, `nest.y`, num
FROM nested_test
ARRAY JOIN nest AS n, arrayEnumerate(`nest.x`) AS num;
```

```
┌─s─────┬─n.x─┬─n.y─┬─nest.x──┬─nest.y─────┬─num─┐
│ Hello │   1 │  10 │ [1,2]   │ [10,20]    │   1 │
│ Hello │   2 │  20 │ [1,2]   │ [10,20]    │   2 │
│ World │   3 │  30 │ [3,4,5] │ [30,40,50] │   1 │
│ World │   4 │  40 │ [3,4,5] │ [30,40,50] │   2 │
│ World │   5 │  50 │ [3,4,5] │ [30,40,50] │   3 │
└───────┴─────┴─────┴─────────┴────────────┴─────┘
```

### JOIN Clause {#select-join}

Joins the data in the normal [SQL JOIN](https://en.wikipedia.org/wiki/Join_(SQL)) sense.

!!! info "Note"
    Not related to [ARRAY JOIN](#select-array-join-clause).


``` sql
SELECT <expr_list>
FROM <left_subquery>
[GLOBAL] [ANY|ALL] [INNER|LEFT|RIGHT|FULL|CROSS] [OUTER] JOIN <right_subquery>
(ON <expr_list>)|(USING <column_list>) ...
```

The table names can be specified instead of `<left_subquery>` and `<right_subquery>`. This is equivalent to the `SELECT * FROM table` subquery, except in a special case when the table has the [Join](../operations/table_engines/join.md) engine – an array prepared for joining.

#### Supported Types of `JOIN`

- `INNER JOIN` (or `JOIN`)
- `LEFT JOIN` (or `LEFT OUTER JOIN`)
- `RIGHT JOIN` (or `RIGHT OUTER JOIN`)
- `FULL JOIN` (or `FULL OUTER JOIN`)
- `CROSS JOIN` (or `,` )

See the standard [SQL JOIN](https://en.wikipedia.org/wiki/Join_(SQL)) description.

#### Multiple JOIN

Performing queries, ClickHouse rewrites multiple joins into the combination of two-table joins and processes them sequentially. If there are four tables for join ClickHouse joins the first and the second, then joins the result with the third table, and at the last step, it joins the fourth one.

If a query contains `WHERE` clause, ClickHouse tries to push down filters from this clause into the intermediate join. If it cannot apply the filter to each intermediate join, ClickHouse applies the filters after all joins are completed.

We recommend the `JOIN ON` or `JOIN USING` syntax for creating a query. For example:

```
SELECT * FROM t1 JOIN t2 ON t1.a = t2.a JOIN t3 ON t1.a = t3.a
```

Also, you can use comma separated list of tables for join. Works only with the [allow_experimental_cross_to_join_conversion = 1](../operations/settings/settings.md#settings-allow_experimental_cross_to_join_conversion) setting.

    For example, `SELECT * FROM t1, t2, t3 WHERE t1.a = t2.a AND t1.a = t3.a`

Don't mix these syntaxes.

ClickHouse doesn't support the syntax with commas directly, so we don't recommend to use it. The algorithm tries to rewrite the query in terms of `CROSS` and `INNER` `JOIN` clauses and then proceeds the query processing. When rewriting the query, ClickHouse tries to optimize performance and memory consumption. By default, ClickHouse treats comma as an `INNER JOIN` clause and converts it to `CROSS JOIN` when the algorithm cannot guaranty that `INNER JOIN` returns required data.

#### ANY or ALL Strictness

If `ALL` is specified and the right table has several matching rows, the data will be multiplied by the number of these rows. This is the normal `JOIN` behavior for standard SQL.
If `ANY` is specified and the right table has several matching rows, only the first one found is joined. If the right table has only one matching row, the results of `ANY` and `ALL` are the same.

To set the default strictness value, use the session configuration parameter [join_default_strictness](../operations/settings/settings.md#settings-join_default_strictness).

#### GLOBAL JOIN

When using a normal `JOIN`, the query is sent to remote servers. Subqueries are run on each of them in order to make the right table, and the join is performed with this table. In other words, the right table is formed on each server separately.

When using `GLOBAL ... JOIN`, first the requestor server runs a subquery to calculate the right table. This temporary table is passed to each remote server, and queries are run on them using the temporary data that was transmitted.

Be careful when using `GLOBAL`. For more information, see the section [Distributed subqueries](#select-distributed-subqueries).

#### Usage Recommendations

All columns that are not needed for the `JOIN` are deleted from the subquery.

When running a `JOIN`, there is no optimization of the order of execution in relation to other stages of the query. The join (a search in the right table) is run before filtering in `WHERE` and before aggregation. In order to explicitly set the processing order, we recommend running a `JOIN` subquery with a subquery.

Example:

``` sql
SELECT
    CounterID,
    hits,
    visits
FROM
(
    SELECT
        CounterID,
        count() AS hits
    FROM test.hits
    GROUP BY CounterID
) ANY LEFT JOIN
(
    SELECT
        CounterID,
        sum(Sign) AS visits
    FROM test.visits
    GROUP BY CounterID
) USING CounterID
ORDER BY hits DESC
LIMIT 10
```

```
┌─CounterID─┬───hits─┬─visits─┐
│   1143050 │ 523264 │  13665 │
│    731962 │ 475698 │ 102716 │
│    722545 │ 337212 │ 108187 │
│    722889 │ 252197 │  10547 │
│   2237260 │ 196036 │   9522 │
│  23057320 │ 147211 │   7689 │
│    722818 │  90109 │  17847 │
│     48221 │  85379 │   4652 │
│  19762435 │  77807 │   7026 │
│    722884 │  77492 │  11056 │
└───────────┴────────┴────────┘
```

Subqueries don't allow you to set names or use them for referencing a column from a specific subquery.
The columns specified in `USING` must have the same names in both subqueries, and the other columns must be named differently. You can use aliases to change the names of columns in subqueries (the example uses the aliases 'hits' and 'visits').

The `USING` clause specifies one or more columns to join, which establishes the equality of these columns. The list of columns is set without brackets. More complex join conditions are not supported.

The right table (the subquery result) resides in RAM. If there isn't enough memory, you can't run a `JOIN`.

Each time a query is run with the same `JOIN`, the subquery is run again because the result is not cached. To avoid this, use the special [Join](../operations/table_engines/join.md) table engine, which is a prepared array for joining that is always in RAM.

In some cases, it is more efficient to use `IN` instead of `JOIN`.
Among the various types of `JOIN`, the most efficient is `ANY LEFT JOIN`, then `ANY INNER JOIN`. The least efficient are `ALL LEFT JOIN` and `ALL INNER JOIN`.

If you need a `JOIN` for joining with dimension tables (these are relatively small tables that contain dimension properties, such as names for advertising campaigns), a `JOIN` might not be very convenient due to the bulky syntax and the fact that the right table is re-accessed for every query. For such cases, there is an "external dictionaries" feature that you should use instead of `JOIN`. For more information, see the section [External dictionaries](dicts/external_dicts.md).

#### Processing of Empty or NULL Cells

While joining tables, the empty cells may appear. The setting [join_use_nulls](../operations/settings/settings.md#settings-join_use_nulls) define how ClickHouse fills these cells.

If the `JOIN` keys are [Nullable](../data_types/nullable.md) fields, the rows where at least one of the keys has the value [NULL](syntax.md#null-literal) are not joined.


### WHERE Clause

If there is a WHERE clause, it must contain an expression with the UInt8 type. This is usually an expression with comparison and logical operators.
This expression will be used for filtering data before all other transformations.

If indexes are supported by the database table engine, the expression is evaluated on the ability to use indexes.


### PREWHERE Clause

This clause has the same meaning as the WHERE clause. The difference is in which data is read from the table.
When using PREWHERE, first only the columns necessary for executing PREWHERE are read. Then the other columns are read that are needed for running the query, but only those blocks where the PREWHERE expression is true.

It makes sense to use PREWHERE if there are filtration conditions that are used by a minority of the columns in the query, but that provide strong data filtration. This reduces the volume of data to read.

For example, it is useful to write PREWHERE for queries that extract a large number of columns, but that only have filtration for a few columns.

PREWHERE is only supported by tables from the `*MergeTree` family.

A query may simultaneously specify PREWHERE and WHERE. In this case, PREWHERE precedes WHERE.

If the 'optimize_move_to_prewhere' setting is set to 1 and PREWHERE is omitted, the system uses heuristics to automatically move parts of expressions from WHERE to PREWHERE.

### GROUP BY Clause {#select-group-by-clause}

This is one of the most important parts of a column-oriented DBMS.

If there is a GROUP BY clause, it must contain a list of expressions. Each expression will be referred to here as a "key".
All the expressions in the SELECT, HAVING, and ORDER BY clauses must be calculated from keys or from aggregate functions. In other words, each column selected from the table must be used either in keys or inside aggregate functions.

If a query contains only table columns inside aggregate functions, the GROUP BY clause can be omitted, and aggregation by an empty set of keys is assumed.

Example:

``` sql
SELECT
    count(),
    median(FetchTiming > 60 ? 60 : FetchTiming),
    count() - sum(Refresh)
FROM hits
```

However, in contrast to standard SQL, if the table doesn't have any rows (either there aren't any at all, or there aren't any after using WHERE to filter), an empty result is returned, and not the result from one of the rows containing the initial values of aggregate functions.

As opposed to MySQL (and conforming to standard SQL), you can't get some value of some column that is not in a key or aggregate function (except constant expressions). To work around this, you can use the 'any' aggregate function (get the first encountered value) or 'min/max'.

Example:

``` sql
SELECT
    domainWithoutWWW(URL) AS domain,
    count(),
    any(Title) AS title -- getting the first occurred page header for each domain.
FROM hits
GROUP BY domain
```

For every different key value encountered, GROUP BY calculates a set of aggregate function values.

GROUP BY is not supported for array columns.

A constant can't be specified as arguments for aggregate functions. Example: sum(1). Instead of this, you can get rid of the constant. Example: `count()`.

#### NULL processing

For grouping, ClickHouse interprets [NULL](syntax.md) as a value, and `NULL=NULL`.

Here's an example to show what this means.

Assume you have this table:

```
┌─x─┬────y─┐
│ 1 │    2 │
│ 2 │ ᴺᵁᴸᴸ │
│ 3 │    2 │
│ 3 │    3 │
│ 3 │ ᴺᵁᴸᴸ │
└───┴──────┘
```

The query `SELECT sum(x), y FROM t_null_big GROUP BY y` results in:

```
┌─sum(x)─┬────y─┐
│      4 │    2 │
│      3 │    3 │
│      5 │ ᴺᵁᴸᴸ │
└────────┴──────┘
```

You can see that `GROUP BY` for `У = NULL` summed up `x`, as if `NULL` is this value.

If you pass several keys to `GROUP BY`, the result will give you all the combinations of the selection, as if `NULL` were a specific value.

#### WITH TOTALS Modifier

If the WITH TOTALS modifier is specified, another row will be calculated. This row will have key columns containing default values (zeros or empty lines), and columns of aggregate functions with the values calculated across all the rows (the "total" values).

This extra row is output in JSON\*, TabSeparated\*, and Pretty\* formats, separately from the other rows. In the other formats, this row is not output.

In JSON\* formats, this row is output as a separate 'totals' field. In TabSeparated\* formats, the row comes after the main result, preceded by an empty row (after the other data). In Pretty\* formats, the row is output as a separate table after the main result.

`WITH TOTALS` can be run in different ways when HAVING is present. The behavior depends on the 'totals_mode' setting.
By default, `totals_mode = 'before_having'`. In this case, 'totals' is calculated across all rows, including the ones that don't pass through HAVING and 'max_rows_to_group_by'.

The other alternatives include only the rows that pass through HAVING in 'totals', and behave differently with the setting `max_rows_to_group_by` and `group_by_overflow_mode = 'any'`.

`after_having_exclusive` – Don't include rows that didn't pass through `max_rows_to_group_by`. In other words, 'totals' will have less than or the same number of rows as it would if `max_rows_to_group_by` were omitted.

`after_having_inclusive` – Include all the rows that didn't pass through 'max_rows_to_group_by' in 'totals'. In other words, 'totals' will have more than or the same number of rows as it would if `max_rows_to_group_by` were omitted.

`after_having_auto` – Count the number of rows that passed through HAVING. If it is more than a certain amount (by default, 50%), include all the rows that didn't pass through 'max_rows_to_group_by' in 'totals'. Otherwise, do not include them.

`totals_auto_threshold` – By default, 0.5. The coefficient for `after_having_auto`.

If `max_rows_to_group_by` and `group_by_overflow_mode = 'any'` are not used, all variations of `after_having` are the same, and you can use any of them (for example, `after_having_auto`).

You can use WITH TOTALS in subqueries, including subqueries in the JOIN clause (in this case, the respective total values are combined).

#### GROUP BY in External Memory {#select-group-by-in-external-memory}

You can enable dumping temporary data to the disk to restrict memory usage during GROUP BY.
The `max_bytes_before_external_group_by` setting determines the threshold RAM consumption for dumping GROUP BY temporary data to the file system. If set to 0 (the default), it is disabled.

When using `max_bytes_before_external_group_by`, we recommend that you set max_memory_usage about twice as high. This is necessary because there are two stages to aggregation: reading the date and forming intermediate data (1) and merging the intermediate data (2). Dumping data to the file system can only occur during stage 1. If the temporary data wasn't dumped, then stage 2 might require up to the same amount of memory as in stage 1.

For example, if `max_memory_usage` was set to 10000000000 and you want to use external aggregation, it makes sense to set `max_bytes_before_external_group_by` to 10000000000, and max_memory_usage to 20000000000. When external aggregation is triggered (if there was at least one dump of temporary data), maximum consumption of RAM is only slightly more than ` max_bytes_before_external_group_by`.

With distributed query processing, external aggregation is performed on remote servers. In order for the requestor server to use only a small amount of RAM, set ` distributed_aggregation_memory_efficient` to 1.

When merging data flushed to the disk, as well as when merging results from remote servers when the ` distributed_aggregation_memory_efficient` setting is enabled, consumes up to 1/256 \* the number of threads from the total amount of RAM.

When external aggregation is enabled, if there was less than ` max_bytes_before_external_group_by` of data (i.e. data was not flushed), the query runs just as fast as without external aggregation. If any temporary data was flushed, the run time will be several times longer (approximately three times).

If you have an ORDER BY with a small LIMIT after GROUP BY, then the ORDER BY CLAUSE will not use significant amounts of RAM.
But if the ORDER BY doesn't have LIMIT, don't forget to enable external sorting (`max_bytes_before_external_sort`).

### LIMIT BY Clause

The query with the `LIMIT n BY expressions` clause selects the first `n` rows for each distinct value of `expressions`. The key for `LIMIT BY` can contain any number of [expressions](syntax.md#syntax-expressions).

ClickHouse supports the following syntax:

- `LIMIT [offset_value, ]n BY expressions`
- `LIMIT n OFFSET offset_value BY expressions`

During the query processing, ClickHouse selects data ordered by sorting key. Sorting key is set explicitly by [ORDER BY](#select-order-by) clause or implicitly as a property of table engine. Then ClickHouse applies `LIMIT n BY expressions` and returns the first `n` rows for each distinct combination of `expressions`. If `OFFSET` is specified, then for each data block, belonging to a distinct combination of `expressions`, ClickHouse skips `offset_value` rows from the beginning of the block, and returns not more than `n` rows as a result. If `offset_value` is bigger than the number of rows in the data block, then ClickHouse returns no rows from the block.

`LIMIT BY` is not related to `LIMIT`, they can both be used in the same query.

**Examples**

Sample table:

```sql
CREATE TABLE limit_by(id Int, val Int) ENGINE = Memory;
INSERT INTO limit_by values(1, 10), (1, 11), (1, 12), (2, 20), (2, 21);
```

Queries:

```sql
SELECT * FROM limit_by ORDER BY id, val LIMIT 2 BY id
```
```text
┌─id─┬─val─┐
│  1 │  10 │
│  1 │  11 │
│  2 │  20 │
│  2 │  21 │
└────┴─────┘
```
```sql
SELECT * FROM limit_by ORDER BY id, val LIMIT 1, 2 BY id
```
```text
┌─id─┬─val─┐
│  1 │  11 │
│  1 │  12 │
│  2 │  21 │
└────┴─────┘
```

The `SELECT * FROM limit_by ORDER BY id, val LIMIT 2 OFFSET 1 BY id` query returns the same result.

The following query returns the top 5 referrers for each `domain, device_type` pair, but not more than 100 rows (`LIMIT n BY + LIMIT`).

``` sql
SELECT
    domainWithoutWWW(URL) AS domain,
    domainWithoutWWW(REFERRER_URL) AS referrer,
    device_type,
    count() cnt
FROM hits
GROUP BY domain, referrer, device_type
ORDER BY cnt DESC
LIMIT 5 BY domain, device_type
LIMIT 100
```

### HAVING Clause

Allows filtering the result received after GROUP BY, similar to the WHERE clause.
WHERE and HAVING differ in that WHERE is performed before aggregation (GROUP BY), while HAVING is performed after it.
If aggregation is not performed, HAVING can't be used.


### ORDER BY Clause {#select-order-by}

The ORDER BY clause contains a list of expressions, which can each be assigned DESC or ASC (the sorting direction). If the direction is not specified, ASC is assumed. ASC is sorted in ascending order, and DESC in descending order. The sorting direction applies to a single expression, not to the entire list. Example: `ORDER BY Visits DESC, SearchPhrase`

For sorting by String values, you can specify collation (comparison). Example: `ORDER BY SearchPhrase COLLATE 'tr'` - for sorting by keyword in ascending order, using the Turkish alphabet, case insensitive, assuming that strings are UTF-8 encoded. COLLATE can be specified or not for each expression in ORDER BY independently. If ASC or DESC is specified, COLLATE is specified after it. When using COLLATE, sorting is always case-insensitive.

We only recommend using COLLATE for final sorting of a small number of rows, since sorting with COLLATE is less efficient than normal sorting by bytes.

Rows that have identical values for the list of sorting expressions are output in an arbitrary order, which can also be nondeterministic (different each time).
If the ORDER BY clause is omitted, the order of the rows is also undefined, and may be nondeterministic as well.

`NaN` and `NULL` sorting order:

- With the modifier `NULLS FIRST` — First `NULL`, then `NaN`, then other values.
- With the modifier `NULLS LAST` — First the values, then `NaN`, then `NULL`.
- Default — The same as with the `NULLS LAST` modifier.

Example:

For the table

```
┌─x─┬────y─┐
│ 1 │ ᴺᵁᴸᴸ │
│ 2 │    2 │
│ 1 │  nan │
│ 2 │    2 │
│ 3 │    4 │
│ 5 │    6 │
│ 6 │  nan │
│ 7 │ ᴺᵁᴸᴸ │
│ 6 │    7 │
│ 8 │    9 │
└───┴──────┘
```

Run the query `SELECT * FROM t_null_nan ORDER BY y NULLS FIRST` to get:

```
┌─x─┬────y─┐
│ 1 │ ᴺᵁᴸᴸ │
│ 7 │ ᴺᵁᴸᴸ │
│ 1 │  nan │
│ 6 │  nan │
│ 2 │    2 │
│ 2 │    2 │
│ 3 │    4 │
│ 5 │    6 │
│ 6 │    7 │
│ 8 │    9 │
└───┴──────┘
```

When floating point numbers are sorted, NaNs are separate from the other values. Regardless of the sorting order, NaNs come at the end. In other words, for ascending sorting they are placed as if they are larger than all the other numbers, while for descending sorting they are placed as if they are smaller than the rest.

Less RAM is used if a small enough LIMIT is specified in addition to ORDER BY. Otherwise, the amount of memory spent is proportional to the volume of data for sorting. For distributed query processing, if GROUP BY is omitted, sorting is partially done on remote servers, and the results are merged on the requestor server. This means that for distributed sorting, the volume of data to sort can be greater than the amount of memory on a single server.

If there is not enough RAM, it is possible to perform sorting in external memory (creating temporary files on a disk). Use the setting `max_bytes_before_external_sort` for this purpose. If it is set to 0 (the default), external sorting is disabled. If it is enabled, when the volume of data to sort reaches the specified number of bytes, the collected data is sorted and dumped into a temporary file. After all data is read, all the sorted files are merged and the results are output. Files are written to the /var/lib/clickhouse/tmp/ directory in the config (by default, but you can use the 'tmp_path' parameter to change this setting).

Running a query may use more memory than 'max_bytes_before_external_sort'. For this reason, this setting must have a value significantly smaller than 'max_memory_usage'. As an example, if your server has 128 GB of RAM and you need to run a single query, set 'max_memory_usage' to 100 GB, and 'max_bytes_before_external_sort' to 80 GB.

External sorting works much less effectively than sorting in RAM.

### SELECT Clause

The expressions specified in the SELECT clause are analyzed after the calculations for all the clauses listed above are completed.
More specifically, expressions are analyzed that are above the aggregate functions, if there are any aggregate functions.
The aggregate functions and everything below them are calculated during aggregation (GROUP BY).
These expressions work as if they are applied to separate rows in the result.

### DISTINCT Clause {#select-distinct}

If DISTINCT is specified, only a single row will remain out of all the sets of fully matching rows in the result.
The result will be the same as if GROUP BY were specified across all the fields specified in SELECT without aggregate functions. But there are several differences from GROUP BY:

- DISTINCT can be applied together with GROUP BY.
- When ORDER BY is omitted and LIMIT is defined, the query stops running immediately after the required number of different rows has been read.
- Data blocks are output as they are processed, without waiting for the entire query to finish running.

DISTINCT is not supported if SELECT has at least one array column.

`DISTINCT` works with [NULL](syntax.md) as if `NULL` were a specific value, and `NULL=NULL`. In other words, in the `DISTINCT` results, different combinations with `NULL` only occur once.

### LIMIT Clause

`LIMIT m` allows you to select the first `m` rows from the result.

`LIMIT n, m` allows you to select the first `m` rows from the result after skipping the first `n` rows. The `LIMIT m OFFSET n` syntax is also supported.

`n` and `m` must be non-negative integers.

If there isn't an `ORDER BY` clause that explicitly sorts results, the result may be arbitrary and nondeterministic.

### UNION ALL Clause

You can use UNION ALL to combine any number of queries. Example:

``` sql
SELECT CounterID, 1 AS table, toInt64(count()) AS c
    FROM test.hits
    GROUP BY CounterID

UNION ALL

SELECT CounterID, 2 AS table, sum(Sign) AS c
    FROM test.visits
    GROUP BY CounterID
    HAVING c > 0
```

Only UNION ALL is supported. The regular UNION (UNION DISTINCT) is not supported. If you need UNION DISTINCT, you can write SELECT DISTINCT from a subquery containing UNION ALL.

Queries that are parts of UNION ALL can be run simultaneously, and their results can be mixed together.

The structure of results (the number and type of columns) must match for the queries. But the column names can differ. In this case, the column names for the final result will be taken from the first query. Type casting is performed for unions. For example, if two queries being combined have the same field with non-`Nullable` and `Nullable` types from a compatible type, the resulting `UNION ALL` has a `Nullable` type field.

Queries that are parts of UNION ALL can't be enclosed in brackets. ORDER BY and LIMIT are applied to separate queries, not to the final result. If you need to apply a conversion to the final result, you can put all the queries with UNION ALL in a subquery in the FROM clause.

### INTO OUTFILE Clause

Add the `INTO OUTFILE filename` clause (where filename is a string literal) to redirect query output to the specified file.
In contrast to MySQL, the file is created on the client side. The query will fail if a file with the same filename already exists.
This functionality is available in the command-line client and clickhouse-local (a query sent via HTTP interface will fail).

The default output format is TabSeparated (the same as in the command-line client batch mode).

### FORMAT Clause

Specify 'FORMAT format' to get data in any specified format.
You can use this for convenience, or for creating dumps.
For more information, see the section "Formats".
If the FORMAT clause is omitted, the default format is used, which depends on both the settings and the interface used for accessing the DB. For the HTTP interface and the command-line client in batch mode, the default format is TabSeparated. For the command-line client in interactive mode, the default format is PrettyCompact (it has attractive and compact tables).

When using the command-line client, data is passed to the client in an internal efficient format. The client independently interprets the FORMAT clause of the query and formats the data itself (thus relieving the network and the server from the load).


### IN Operators {#select-in-operators}

The `IN`, `NOT IN`, `GLOBAL IN`, and `GLOBAL NOT IN` operators are covered separately, since their functionality is quite rich.

The left side of the operator is either a single column or a tuple.

Examples:

``` sql
SELECT UserID IN (123, 456) FROM ...
SELECT (CounterID, UserID) IN ((34, 123), (101500, 456)) FROM ...
```

If the left side is a single column that is in the index, and the right side is a set of constants, the system uses the index for processing the query.

Don't list too many values explicitly (i.e. millions). If a data set is large, put it in a temporary table (for example, see the section "External data for query processing"), then use a subquery.

The right side of the operator can be a set of constant expressions, a set of tuples with constant expressions (shown in the examples above), or the name of a database table or SELECT subquery in brackets.

If the right side of the operator is the name of a table (for example, `UserID IN users`), this is equivalent to the subquery `UserID IN (SELECT * FROM users)`. Use this when working with external data that is sent along with the query. For example, the query can be sent together with a set of user IDs loaded to the 'users' temporary table, which should be filtered.

If the right side of the operator is a table name that has the Set engine (a prepared data set that is always in RAM), the data set will not be created over again for each query.

The subquery may specify more than one column for filtering tuples.
Example:

``` sql
SELECT (CounterID, UserID) IN (SELECT CounterID, UserID FROM ...) FROM ...
```

The columns to the left and right of the IN operator should have the same type.

The IN operator and subquery may occur in any part of the query, including in aggregate functions and lambda functions.
Example:

``` sql
SELECT
    EventDate,
    avg(UserID IN
    (
        SELECT UserID
        FROM test.hits
        WHERE EventDate = toDate('2014-03-17')
    )) AS ratio
FROM test.hits
GROUP BY EventDate
ORDER BY EventDate ASC
```

```
┌──EventDate─┬────ratio─┐
│ 2014-03-17 │        1 │
│ 2014-03-18 │ 0.807696 │
│ 2014-03-19 │ 0.755406 │
│ 2014-03-20 │ 0.723218 │
│ 2014-03-21 │ 0.697021 │
│ 2014-03-22 │ 0.647851 │
│ 2014-03-23 │ 0.648416 │
└────────────┴──────────┘
```

For each day after March 17th, count the percentage of pageviews made by users who visited the site on March 17th.
A subquery in the IN clause is always run just one time on a single server. There are no dependent subqueries.

#### NULL processing

During request processing, the IN operator assumes that the result of an operation with [NULL](syntax.md) is always equal to `0`, regardless of whether `NULL` is on the right or left side of the operator. `NULL` values are not included in any dataset, do not correspond to each other and cannot be compared.

Here is an example with the `t_null` table:

```
┌─x─┬────y─┐
│ 1 │ ᴺᵁᴸᴸ │
│ 2 │    3 │
└───┴──────┘
```

Running the query `SELECT x FROM t_null WHERE y IN (NULL,3)` gives you the following result:

```
┌─x─┐
│ 2 │
└───┘
```

You can see that the row in which `y = NULL` is thrown out of the query results. This is because ClickHouse can't decide whether `NULL` is included in the `(NULL,3)` set, returns `0` as the result of the operation, and `SELECT` excludes this row from the final output.

```
SELECT y IN (NULL, 3)
FROM t_null

┌─in(y, tuple(NULL, 3))─┐
│                     0 │
│                     1 │
└───────────────────────┘
```


#### Distributed Subqueries {#select-distributed-subqueries}

There are two options for IN-s with subqueries (similar to JOINs): normal `IN` / `JOIN` and `GLOBAL IN` / `GLOBAL JOIN`. They differ in how they are run for distributed query processing.

!!! attention
    Remember that the algorithms described below may work differently depending on the [settings](../operations/settings/settings.md) `distributed_product_mode` setting.

When using the regular IN, the query is sent to remote servers, and each of them runs the subqueries in the `IN` or `JOIN` clause.

When using `GLOBAL IN` / `GLOBAL JOINs`, first all the subqueries are run for `GLOBAL IN` / `GLOBAL JOINs`, and the results are collected in temporary tables. Then the temporary tables are sent to each remote server, where the queries are run using this temporary data.

For a non-distributed query, use the regular `IN` / `JOIN`.

Be careful when using subqueries in the `IN` / `JOIN` clauses for distributed query processing.

Let's look at some examples. Assume that each server in the cluster has a normal **local_table**. Each server also has a **distributed_table** table with the **Distributed** type, which looks at all the servers in the cluster.

For a query to the **distributed_table**, the query will be sent to all the remote servers and run on them using the **local_table**.

For example, the query

``` sql
SELECT uniq(UserID) FROM distributed_table
```

will be sent to all remote servers as

``` sql
SELECT uniq(UserID) FROM local_table
```

and run on each of them in parallel, until it reaches the stage where intermediate results can be combined. Then the intermediate results will be returned to the requestor server and merged on it, and the final result will be sent to the client.

Now let's examine a query with IN:

``` sql
SELECT uniq(UserID) FROM distributed_table WHERE CounterID = 101500 AND UserID IN (SELECT UserID FROM local_table WHERE CounterID = 34)
```

- Calculation of the intersection of audiences of two sites.

This query will be sent to all remote servers as

``` sql
SELECT uniq(UserID) FROM local_table WHERE CounterID = 101500 AND UserID IN (SELECT UserID FROM local_table WHERE CounterID = 34)
```

In other words, the data set in the IN clause will be collected on each server independently, only across the data that is stored locally on each of the servers.

This will work correctly and optimally if you are prepared for this case and have spread data across the cluster servers such that the data for a single UserID resides entirely on a single server. In this case, all the necessary data will be available locally on each server. Otherwise, the result will be inaccurate. We refer to this variation of the query as "local IN".

To correct how the query works when data is spread randomly across the cluster servers, you could specify **distributed_table** inside a subquery. The query would look like this:

``` sql
SELECT uniq(UserID) FROM distributed_table WHERE CounterID = 101500 AND UserID IN (SELECT UserID FROM distributed_table WHERE CounterID = 34)
```

This query will be sent to all remote servers as

``` sql
SELECT uniq(UserID) FROM local_table WHERE CounterID = 101500 AND UserID IN (SELECT UserID FROM distributed_table WHERE CounterID = 34)
```

The subquery will begin running on each remote server. Since the subquery uses a distributed table, the subquery that is on each remote server will be resent to every remote server as

``` sql
SELECT UserID FROM local_table WHERE CounterID = 34
```

For example, if you have a cluster of 100 servers, executing the entire query will require 10,000 elementary requests, which is generally considered unacceptable.

In such cases, you should always use GLOBAL IN instead of IN. Let's look at how it works for the query

``` sql
SELECT uniq(UserID) FROM distributed_table WHERE CounterID = 101500 AND UserID GLOBAL IN (SELECT UserID FROM distributed_table WHERE CounterID = 34)
```

The requestor server will run the subquery

``` sql
SELECT UserID FROM distributed_table WHERE CounterID = 34
```

and the result will be put in a temporary table in RAM. Then the request will be sent to each remote server as

``` sql
SELECT uniq(UserID) FROM local_table WHERE CounterID = 101500 AND UserID GLOBAL IN _data1
```

and the temporary table `_data1` will be sent to every remote server with the query (the name of the temporary table is implementation-defined).

This is more optimal than using the normal IN. However, keep the following points in mind:

1. When creating a temporary table, data is not made unique. To reduce the volume of data transmitted over the network, specify DISTINCT in the subquery. (You don't need to do this for a normal IN.)
2. The temporary table will be sent to all the remote servers. Transmission does not account for network topology. For example, if 10 remote servers reside in a datacenter that is very remote in relation to the requestor server, the data will be sent 10 times over the channel to the remote datacenter. Try to avoid large data sets when using GLOBAL IN.
3. When transmitting data to remote servers, restrictions on network bandwidth are not configurable. You might overload the network.
4. Try to distribute data across servers so that you don't need to use GLOBAL IN on a regular basis.
5. If you need to use GLOBAL IN often, plan the location of the ClickHouse cluster so that a single group of replicas resides in no more than one data center with a fast network between them, so that a query can be processed entirely within a single data center.

It also makes sense to specify a local table in the `GLOBAL IN` clause, in case this local table is only available on the requestor server and you want to use data from it on remote servers.

### Extreme Values

In addition to results, you can also get minimum and maximum values for the results columns. To do this, set the **extremes** setting to 1. Minimums and maximums are calculated for numeric types, dates, and dates with times. For other columns, the default values are output.

An extra two rows are calculated – the minimums and maximums, respectively. These extra two rows are output in JSON\*, TabSeparated\*, and Pretty\* formats, separate from the other rows. They are not output for other formats.

In JSON\* formats, the extreme values are output in a separate 'extremes' field. In TabSeparated\* formats, the row comes after the main result, and after 'totals' if present. It is preceded by an empty row (after the other data). In Pretty\* formats, the row is output as a separate table after the main result, and after 'totals' if present.

Extreme values are calculated for rows that have passed through LIMIT. However, when using 'LIMIT offset, size', the rows before 'offset' are included in 'extremes'. In stream requests, the result may also include a small number of rows that passed through LIMIT.

### Notes

The `GROUP BY` and `ORDER BY` clauses do not support positional arguments. This contradicts MySQL, but conforms to standard SQL.
For example, `GROUP BY 1, 2` will be interpreted as grouping by constants (i.e. aggregation of all rows into one).

You can use synonyms (`AS` aliases) in any part of a query.

You can put an asterisk in any part of a query instead of an expression. When the query is analyzed, the asterisk is expanded to a list of all table columns (excluding the `MATERIALIZED` and `ALIAS` columns). There are only a few cases when using an asterisk is justified:

- When creating a table dump.
- For tables containing just a few columns, such as system tables.
- For getting information about what columns are in a table. In this case, set `LIMIT 1`. But it is better to use the `DESC TABLE` query.
- When there is strong filtration on a small number of columns using `PREWHERE`.
- In subqueries (since columns that aren't needed for the external query are excluded from subqueries).

In all other cases, we don't recommend using the asterisk, since it only gives you the drawbacks of a columnar DBMS instead of the advantages. In other words using the asterisk is not recommended.

[Original article](https://clickhouse.yandex/docs/en/query_language/select/) <!--hide-->
-												Capitalize most words in titles of docs/en/

											
										
										
											2018-08-07 16:49:40 +00:00
+								# SELECT Queries Syntax
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								`SELECT` performs data retrieval.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT [DISTINCT] expr_list
 								    [FROM [db.]table | (subquery) | table_function] [FINAL]
 								    [SAMPLE sample_coeff]
 								    [ARRAY JOIN ...]
-												DOCAPI-3820: EN review and RU translation for SELECT JOIN (#4422)


											
										
										
											2019-02-19 16:28:51 +00:00
+								    [GLOBAL] [ANY|ALL] [INNER|LEFT|RIGHT|FULL|CROSS] [OUTER] JOIN (subquery)|table USING columns_list
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								    [PREWHERE expr]
 								    [WHERE expr]
 								    [GROUP BY expr_list] [WITH TOTALS]
 								    [HAVING expr]
 								    [ORDER BY expr_list]
 								    [LIMIT [n, ]m]
 								    [UNION ALL ...]
 								    [INTO OUTFILE filename]
 								    [FORMAT format]
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
+								    [LIMIT [offset_value, ]n BY columns]
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								```
 								All the clauses are optional, except for the required list of expressions immediately after SELECT.
 								The clauses below are described in almost the same order as in the query execution conveyor.
 								If the query omits the `DISTINCT`, `GROUP BY` and `ORDER BY` clauses and the `IN` and `JOIN` subqueries, the query will be completely stream processed, using O(1) amount of RAM.
 								Otherwise, the query might consume a lot of RAM if the appropriate restrictions are not specified: `max_memory_usage`, `max_rows_to_group_by`, `max_rows_to_sort`, `max_rows_in_distinct`, `max_bytes_in_distinct`, `max_rows_in_set`, `max_bytes_in_set`, `max_rows_in_join`, `max_bytes_in_join`, `max_bytes_before_external_sort`, `max_bytes_before_external_group_by`. For more information, see the section "Settings". It is possible to use external sorting (saving temporary tables to a disk) and external aggregation. `The system does not have "merge join"`.
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								### FROM Clause
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
 								If the FROM clause is omitted, data will be read from the `system.one` table.
 								The 'system.one' table contains exactly one row (this table fulfills the same purpose as the DUAL table found in other DBMSs).
 								The FROM clause specifies the table to read data from, or a subquery, or a table function; ARRAY JOIN and the regular JOIN may also be included (see below).
 								Instead of a table, the SELECT subquery may be specified in brackets.
 								In this case, the subquery processing pipeline will be built into the processing pipeline of an external query.
 								In contrast to standard SQL, a synonym does not need to be specified after a subquery. For compatibility, it is possible to write 'AS name' after a subquery, but the specified name isn't used anywhere.
 								A table function may be specified instead of a table. For more information, see the section "Table functions".
 								To execute a query, all the columns listed in the query are extracted from the appropriate table. Any columns not needed for the external query are thrown out of the subqueries.
 								If a query does not list any columns (for example, SELECT count() FROM t), some column is extracted from the table anyway (the smallest one is preferred), in order to calculate the number of rows.
 								The FINAL modifier can be used only for a SELECT from a CollapsingMergeTree table. When you specify FINAL, data is selected fully "collapsed". Keep in mind that using FINAL leads to a selection that includes columns related to the primary key, in addition to the columns specified in the SELECT. Additionally, the query will be executed in a single stream, and data will be merged during query execution. This means that when using FINAL, the query is processed more slowly. In most cases, you should avoid using FINAL. For more information, see the section "CollapsingMergeTree engine".
-												Doc fixes: remove all anchors <a> (#3897)

* Doc fixes: rm anchors <a>

* Doc fixes: rm anchors <a>

* Doc fixes: fix links

* Doc fixes: fix the links

											
										
										
											2018-12-21 19:23:55 +00:00
+								### SAMPLE Clause {#select-sample-clause}
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
+								The `SAMPLE` clause allows for approximated query processing.
-												Doc fix: Edit info about SAMPLE (en, ru) (#4907)


											
										
										
											2019-04-05 18:46:51 +00:00
 								When data sampling is enabled, the query is not performed on all the data, but only on a certain fraction of data (sample). For example, if you need to calculate statistics for all the visits, it is enough to execute the query on the 1/10 fraction of all the visits and then multiply the result by 10.
 								Approximated query processing can be useful in the following cases:
 								- When you have strict timing requirements (like <100ms) but you can't justify the cost of additional hardware resources to meet them.
 								- When your raw data is not accurate, so approximation doesn't noticeably degrade the quality.
 								- Business requirements target approximate results (for cost-effectiveness, or in order to market exact results to premium users).
 								!!! note
 								    You can only use sampling with the tables in the [MergeTree](../operations/table_engines/mergetree.md) family, and only if the sampling expression was specified during table creation (see [MergeTree engine](../operations/table_engines/mergetree.md#table_engine-mergetree-creating-a-table)).
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Fix SAMPLE desc

											
										
										
											2019-02-19 13:29:49 +00:00
+								The features of data sampling are listed below:
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: edit some sentences

											
										
										
											2019-02-25 12:54:30 +00:00
+								- Data sampling is a deterministic mechanism. The result of the same `SELECT .. SAMPLE` query is always the same.
-												Doc fix: Edit info about SAMPLE (en, ru) (#4907)


											
										
										
											2019-04-05 18:46:51 +00:00
+								- Sampling works consistently for different tables. For tables with a single sampling key, a sample with the same coefficient always selects the same subset of possible data. For example, a sample of user IDs takes rows with the same subset of all the possible user IDs from different tables. This means that you can use the sample in subqueries in the [IN](#select-in-operators) clause. Also, you can join samples using the [JOIN](#select-join) clause.
-												Doc fix: Update info about system.columns + system.tables (en) (#4838)


											
										
										
											2019-03-29 14:44:12 +00:00
+								- Sampling allows reading less data from a disk. Note that you must specify the sampling key correctly. For more information, see [Creating a MergeTree Table](../operations/table_engines/mergetree.md#table_engine-mergetree-creating-a-table).
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: Update info about system.columns + system.tables (en) (#4838)


											
										
										
											2019-03-29 14:44:12 +00:00
+								For the `SAMPLE` clause the following syntax is supported:
-												Fix SAMPLE desc

											
										
										
											2019-02-19 13:29:49 +00:00
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
+								| SAMPLE&#160;Clause&#160;Syntax | Description |
 								| ---------------- | --------- |
-												Doc fix: Edit info about SAMPLE (en, ru) (#4907)


											
										
										
											2019-04-05 18:46:51 +00:00
+								| `SAMPLE k` | Here `k` is the number from 0 to 1.</br>The query is executed on `k` fraction of data. For example, `SAMPLE 0.1` runs the query on 10% of data. [Read more](#select-sample-k)|
 								| `SAMPLE n` | Here `n` is a sufficiently large integer.</br>The query is executed on a sample of at least `n` rows (but not significantly more than this). For example, `SAMPLE 10000000` runs the query on a minimum of 10,000,000 rows. [Read more](#select-sample-n) |
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
+								| `SAMPLE k OFFSET m` | Here `k` and `m` are the numbers from 0 to 1.</br>The query is executed on a sample of `k` fraction of the data. The data used for the sample is offset by `m` fraction. [Read more](#select-sample-offset) |
-												Doc fix: Edit info about SAMPLE (en, ru) (#4907)


											
										
										
											2019-04-05 18:46:51 +00:00
-												Fix SAMPLE desc

											
										
										
											2019-02-19 13:29:49 +00:00
-												fixes in process

											
										
										
											2019-02-22 13:28:26 +00:00
+								#### SAMPLE k {#select-sample-k}
-												Fix SAMPLE desc

											
										
										
											2019-02-19 13:29:49 +00:00
-												Doc fix: Edit info about SAMPLE (en, ru) (#4907)


											
										
										
											2019-04-05 18:46:51 +00:00
+								Here `k` is the number from 0 to 1 (both fractional and decimal notations are supported). For example, `SAMPLE 1/2` or `SAMPLE 0.5`.
-												Doc fix: Update info about system.columns + system.tables (en) (#4838)


											
										
										
											2019-03-29 14:44:12 +00:00
+								In a `SAMPLE k` clause, the sample is taken from the `k` fraction of data. The example is shown below:
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT
 								    Title,
 								    count() * 10 AS PageViews
 								FROM hits_distributed
 								SAMPLE 0.1
 								WHERE
 								    CounterID = 34
 								GROUP BY Title
 								ORDER BY PageViews DESC LIMIT 1000
 								```
-												Doc fix: Edit info about SAMPLE (en, ru) (#4907)


											
										
										
											2019-04-05 18:46:51 +00:00
+								In this example, the query is executed on a sample from 0.1 (10%) of data. Values of aggregate functions are not corrected automatically, so to get an approximate result, the value `count()` is manually multiplied by 10.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												fixes in process

											
										
										
											2019-02-22 13:28:26 +00:00
+								#### SAMPLE n {#select-sample-n}
-												Fix SAMPLE desc

											
										
										
											2019-02-19 13:29:49 +00:00
-												Doc fix: Edit info about SAMPLE (en, ru) (#4907)


											
										
										
											2019-04-05 18:46:51 +00:00
+								Here `n` is a sufficiently large integer. For example, `SAMPLE 10000000`.
 								In this case, the query is executed on a sample of at least `n` rows (but not significantly more than this). For example, `SAMPLE 10000000` runs the query on a minimum of 10,000,000 rows.
-												Fix SAMPLE desc

											
										
										
											2019-02-19 13:29:49 +00:00
-												fixes in process

											
										
										
											2019-02-22 13:28:26 +00:00
+								Since the minimum unit for data reading is one granule (its size is set by the `index_granularity` setting), it makes sense to set a sample that is much larger than the size of the granule.
-												Fix SAMPLE desc

											
										
										
											2019-02-19 13:29:49 +00:00
-												Doc fix: Edit info about SAMPLE (en, ru) (#4907)


											
										
										
											2019-04-05 18:46:51 +00:00
+								When using the `SAMPLE n` clause, you don't know which relative percent of data was processed. So you don't know the coefficient the aggregate functions should be multiplied by. Use the `_sample_factor` virtual column to get the approximate result.
 								The `_sample_factor` column contains relative coefficients that are calculated dynamically. This column is created automatically when you [create](../operations/table_engines/mergetree.md#table_engine-mergetree-creating-a-table) a table with the specified sampling key. The usage examples of the `_sample_factor` column are shown below.
-												fixes in process

											
										
										
											2019-02-22 13:28:26 +00:00
-												Doc fix: Edit info about SAMPLE (en, ru) (#4907)


											
										
										
											2019-04-05 18:46:51 +00:00
+								Let's consider the table `visits`, which contains the statistics about site visits. The first example shows how to calculate the number of page views:
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: change logic of SAMPLE n

											
										
										
											2019-02-19 07:02:32 +00:00
+								``` sql
-												Doc fix: Edit info about SAMPLE (en, ru) (#4907)


											
										
										
											2019-04-05 18:46:51 +00:00
+								SELECT sum(PageViews * _sample_factor)
-												Doc fix: change logic of SAMPLE n

											
										
										
											2019-02-19 07:02:32 +00:00
+								FROM visits
 								SAMPLE 10000000
 								```
-												Doc fix: Edit info about SAMPLE (en, ru) (#4907)


											
										
										
											2019-04-05 18:46:51 +00:00
+								The next example shows how to calculate the total number of visits:
-												Doc fix: change logic of SAMPLE n

											
										
										
											2019-02-19 07:02:32 +00:00
 								``` sql
-												fixes

											
										
										
											2019-02-21 13:43:09 +00:00
+								SELECT sum(_sample_factor)
-												Doc fix: change logic of SAMPLE n

											
										
										
											2019-02-19 07:02:32 +00:00
+								FROM visits
 								SAMPLE 10000000
 								```
-												Doc fix: Edit info about SAMPLE (en, ru) (#4907)


											
										
										
											2019-04-05 18:46:51 +00:00
+								The example below shows how to calculate the average session duration. Note that you don't need to use the relative coefficient to calculate the average values.
-												Doc fix: change logic of SAMPLE n

											
										
										
											2019-02-19 07:02:32 +00:00
 								``` sql
 								SELECT avg(Duration)
 								FROM visits
 								SAMPLE 10000000
 								```
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
-												fixes in process

											
										
										
											2019-02-22 13:28:26 +00:00
+								#### SAMPLE k OFFSET m {#select-sample-offset}
-												Adding SAMPLE .. OFFSET to doc (#4289)

* Add description of the SAMPLE OFFSET clauseH[D

* Add description of the SAMPLE OFFSET clause

* fix sample offset doc

* review fix

											
										
										
											2019-02-07 21:11:53 +00:00
-												Doc fix: Edit info about SAMPLE (en, ru) (#4907)


											
										
										
											2019-04-05 18:46:51 +00:00
+								Here `k` and `m` are numbers from 0 to 1. Examples are shown below.
-												Adding SAMPLE .. OFFSET to doc (#4289)

* Add description of the SAMPLE OFFSET clauseH[D

* Add description of the SAMPLE OFFSET clause

* fix sample offset doc

* review fix

											
										
										
											2019-02-07 21:11:53 +00:00
-												Doc fix: Edit info about SAMPLE (en, ru) (#4907)


											
										
										
											2019-04-05 18:46:51 +00:00
+								**Example 1**
-												Adding SAMPLE .. OFFSET to doc (#4289)

* Add description of the SAMPLE OFFSET clauseH[D

* Add description of the SAMPLE OFFSET clause

* fix sample offset doc

* review fix

											
										
										
											2019-02-07 21:11:53 +00:00
 								``` sql
 								SAMPLE 1/10
 								```
-												Doc fix: Update info about system.columns + system.tables (en) (#4838)


											
										
										
											2019-03-29 14:44:12 +00:00
+								In this example, the sample is 1/10th of all data:
-												Adding SAMPLE .. OFFSET to doc (#4289)

* Add description of the SAMPLE OFFSET clauseH[D

* Add description of the SAMPLE OFFSET clause

* fix sample offset doc

* review fix

											
										
										
											2019-02-07 21:11:53 +00:00
 								`[++------------------]`
-												Doc fix: Edit info about SAMPLE (en, ru) (#4907)


											
										
										
											2019-04-05 18:46:51 +00:00
+								**Example 2**
-												Adding SAMPLE .. OFFSET to doc (#4289)

* Add description of the SAMPLE OFFSET clauseH[D

* Add description of the SAMPLE OFFSET clause

* fix sample offset doc

* review fix

											
										
										
											2019-02-07 21:11:53 +00:00
 								``` sql
 								SAMPLE 1/10 OFFSET 1/2
 								```
-												Doc fix: Edit info about SAMPLE (en, ru) (#4907)


											
										
										
											2019-04-05 18:46:51 +00:00
+								Here, a sample of 10% is taken from the second half of the data.
-												Adding SAMPLE .. OFFSET to doc (#4289)

* Add description of the SAMPLE OFFSET clauseH[D

* Add description of the SAMPLE OFFSET clause

* fix sample offset doc

* review fix

											
										
										
											2019-02-07 21:11:53 +00:00
 								`[----------++--------]`
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
-												Doc fixes: remove all anchors <a> (#3897)

* Doc fixes: rm anchors <a>

* Doc fixes: rm anchors <a>

* Doc fixes: fix links

* Doc fixes: fix the links

											
										
										
											2018-12-21 19:23:55 +00:00
+								### ARRAY JOIN Clause {#select-array-join-clause}
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: Actualize `ARRAY JOIN` (#5065)


											
										
										
											2019-04-22 08:34:25 +00:00
+								Allows executing `JOIN` with an array or nested data structure. The intent is similar to the [arrayJoin](functions/array_join.md#functions_arrayjoin) function, but its functionality is broader.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								``` sql
 								SELECT <expr_list>
 								FROM <left_subquery>
 								[LEFT] ARRAY JOIN <array>
 								[WHERE|PREWHERE <expr>]
 								...
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								You can specify only a single `ARRAY JOIN` clause in a query.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: Actualize `ARRAY JOIN` (#5065)


											
										
										
											2019-04-22 08:34:25 +00:00
+								The query execution order is optimized when running `ARRAY JOIN`. Although `ARRAY JOIN` must always be specified before the `WHERE/PREWHERE` clause, it can be performed either before `WHERE/PREWHERE` (if the result is needed in this clause), or after completing it (to reduce the volume of calculations). The processing order is controlled by the query optimizer.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								Supported types of `ARRAY JOIN` are listed below:
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: Actualize `ARRAY JOIN` (#5065)


											
										
										
											2019-04-22 08:34:25 +00:00
+								- `ARRAY JOIN` - In this case, empty arrays are not included in the result of `JOIN`.
 								- `LEFT ARRAY JOIN` - The result of `JOIN` contains rows with empty arrays. The value for an empty array is set to the default value for the array element type (usually 0, empty string or NULL).
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: Actualize `ARRAY JOIN` (#5065)


											
										
										
											2019-04-22 08:34:25 +00:00
+								The examples below demonstrate the usage of the `ARRAY JOIN` and `LEFT ARRAY JOIN` clauses. Let's create a table with an [Array](../data_types/array.md) type column and insert values into it:
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								``` sql
 								CREATE TABLE arrays_test
 								(
 								    s String,
 								    arr Array(UInt8)
 								) ENGINE = Memory;
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								INSERT INTO arrays_test
 								VALUES ('Hello', [1,2]), ('World', [3,4,5]), ('Goodbye', []);
 								```
 								```
 								┌─s───────────┬─arr─────┐
 								│ Hello       │ [1,2]   │
 								│ World       │ [3,4,5] │
 								│ Goodbye     │ []      │
 								└─────────────┴─────────┘
 								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: Actualize `ARRAY JOIN` (#5065)


											
										
										
											2019-04-22 08:34:25 +00:00
+								The example below uses the `ARRAY JOIN` clause:
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT s, arr
 								FROM arrays_test
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								ARRAY JOIN arr;
 								```
 								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								┌─s─────┬─arr─┐
 								│ Hello │   1 │
 								│ Hello │   2 │
 								│ World │   3 │
 								│ World │   4 │
 								│ World │   5 │
 								└───────┴─────┘
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: Actualize `ARRAY JOIN` (#5065)


											
										
										
											2019-04-22 08:34:25 +00:00
+								The next example uses the `LEFT ARRAY JOIN` clause:
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
 								``` sql
 								SELECT s, arr
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
+								FROM arrays_test
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								LEFT ARRAY JOIN arr;
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								```
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
+								```
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								┌─s───────────┬─arr─┐
 								│ Hello       │   1 │
 								│ Hello       │   2 │
 								│ World       │   3 │
 								│ World       │   4 │
 								│ World       │   5 │
 								│ Goodbye     │   0 │
 								└─────────────┴─────┘
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
+								```
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
-												Doc fix: Actualize `ARRAY JOIN` (#5065)


											
										
										
											2019-04-22 08:34:25 +00:00
+								#### Using Aliases
 								An alias can be specified for an array in the `ARRAY JOIN` clause. In this case, an array item can be accessed by this alias, but the array itself is accessed by the original name. Example:
 								``` sql
 								SELECT s, arr, a
 								FROM arrays_test
 								ARRAY JOIN arr AS a;
 								```
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
+								```
-												Doc fix: Actualize `ARRAY JOIN` (#5065)


											
										
										
											2019-04-22 08:34:25 +00:00
+								┌─s─────┬─arr─────┬─a─┐
 								│ Hello │ [1,2]   │ 1 │
 								│ Hello │ [1,2]   │ 2 │
 								│ World │ [3,4,5] │ 3 │
 								│ World │ [3,4,5] │ 4 │
 								│ World │ [3,4,5] │ 5 │
 								└───────┴─────────┴───┘
 								```
 								Using aliases, you can perform `ARRAY JOIN` with an external array. For example:
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								``` sql
 								SELECT s, arr_external
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
+								FROM arrays_test
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								ARRAY JOIN [1, 2, 3] AS arr_external;
 								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								```
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								┌─s───────────┬─arr_external─┐
 								│ Hello       │            1 │
 								│ Hello       │            2 │
 								│ Hello       │            3 │
 								│ World       │            1 │
 								│ World       │            2 │
 								│ World       │            3 │
 								│ Goodbye     │            1 │
 								│ Goodbye     │            2 │
 								│ Goodbye     │            3 │
 								└─────────────┴──────────────┘
 								```
-												Doc fix: Actualize `ARRAY JOIN` (#5065)


											
										
										
											2019-04-22 08:34:25 +00:00
+								Multiple arrays can be comma-separated in the `ARRAY JOIN` clause. In this case, `JOIN` is performed with them simultaneously (the direct sum, not the cartesian product). Note that all the arrays must have the same size. Example:
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT s, arr, a, num, mapped
 								FROM arrays_test
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								ARRAY JOIN arr AS a, arrayEnumerate(arr) AS num, arrayMap(x -> x + 1, arr) AS mapped;
 								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								┌─s─────┬─arr─────┬─a─┬─num─┬─mapped─┐
 								│ Hello │ [1,2]   │ 1 │   1 │      2 │
 								│ Hello │ [1,2]   │ 2 │   2 │      3 │
 								│ World │ [3,4,5] │ 3 │   1 │      4 │
 								│ World │ [3,4,5] │ 4 │   2 │      5 │
 								│ World │ [3,4,5] │ 5 │   3 │      6 │
 								└───────┴─────────┴───┴─────┴────────┘
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: Actualize `ARRAY JOIN` (#5065)


											
										
										
											2019-04-22 08:34:25 +00:00
+								The example below uses the [arrayEnumerate](functions/array_functions.md#array_functions-arrayenumerate) function:
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT s, arr, a, num, arrayEnumerate(arr)
 								FROM arrays_test
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								ARRAY JOIN arr AS a, arrayEnumerate(arr) AS num;
 								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								┌─s─────┬─arr─────┬─a─┬─num─┬─arrayEnumerate(arr)─┐
 								│ Hello │ [1,2]   │ 1 │   1 │ [1,2]               │
 								│ Hello │ [1,2]   │ 2 │   2 │ [1,2]               │
 								│ World │ [3,4,5] │ 3 │   1 │ [1,2,3]             │
 								│ World │ [3,4,5] │ 4 │   2 │ [1,2,3]             │
 								│ World │ [3,4,5] │ 5 │   3 │ [1,2,3]             │
 								└───────┴─────────┴───┴─────┴─────────────────────┘
 								```
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								#### ARRAY JOIN With Nested Data Structure
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: Actualize `ARRAY JOIN` (#5065)


											
										
										
											2019-04-22 08:34:25 +00:00
+								`ARRAY `JOIN`` also works with [nested data structures](../data_types/nested_data_structures/nested.md). Example:
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								CREATE TABLE nested_test
 								(
 								    s String,
 								    nest Nested(
 								    x UInt8,
 								    y UInt32)
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								) ENGINE = Memory;
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								INSERT INTO nested_test
 								VALUES ('Hello', [1,2], [10,20]), ('World', [3,4,5], [30,40,50]), ('Goodbye', [], []);
 								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
+								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								┌─s───────┬─nest.x──┬─nest.y─────┐
 								│ Hello   │ [1,2]   │ [10,20]    │
 								│ World   │ [3,4,5] │ [30,40,50] │
 								│ Goodbye │ []      │ []         │
 								└─────────┴─────────┴────────────┘
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT s, `nest.x`, `nest.y`
 								FROM nested_test
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								ARRAY JOIN nest;
 								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
+								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								┌─s─────┬─nest.x─┬─nest.y─┐
 								│ Hello │      1 │     10 │
 								│ Hello │      2 │     20 │
 								│ World │      3 │     30 │
 								│ World │      4 │     40 │
 								│ World │      5 │     50 │
 								└───────┴────────┴────────┘
 								```
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								When specifying names of nested data structures in `ARRAY JOIN`, the meaning is the same as `ARRAY JOIN` with all the array elements that it consists of. Examples are listed below:
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT s, `nest.x`, `nest.y`
 								FROM nested_test
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								ARRAY JOIN `nest.x`, `nest.y`;
 								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
+								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								┌─s─────┬─nest.x─┬─nest.y─┐
 								│ Hello │      1 │     10 │
 								│ Hello │      2 │     20 │
 								│ World │      3 │     30 │
 								│ World │      4 │     40 │
 								│ World │      5 │     50 │
 								└───────┴────────┴────────┘
 								```
 								This variation also makes sense:
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT s, `nest.x`, `nest.y`
 								FROM nested_test
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								ARRAY JOIN `nest.x`;
 								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
+								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								┌─s─────┬─nest.x─┬─nest.y─────┐
 								│ Hello │      1 │ [10,20]    │
 								│ Hello │      2 │ [10,20]    │
 								│ World │      3 │ [30,40,50] │
 								│ World │      4 │ [30,40,50] │
 								│ World │      5 │ [30,40,50] │
 								└───────┴────────┴────────────┘
 								```
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								An alias may be used for a nested data structure, in order to select either the `JOIN` result or the source array. Example:
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT s, `n.x`, `n.y`, `nest.x`, `nest.y`
 								FROM nested_test
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								ARRAY JOIN nest AS n;
 								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
+								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								┌─s─────┬─n.x─┬─n.y─┬─nest.x──┬─nest.y─────┐
 								│ Hello │   1 │  10 │ [1,2]   │ [10,20]    │
 								│ Hello │   2 │  20 │ [1,2]   │ [10,20]    │
 								│ World │   3 │  30 │ [3,4,5] │ [30,40,50] │
 								│ World │   4 │  40 │ [3,4,5] │ [30,40,50] │
 								│ World │   5 │  50 │ [3,4,5] │ [30,40,50] │
 								└───────┴─────┴─────┴─────────┴────────────┘
 								```
-												Doc fix: Actualize `ARRAY JOIN` (#5065)


											
										
										
											2019-04-22 08:34:25 +00:00
+								Example of using the [arrayEnumerate](functions/array_functions.md#array_functions-arrayenumerate) function:
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT s, `n.x`, `n.y`, `nest.x`, `nest.y`, num
 								FROM nested_test
-												Doc fix: edit `LEFT ARRAY JOIN` (#4734)


											
										
										
											2019-03-21 19:40:48 +00:00
+								ARRAY JOIN nest AS n, arrayEnumerate(`nest.x`) AS num;
 								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
+								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								┌─s─────┬─n.x─┬─n.y─┬─nest.x──┬─nest.y─────┬─num─┐
 								│ Hello │   1 │  10 │ [1,2]   │ [10,20]    │   1 │
 								│ Hello │   2 │  20 │ [1,2]   │ [10,20]    │   2 │
 								│ World │   3 │  30 │ [3,4,5] │ [30,40,50] │   1 │
 								│ World │   4 │  40 │ [3,4,5] │ [30,40,50] │   2 │
 								│ World │   5 │  50 │ [3,4,5] │ [30,40,50] │   3 │
 								└───────┴─────┴─────┴─────────┴────────────┴─────┘
 								```
-												Docapi 4994 registry (#4214)


											
										
										
											2019-02-04 13:30:28 +00:00
+								### JOIN Clause {#select-join}
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-3820: EN review and RU translation for SELECT JOIN (#4422)


											
										
										
											2019-02-19 16:28:51 +00:00
+								Joins the data in the normal [SQL JOIN](https://en.wikipedia.org/wiki/Join_(SQL)) sense.
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
 								!!! info "Note"
-												Doc fixes: remove all anchors <a> (#3897)

* Doc fixes: rm anchors <a>

* Doc fixes: rm anchors <a>

* Doc fixes: fix links

* Doc fixes: fix the links

											
										
										
											2018-12-21 19:23:55 +00:00
+								    Not related to [ARRAY JOIN](#select-array-join-clause).
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								SELECT <expr_list>
 								FROM <left_subquery>
-												DOCAPI-3820: EN review and RU translation for SELECT JOIN (#4422)


											
										
										
											2019-02-19 16:28:51 +00:00
+								[GLOBAL] [ANY|ALL] [INNER|LEFT|RIGHT|FULL|CROSS] [OUTER] JOIN <right_subquery>
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								(ON <expr_list>)|(USING <column_list>) ...
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								```
-												WIP on docs (#3813)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

* Update some links on front page

* Remove some outdated comment

* Add twitter link to front page

* More front page links tuning

* Add Amsterdam meetup link

* Smaller font to avoid second line

* Add Amsterdam link to README.md

* Proper docs nav translation

* Back to 300 font-weight except Chinese

* fix docs build

* Update Amsterdam link

* remove symlinks

* more zh punctuation

* apply lost comment by @zhang2014

* Apply comments by @zhang2014 from #3417

* Remove Beijing link

* rm incorrect symlink

* restore content of docs/zh/operations/table_engines/index.md

* CLICKHOUSE-3751: stem terms while searching docs

* CLICKHOUSE-3751: use English stemmer in non-English docs too

* CLICKHOUSE-4135 fix

* Remove past meetup link

* Add blog link to top nav

* Add ContentSquare article link

* Add form link to front page + refactor some texts

* couple markup fixes

* minor

* Introduce basic ODBC driver page in docs

* More verbose 3rd party libs disclaimer

* Put third-party stuff into a separate folder

* Separate third-party stuff in ToC too

* Update links

* Move stuff that is not really (only) a client library into a separate page

* Add clickhouse-hdfs-loader link

* Some introduction for "interfaces" section

* Rewrite tcp.md

* http_interface.md -> http.md

* fix link

* Remove unconvenient error for now

* try to guess anchor instead of failing

* remove symlink

* Remove outdated info from introduction

* remove ru roadmap.md

* replace ru roadmap.md with symlink

* Update roadmap.md

* lost file

* Title case in toc_en.yml

* Sync "Functions" ToC section with en

* Remove reference to pretty old ClickHouse release from docs

* couple lost symlinks in fa

* Close quote in proper place

* Rewrite en/getting_started/index.md

* Sync en<>ru getting_started/index.md

* minor changes

* Some gui.md refactoring

* Translate DataGrip section to ru

* Translate DataGrip section to zh

* Translate DataGrip section to fa

* Translate DBeaver section to fa

* Translate DBeaver section to zh

* Split third-party GUI to open-source and commercial

* Mention some RDBMS integrations + ad-hoc translation fixes

* Add rel="external nofollow" to outgoing links from docs

* Lost blank lines

* Fix class name

* More rel="external nofollow"

* Apply suggestions by @sundy-li

* Mobile version of front page improvements

* test

* test 2

* test 3

* Update LICENSE

* minor docs fix

* Highlight current article as suggested by @sundy-li

* fix link destination

* Introduce backup.md (only "en" for now)

* Mention INSERT+SELECT in backup.md

* Some improvements for replication.md

* Add backup.md to toc

* Mention clickhouse-backup tool

* Mention LightHouse in third-party GUI list

* Introduce interfaces/third-party/proxy.md

* Add clickhouse-bulk to proxy.md

* Major extension of integrations.md contents

* fix link target

* remove unneeded file

* better toc item name

* fix markdown

* better ru punctuation

* Add yet another possible backup approach

* Simplify copying permalinks to headers

* Support non-eng link anchors in docs + update some deps

* Generate anchors for single-page mode automatically

* Remove anchors to top of pages

* Remove anchors that nobody links to

* build fixes

* fix few links

* restore css

* fix some links

* restore gifs

* fix lost words

* more docs fixes

* docs fixes

* NULL anchor

* update urllib3 dependency

* more fixes

											
										
										
											2018-12-12 17:28:00 +00:00
+								The table names can be specified instead of `<left_subquery>` and `<right_subquery>`. This is equivalent to the `SELECT * FROM table` subquery, except in a special case when the table has the [Join](../operations/table_engines/join.md) engine – an array prepared for joining.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-6553: JOIN syntax. JOIN optimization setting description.

											
										
										
											2019-05-21 08:05:56 +00:00
+								#### Supported Types of `JOIN`
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-3820: EN review and RU translation for SELECT JOIN (#4422)


											
										
										
											2019-02-19 16:28:51 +00:00
+								- `INNER JOIN` (or `JOIN`)
 								- `LEFT JOIN` (or `LEFT OUTER JOIN`)
 								- `RIGHT JOIN` (or `RIGHT OUTER JOIN`)
 								- `FULL JOIN` (or `FULL OUTER JOIN`)
 								- `CROSS JOIN` (or `,` )
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: Actualize `ARRAY JOIN` (#5065)


											
										
										
											2019-04-22 08:34:25 +00:00
+								See the standard [SQL JOIN](https://en.wikipedia.org/wiki/Join_(SQL)) description.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-6553: JOIN syntax. JOIN optimization setting description.

											
										
										
											2019-05-21 08:05:56 +00:00
+								#### Multiple JOIN
-												DOCAPI-6553: Some clarifications.

											
										
										
											2019-05-21 12:17:30 +00:00
+								Performing queries, ClickHouse rewrites multiple joins into the combination of two-table joins and processes them sequentially. If there are four tables for join ClickHouse joins the first and the second, then joins the result with the third table, and at the last step, it joins the fourth one.
-												DOCAPI-6553: Deleted wrong info.

											
										
										
											2019-05-22 13:50:04 +00:00
+								If a query contains `WHERE` clause, ClickHouse tries to push down filters from this clause into the intermediate join. If it cannot apply the filter to each intermediate join, ClickHouse applies the filters after all joins are completed.
-												DOCAPI-6553: Some clarifications.

											
										
										
											2019-05-21 12:17:30 +00:00
-												DOCAPI-6553: Tried to avoid terms short and clear syntax.

											
										
										
											2019-05-22 13:48:28 +00:00
+								We recommend the `JOIN ON` or `JOIN USING` syntax for creating a query. For example:
-												DOCAPI-6553: JOIN syntax. JOIN optimization setting description.

											
										
										
											2019-05-21 08:05:56 +00:00
-												DOCAPI-6553: Tried to avoid terms short and clear syntax.

											
										
										
											2019-05-22 13:48:28 +00:00
+								```
 								SELECT * FROM t1 JOIN t2 ON t1.a = t2.a JOIN t3 ON t1.a = t3.a
 								```
-												DOCAPI-6553: JOIN syntax. JOIN optimization setting description.

											
										
										
											2019-05-21 08:05:56 +00:00
-												DOCAPI-6553: Tried to avoid terms short and clear syntax.

											
										
										
											2019-05-22 13:48:28 +00:00
+								Also, you can use comma separated list of tables for join. Works only with the [allow_experimental_cross_to_join_conversion = 1](../operations/settings/settings.md#settings-allow_experimental_cross_to_join_conversion) setting.
-												DOCAPI-6553: JOIN syntax. JOIN optimization setting description.

											
										
										
											2019-05-21 08:05:56 +00:00
 								    For example, `SELECT * FROM t1, t2, t3 WHERE t1.a = t2.a AND t1.a = t3.a`
 								Don't mix these syntaxes.
-												DOCAPI-6553: Tried to avoid terms short and clear syntax.

											
										
										
											2019-05-22 13:48:28 +00:00
+								ClickHouse doesn't support the syntax with commas directly, so we don't recommend to use it. The algorithm tries to rewrite the query in terms of `CROSS` and `INNER` `JOIN` clauses and then proceeds the query processing. When rewriting the query, ClickHouse tries to optimize performance and memory consumption. By default, ClickHouse treats comma as an `INNER JOIN` clause and converts it to `CROSS JOIN` when the algorithm cannot guaranty that `INNER JOIN` returns required data.
-												DOCAPI-6553: JOIN syntax. JOIN optimization setting description.

											
										
										
											2019-05-21 08:05:56 +00:00
 								#### ANY or ALL Strictness
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-3820: EN review and RU translation for SELECT JOIN (#4422)


											
										
										
											2019-02-19 16:28:51 +00:00
+								If `ALL` is specified and the right table has several matching rows, the data will be multiplied by the number of these rows. This is the normal `JOIN` behavior for standard SQL.
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								If `ANY` is specified and the right table has several matching rows, only the first one found is joined. If the right table has only one matching row, the results of `ANY` and `ALL` are the same.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-3820: EN review and RU translation for SELECT JOIN (#4422)


											
										
										
											2019-02-19 16:28:51 +00:00
+								To set the default strictness value, use the session configuration parameter [join_default_strictness](../operations/settings/settings.md#settings-join_default_strictness).
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-6553: JOIN syntax. JOIN optimization setting description.

											
										
										
											2019-05-21 08:05:56 +00:00
+								#### GLOBAL JOIN
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								When using a normal `JOIN`, the query is sent to remote servers. Subqueries are run on each of them in order to make the right table, and the join is performed with this table. In other words, the right table is formed on each server separately.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
 								When using `GLOBAL ... JOIN`, first the requestor server runs a subquery to calculate the right table. This temporary table is passed to each remote server, and queries are run on them using the temporary data that was transmitted.
-												Doc fixes: remove all anchors <a> (#3897)

* Doc fixes: rm anchors <a>

* Doc fixes: rm anchors <a>

* Doc fixes: fix links

* Doc fixes: fix the links

											
										
										
											2018-12-21 19:23:55 +00:00
+								Be careful when using `GLOBAL`. For more information, see the section [Distributed subqueries](#select-distributed-subqueries).
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-6553: JOIN syntax. JOIN optimization setting description.

											
										
										
											2019-05-21 08:05:56 +00:00
+								#### Usage Recommendations
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								All columns that are not needed for the `JOIN` are deleted from the subquery.
 								When running a `JOIN`, there is no optimization of the order of execution in relation to other stages of the query. The join (a search in the right table) is run before filtering in `WHERE` and before aggregation. In order to explicitly set the processing order, we recommend running a `JOIN` subquery with a subquery.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
 								Example:
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT
 								    CounterID,
 								    hits,
 								    visits
 								FROM
 								(
 								    SELECT
 								        CounterID,
 								        count() AS hits
 								    FROM test.hits
 								    GROUP BY CounterID
 								) ANY LEFT JOIN
 								(
 								    SELECT
 								        CounterID,
 								        sum(Sign) AS visits
 								    FROM test.visits
 								    GROUP BY CounterID
 								) USING CounterID
 								ORDER BY hits DESC
 								LIMIT 10
 								```
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								┌─CounterID─┬───hits─┬─visits─┐
 								│   1143050 │ 523264 │  13665 │
 								│    731962 │ 475698 │ 102716 │
 								│    722545 │ 337212 │ 108187 │
 								│    722889 │ 252197 │  10547 │
 								│   2237260 │ 196036 │   9522 │
 								│  23057320 │ 147211 │   7689 │
 								│    722818 │  90109 │  17847 │
 								│     48221 │  85379 │   4652 │
 								│  19762435 │  77807 │   7026 │
 								│    722884 │  77492 │  11056 │
 								└───────────┴────────┴────────┘
 								```
 								Subqueries don't allow you to set names or use them for referencing a column from a specific subquery.
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								The columns specified in `USING` must have the same names in both subqueries, and the other columns must be named differently. You can use aliases to change the names of columns in subqueries (the example uses the aliases 'hits' and 'visits').
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								The `USING` clause specifies one or more columns to join, which establishes the equality of these columns. The list of columns is set without brackets. More complex join conditions are not supported.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								The right table (the subquery result) resides in RAM. If there isn't enough memory, you can't run a `JOIN`.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-3820: EN review and RU translation for SELECT JOIN (#4422)


											
										
										
											2019-02-19 16:28:51 +00:00
+								Each time a query is run with the same `JOIN`, the subquery is run again because the result is not cached. To avoid this, use the special [Join](../operations/table_engines/join.md) table engine, which is a prepared array for joining that is always in RAM.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								In some cases, it is more efficient to use `IN` instead of `JOIN`.
-												Translate select document to chinese (#3791)


											
										
										
											2018-12-08 19:42:49 +00:00
+								Among the various types of `JOIN`, the most efficient is `ANY LEFT JOIN`, then `ANY INNER JOIN`. The least efficient are `ALL LEFT JOIN` and `ALL INNER JOIN`.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												WIP on docs (#3813)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

* Update some links on front page

* Remove some outdated comment

* Add twitter link to front page

* More front page links tuning

* Add Amsterdam meetup link

* Smaller font to avoid second line

* Add Amsterdam link to README.md

* Proper docs nav translation

* Back to 300 font-weight except Chinese

* fix docs build

* Update Amsterdam link

* remove symlinks

* more zh punctuation

* apply lost comment by @zhang2014

* Apply comments by @zhang2014 from #3417

* Remove Beijing link

* rm incorrect symlink

* restore content of docs/zh/operations/table_engines/index.md

* CLICKHOUSE-3751: stem terms while searching docs

* CLICKHOUSE-3751: use English stemmer in non-English docs too

* CLICKHOUSE-4135 fix

* Remove past meetup link

* Add blog link to top nav

* Add ContentSquare article link

* Add form link to front page + refactor some texts

* couple markup fixes

* minor

* Introduce basic ODBC driver page in docs

* More verbose 3rd party libs disclaimer

* Put third-party stuff into a separate folder

* Separate third-party stuff in ToC too

* Update links

* Move stuff that is not really (only) a client library into a separate page

* Add clickhouse-hdfs-loader link

* Some introduction for "interfaces" section

* Rewrite tcp.md

* http_interface.md -> http.md

* fix link

* Remove unconvenient error for now

* try to guess anchor instead of failing

* remove symlink

* Remove outdated info from introduction

* remove ru roadmap.md

* replace ru roadmap.md with symlink

* Update roadmap.md

* lost file

* Title case in toc_en.yml

* Sync "Functions" ToC section with en

* Remove reference to pretty old ClickHouse release from docs

* couple lost symlinks in fa

* Close quote in proper place

* Rewrite en/getting_started/index.md

* Sync en<>ru getting_started/index.md

* minor changes

* Some gui.md refactoring

* Translate DataGrip section to ru

* Translate DataGrip section to zh

* Translate DataGrip section to fa

* Translate DBeaver section to fa

* Translate DBeaver section to zh

* Split third-party GUI to open-source and commercial

* Mention some RDBMS integrations + ad-hoc translation fixes

* Add rel="external nofollow" to outgoing links from docs

* Lost blank lines

* Fix class name

* More rel="external nofollow"

* Apply suggestions by @sundy-li

* Mobile version of front page improvements

* test

* test 2

* test 3

* Update LICENSE

* minor docs fix

* Highlight current article as suggested by @sundy-li

* fix link destination

* Introduce backup.md (only "en" for now)

* Mention INSERT+SELECT in backup.md

* Some improvements for replication.md

* Add backup.md to toc

* Mention clickhouse-backup tool

* Mention LightHouse in third-party GUI list

* Introduce interfaces/third-party/proxy.md

* Add clickhouse-bulk to proxy.md

* Major extension of integrations.md contents

* fix link target

* remove unneeded file

* better toc item name

* fix markdown

* better ru punctuation

* Add yet another possible backup approach

* Simplify copying permalinks to headers

* Support non-eng link anchors in docs + update some deps

* Generate anchors for single-page mode automatically

* Remove anchors to top of pages

* Remove anchors that nobody links to

* build fixes

* fix few links

* restore css

* fix some links

* restore gifs

* fix lost words

* more docs fixes

* docs fixes

* NULL anchor

* update urllib3 dependency

* more fixes

											
										
										
											2018-12-12 17:28:00 +00:00
+								If you need a `JOIN` for joining with dimension tables (these are relatively small tables that contain dimension properties, such as names for advertising campaigns), a `JOIN` might not be very convenient due to the bulky syntax and the fact that the right table is re-accessed for every query. For such cases, there is an "external dictionaries" feature that you should use instead of `JOIN`. For more information, see the section [External dictionaries](dicts/external_dicts.md).
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-3820: EN review and RU translation for SELECT JOIN (#4422)


											
										
										
											2019-02-19 16:28:51 +00:00
+								#### Processing of Empty or NULL Cells
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-3820: EN review and RU translation for SELECT JOIN (#4422)


											
										
										
											2019-02-19 16:28:51 +00:00
+								While joining tables, the empty cells may appear. The setting [join_use_nulls](../operations/settings/settings.md#settings-join_use_nulls) define how ClickHouse fills these cells.
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												DOCAPI-3820: EN review and RU translation for SELECT JOIN (#4422)


											
										
										
											2019-02-19 16:28:51 +00:00
+								If the `JOIN` keys are [Nullable](../data_types/nullable.md) fields, the rows where at least one of the keys has the value [NULL](syntax.md#null-literal) are not joined.
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												Translate select document to chinese (#3791)


											
										
										
											2018-12-08 19:42:49 +00:00
 								### WHERE Clause
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								If there is a WHERE clause, it must contain an expression with the UInt8 type. This is usually an expression with comparison and logical operators.
 								This expression will be used for filtering data before all other transformations.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								If indexes are supported by the database table engine, the expression is evaluated on the ability to use indexes.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												Capitalize most words in titles of docs/en/

											
										
										
											2018-08-07 16:49:40 +00:00
+								### PREWHERE Clause
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								This clause has the same meaning as the WHERE clause. The difference is in which data is read from the table.
 								When using PREWHERE, first only the columns necessary for executing PREWHERE are read. Then the other columns are read that are needed for running the query, but only those blocks where the PREWHERE expression is true.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Adjusted documentation because sometimes it makes sense to use PREWHERE even by indexed columns #2601

											
										
										
											2018-12-04 19:44:51 +00:00
+								It makes sense to use PREWHERE if there are filtration conditions that are used by a minority of the columns in the query, but that provide strong data filtration. This reduces the volume of data to read.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								For example, it is useful to write PREWHERE for queries that extract a large number of columns, but that only have filtration for a few columns.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								PREWHERE is only supported by tables from the `*MergeTree` family.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								A query may simultaneously specify PREWHERE and WHERE. In this case, PREWHERE precedes WHERE.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								If the 'optimize_move_to_prewhere' setting is set to 1 and PREWHERE is omitted, the system uses heuristics to automatically move parts of expressions from WHERE to PREWHERE.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Docapi 4994 registry (#4214)


											
										
										
											2019-02-04 13:30:28 +00:00
+								### GROUP BY Clause {#select-group-by-clause}
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
 								This is one of the most important parts of a column-oriented DBMS.
 								If there is a GROUP BY clause, it must contain a list of expressions. Each expression will be referred to here as a "key".
 								All the expressions in the SELECT, HAVING, and ORDER BY clauses must be calculated from keys or from aggregate functions. In other words, each column selected from the table must be used either in keys or inside aggregate functions.
 								If a query contains only table columns inside aggregate functions, the GROUP BY clause can be omitted, and aggregation by an empty set of keys is assumed.
 								Example:
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT
 								    count(),
 								    median(FetchTiming > 60 ? 60 : FetchTiming),
 								    count() - sum(Refresh)
 								FROM hits
 								```
 								However, in contrast to standard SQL, if the table doesn't have any rows (either there aren't any at all, or there aren't any after using WHERE to filter), an empty result is returned, and not the result from one of the rows containing the initial values of aggregate functions.
 								As opposed to MySQL (and conforming to standard SQL), you can't get some value of some column that is not in a key or aggregate function (except constant expressions). To work around this, you can use the 'any' aggregate function (get the first encountered value) or 'min/max'.
 								Example:
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT
 								    domainWithoutWWW(URL) AS domain,
 								    count(),
-												English translation is updated.

											
										
										
											2018-04-23 06:20:21 +00:00
+								    any(Title) AS title -- getting the first occurred page header for each domain.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								FROM hits
 								GROUP BY domain
 								```
 								For every different key value encountered, GROUP BY calculates a set of aggregate function values.
 								GROUP BY is not supported for array columns.
 								A constant can't be specified as arguments for aggregate functions. Example: sum(1). Instead of this, you can get rid of the constant. Example: `count()`.
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
+								#### NULL processing
-												WIP on docs (#3813)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

* Update some links on front page

* Remove some outdated comment

* Add twitter link to front page

* More front page links tuning

* Add Amsterdam meetup link

* Smaller font to avoid second line

* Add Amsterdam link to README.md

* Proper docs nav translation

* Back to 300 font-weight except Chinese

* fix docs build

* Update Amsterdam link

* remove symlinks

* more zh punctuation

* apply lost comment by @zhang2014

* Apply comments by @zhang2014 from #3417

* Remove Beijing link

* rm incorrect symlink

* restore content of docs/zh/operations/table_engines/index.md

* CLICKHOUSE-3751: stem terms while searching docs

* CLICKHOUSE-3751: use English stemmer in non-English docs too

* CLICKHOUSE-4135 fix

* Remove past meetup link

* Add blog link to top nav

* Add ContentSquare article link

* Add form link to front page + refactor some texts

* couple markup fixes

* minor

* Introduce basic ODBC driver page in docs

* More verbose 3rd party libs disclaimer

* Put third-party stuff into a separate folder

* Separate third-party stuff in ToC too

* Update links

* Move stuff that is not really (only) a client library into a separate page

* Add clickhouse-hdfs-loader link

* Some introduction for "interfaces" section

* Rewrite tcp.md

* http_interface.md -> http.md

* fix link

* Remove unconvenient error for now

* try to guess anchor instead of failing

* remove symlink

* Remove outdated info from introduction

* remove ru roadmap.md

* replace ru roadmap.md with symlink

* Update roadmap.md

* lost file

* Title case in toc_en.yml

* Sync "Functions" ToC section with en

* Remove reference to pretty old ClickHouse release from docs

* couple lost symlinks in fa

* Close quote in proper place

* Rewrite en/getting_started/index.md

* Sync en<>ru getting_started/index.md

* minor changes

* Some gui.md refactoring

* Translate DataGrip section to ru

* Translate DataGrip section to zh

* Translate DataGrip section to fa

* Translate DBeaver section to fa

* Translate DBeaver section to zh

* Split third-party GUI to open-source and commercial

* Mention some RDBMS integrations + ad-hoc translation fixes

* Add rel="external nofollow" to outgoing links from docs

* Lost blank lines

* Fix class name

* More rel="external nofollow"

* Apply suggestions by @sundy-li

* Mobile version of front page improvements

* test

* test 2

* test 3

* Update LICENSE

* minor docs fix

* Highlight current article as suggested by @sundy-li

* fix link destination

* Introduce backup.md (only "en" for now)

* Mention INSERT+SELECT in backup.md

* Some improvements for replication.md

* Add backup.md to toc

* Mention clickhouse-backup tool

* Mention LightHouse in third-party GUI list

* Introduce interfaces/third-party/proxy.md

* Add clickhouse-bulk to proxy.md

* Major extension of integrations.md contents

* fix link target

* remove unneeded file

* better toc item name

* fix markdown

* better ru punctuation

* Add yet another possible backup approach

* Simplify copying permalinks to headers

* Support non-eng link anchors in docs + update some deps

* Generate anchors for single-page mode automatically

* Remove anchors to top of pages

* Remove anchors that nobody links to

* build fixes

* fix few links

* restore css

* fix some links

* restore gifs

* fix lost words

* more docs fixes

* docs fixes

* NULL anchor

* update urllib3 dependency

* more fixes

											
										
										
											2018-12-12 17:28:00 +00:00
+								For grouping, ClickHouse interprets [NULL](syntax.md) as a value, and `NULL=NULL`.
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								Here's an example to show what this means.
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								Assume you have this table:
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								```
 								┌─x─┬────y─┐
 								│ 1 │    2 │
 								│ 2 │ ᴺᵁᴸᴸ │
 								│ 3 │    2 │
 								│ 3 │    3 │
 								│ 3 │ ᴺᵁᴸᴸ │
 								└───┴──────┘
 								```
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								The query `SELECT sum(x), y FROM t_null_big GROUP BY y` results in:
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								```
 								┌─sum(x)─┬────y─┐
 								│      4 │    2 │
 								│      3 │    3 │
 								│      5 │ ᴺᵁᴸᴸ │
 								└────────┴──────┘
 								```
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								You can see that `GROUP BY` for `У = NULL` summed up `x`, as if `NULL` is this value.
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								If you pass several keys to `GROUP BY`, the result will give you all the combinations of the selection, as if `NULL` were a specific value.
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								#### WITH TOTALS Modifier
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
 								If the WITH TOTALS modifier is specified, another row will be calculated. This row will have key columns containing default values (zeros or empty lines), and columns of aggregate functions with the values calculated across all the rows (the "total" values).
 								This extra row is output in JSON\*, TabSeparated\*, and Pretty\* formats, separately from the other rows. In the other formats, this row is not output.
 								In JSON\* formats, this row is output as a separate 'totals' field. In TabSeparated\* formats, the row comes after the main result, preceded by an empty row (after the other data). In Pretty\* formats, the row is output as a separate table after the main result.
 								`WITH TOTALS` can be run in different ways when HAVING is present. The behavior depends on the 'totals_mode' setting.
 								By default, `totals_mode = 'before_having'`. In this case, 'totals' is calculated across all rows, including the ones that don't pass through HAVING and 'max_rows_to_group_by'.
 								The other alternatives include only the rows that pass through HAVING in 'totals', and behave differently with the setting `max_rows_to_group_by` and `group_by_overflow_mode = 'any'`.
 								`after_having_exclusive` – Don't include rows that didn't pass through `max_rows_to_group_by`. In other words, 'totals' will have less than or the same number of rows as it would if `max_rows_to_group_by` were omitted.
 								`after_having_inclusive` – Include all the rows that didn't pass through 'max_rows_to_group_by' in 'totals'. In other words, 'totals' will have more than or the same number of rows as it would if `max_rows_to_group_by` were omitted.
 								`after_having_auto` – Count the number of rows that passed through HAVING. If it is more than a certain amount (by default, 50%), include all the rows that didn't pass through 'max_rows_to_group_by' in 'totals'. Otherwise, do not include them.
 								`totals_auto_threshold` – By default, 0.5. The coefficient for `after_having_auto`.
 								If `max_rows_to_group_by` and `group_by_overflow_mode = 'any'` are not used, all variations of `after_having` are the same, and you can use any of them (for example, `after_having_auto`).
 								You can use WITH TOTALS in subqueries, including subqueries in the JOIN clause (in this case, the respective total values are combined).
-												Docapi 4994 registry (#4214)


											
										
										
											2019-02-04 13:30:28 +00:00
+								#### GROUP BY in External Memory {#select-group-by-in-external-memory}
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
 								You can enable dumping temporary data to the disk to restrict memory usage during GROUP BY.
 								The `max_bytes_before_external_group_by` setting determines the threshold RAM consumption for dumping GROUP BY temporary data to the file system. If set to 0 (the default), it is disabled.
 								When using `max_bytes_before_external_group_by`, we recommend that you set max_memory_usage about twice as high. This is necessary because there are two stages to aggregation: reading the date and forming intermediate data (1) and merging the intermediate data (2). Dumping data to the file system can only occur during stage 1. If the temporary data wasn't dumped, then stage 2 might require up to the same amount of memory as in stage 1.
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								For example, if `max_memory_usage` was set to 10000000000 and you want to use external aggregation, it makes sense to set `max_bytes_before_external_group_by` to 10000000000, and max_memory_usage to 20000000000. When external aggregation is triggered (if there was at least one dump of temporary data), maximum consumption of RAM is only slightly more than ` max_bytes_before_external_group_by`.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fixes: remove double placeholders; add them where missing. (#3923)

* Doc fix: add spaces where missing

* Doc fixes: rm double spaces

* Doc fixes: edit spaces

* Doc fixes: rm double spaces in /fa

* Revert "Doc fixes: rm double spaces in /fa"

This reverts commit bb879a62ef5fa965d989fea3b1b2a693d2016a2d.

* Doc fix: resolve all problems with double spaces in /fa

* Doc fix: add spaces for readability

* Doc fix: add spaces

* Fix spaces

											
										
										
											2018-12-25 15:25:43 +00:00
+								With distributed query processing, external aggregation is performed on remote servers. In order for the requestor server to use only a small amount of RAM, set ` distributed_aggregation_memory_efficient` to 1.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								When merging data flushed to the disk, as well as when merging results from remote servers when the ` distributed_aggregation_memory_efficient` setting is enabled, consumes up to 1/256 \* the number of threads from the total amount of RAM.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fixes: remove double placeholders; add them where missing. (#3923)

* Doc fix: add spaces where missing

* Doc fixes: rm double spaces

* Doc fixes: edit spaces

* Doc fixes: rm double spaces in /fa

* Revert "Doc fixes: rm double spaces in /fa"

This reverts commit bb879a62ef5fa965d989fea3b1b2a693d2016a2d.

* Doc fix: resolve all problems with double spaces in /fa

* Doc fix: add spaces for readability

* Doc fix: add spaces

* Fix spaces

											
										
										
											2018-12-25 15:25:43 +00:00
+								When external aggregation is enabled, if there was less than ` max_bytes_before_external_group_by` of data (i.e. data was not flushed), the query runs just as fast as without external aggregation. If any temporary data was flushed, the run time will be several times longer (approximately three times).
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
 								If you have an ORDER BY with a small LIMIT after GROUP BY, then the ORDER BY CLAUSE will not use significant amounts of RAM.
 								But if the ORDER BY doesn't have LIMIT, don't forget to enable external sorting (`max_bytes_before_external_sort`).
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
+								### LIMIT BY Clause
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-6554. Clarification of text.

											
										
										
											2019-05-07 14:40:19 +00:00
+								The query with the `LIMIT n BY expressions` clause selects the first `n` rows for each distinct value of `expressions`. The key for `LIMIT BY` can contain any number of [expressions](syntax.md#syntax-expressions).
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
+								ClickHouse supports the following syntax:
-												DOCAPI-6554. Clarification of text.

											
										
										
											2019-05-07 14:40:19 +00:00
+								- `LIMIT [offset_value, ]n BY expressions`
 								- `LIMIT n OFFSET offset_value BY expressions`
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
-												DOCAPI-6554. Clarification of text.

											
										
										
											2019-05-07 14:40:19 +00:00
+								During the query processing, ClickHouse selects data ordered by sorting key. Sorting key is set explicitly by [ORDER BY](#select-order-by) clause or implicitly as a property of table engine. Then ClickHouse applies `LIMIT n BY expressions` and returns the first `n` rows for each distinct combination of `expressions`. If `OFFSET` is specified, then for each data block, belonging to a distinct combination of `expressions`, ClickHouse skips `offset_value` rows from the beginning of the block, and returns not more than `n` rows as a result. If `offset_value` is bigger than the number of rows in the data block, then ClickHouse returns no rows from the block.
 								`LIMIT BY` is not related to `LIMIT`, they can both be used in the same query.
-												DOCAPI-6554: Extended syntax for LIMIT BY clause.

											
										
										
											2019-05-07 07:47:26 +00:00
 								**Examples**
 								Sample table:
 								```sql
 								CREATE TABLE limit_by(id Int, val Int) ENGINE = Memory;
 								INSERT INTO limit_by values(1, 10), (1, 11), (1, 12), (2, 20), (2, 21);
 								```
 								Queries:
 								```sql
 								SELECT * FROM limit_by ORDER BY id, val LIMIT 2 BY id
 								```
 								```text
 								┌─id─┬─val─┐
 								│  1 │  10 │
 								│  1 │  11 │
 								│  2 │  20 │
 								│  2 │  21 │
 								└────┴─────┘
 								```
 								```sql
 								SELECT * FROM limit_by ORDER BY id, val LIMIT 1, 2 BY id
 								```
 								```text
 								┌─id─┬─val─┐
 								│  1 │  11 │
 								│  1 │  12 │
 								│  2 │  21 │
 								└────┴─────┘
 								```
 								The `SELECT * FROM limit_by ORDER BY id, val LIMIT 2 OFFSET 1 BY id` query returns the same result.
 								The following query returns the top 5 referrers for each `domain, device_type` pair, but not more than 100 rows (`LIMIT n BY + LIMIT`).
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT
 								    domainWithoutWWW(URL) AS domain,
 								    domainWithoutWWW(REFERRER_URL) AS referrer,
 								    device_type,
 								    count() cnt
 								FROM hits
 								GROUP BY domain, referrer, device_type
 								ORDER BY cnt DESC
 								LIMIT 5 BY domain, device_type
 								LIMIT 100
 								```
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								### HAVING Clause
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
 								Allows filtering the result received after GROUP BY, similar to the WHERE clause.
 								WHERE and HAVING differ in that WHERE is performed before aggregation (GROUP BY), while HAVING is performed after it.
 								If aggregation is not performed, HAVING can't be used.
-												DOCAPI-6554. Clarification of text.

											
										
										
											2019-05-07 14:40:19 +00:00
+								### ORDER BY Clause {#select-order-by}
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
 								The ORDER BY clause contains a list of expressions, which can each be assigned DESC or ASC (the sorting direction). If the direction is not specified, ASC is assumed. ASC is sorted in ascending order, and DESC in descending order. The sorting direction applies to a single expression, not to the entire list. Example: `ORDER BY Visits DESC, SearchPhrase`
 								For sorting by String values, you can specify collation (comparison). Example: `ORDER BY SearchPhrase COLLATE 'tr'` - for sorting by keyword in ascending order, using the Turkish alphabet, case insensitive, assuming that strings are UTF-8 encoded. COLLATE can be specified or not for each expression in ORDER BY independently. If ASC or DESC is specified, COLLATE is specified after it. When using COLLATE, sorting is always case-insensitive.
 								We only recommend using COLLATE for final sorting of a small number of rows, since sorting with COLLATE is less efficient than normal sorting by bytes.
 								Rows that have identical values for the list of sorting expressions are output in an arbitrary order, which can also be nondeterministic (different each time).
 								If the ORDER BY clause is omitted, the order of the rows is also undefined, and may be nondeterministic as well.
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
+								`NaN` and `NULL` sorting order:
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								- With the modifier `NULLS FIRST` — First `NULL`, then `NaN`, then other values.
 								- With the modifier `NULLS LAST` — First the values, then `NaN`, then `NULL`.
 								- Default — The same as with the `NULLS LAST` modifier.
 								Example:
 								For the table
 								```
 								┌─x─┬────y─┐
 								│ 1 │ ᴺᵁᴸᴸ │
 								│ 2 │    2 │
 								│ 1 │  nan │
 								│ 2 │    2 │
 								│ 3 │    4 │
 								│ 5 │    6 │
 								│ 6 │  nan │
 								│ 7 │ ᴺᵁᴸᴸ │
 								│ 6 │    7 │
 								│ 8 │    9 │
 								└───┴──────┘
 								```
 								Run the query `SELECT * FROM t_null_nan ORDER BY y NULLS FIRST` to get:
 								```
 								┌─x─┬────y─┐
 								│ 1 │ ᴺᵁᴸᴸ │
 								│ 7 │ ᴺᵁᴸᴸ │
 								│ 1 │  nan │
 								│ 6 │  nan │
 								│ 2 │    2 │
 								│ 2 │    2 │
 								│ 3 │    4 │
 								│ 5 │    6 │
 								│ 6 │    7 │
 								│ 8 │    9 │
 								└───┴──────┘
 								```
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								When floating point numbers are sorted, NaNs are separate from the other values. Regardless of the sorting order, NaNs come at the end. In other words, for ascending sorting they are placed as if they are larger than all the other numbers, while for descending sorting they are placed as if they are smaller than the rest.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
 								Less RAM is used if a small enough LIMIT is specified in addition to ORDER BY. Otherwise, the amount of memory spent is proportional to the volume of data for sorting. For distributed query processing, if GROUP BY is omitted, sorting is partially done on remote servers, and the results are merged on the requestor server. This means that for distributed sorting, the volume of data to sort can be greater than the amount of memory on a single server.
 								If there is not enough RAM, it is possible to perform sorting in external memory (creating temporary files on a disk). Use the setting `max_bytes_before_external_sort` for this purpose. If it is set to 0 (the default), external sorting is disabled. If it is enabled, when the volume of data to sort reaches the specified number of bytes, the collected data is sorted and dumped into a temporary file. After all data is read, all the sorted files are merged and the results are output. Files are written to the /var/lib/clickhouse/tmp/ directory in the config (by default, but you can use the 'tmp_path' parameter to change this setting).
 								Running a query may use more memory than 'max_bytes_before_external_sort'. For this reason, this setting must have a value significantly smaller than 'max_memory_usage'. As an example, if your server has 128 GB of RAM and you need to run a single query, set 'max_memory_usage' to 100 GB, and 'max_bytes_before_external_sort' to 80 GB.
 								External sorting works much less effectively than sorting in RAM.
-												Capitalize most words in titles of docs/en/

											
										
										
											2018-08-07 16:49:40 +00:00
+								### SELECT Clause
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
 								The expressions specified in the SELECT clause are analyzed after the calculations for all the clauses listed above are completed.
 								More specifically, expressions are analyzed that are above the aggregate functions, if there are any aggregate functions.
 								The aggregate functions and everything below them are calculated during aggregation (GROUP BY).
 								These expressions work as if they are applied to separate rows in the result.
-												Docapi 4994 registry (#4214)


											
										
										
											2019-02-04 13:30:28 +00:00
+								### DISTINCT Clause {#select-distinct}
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								If DISTINCT is specified, only a single row will remain out of all the sets of fully matching rows in the result.
 								The result will be the same as if GROUP BY were specified across all the fields specified in SELECT without aggregate functions. But there are several differences from GROUP BY:
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								- DISTINCT can be applied together with GROUP BY.
 								- When ORDER BY is omitted and LIMIT is defined, the query stops running immediately after the required number of different rows has been read.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								- Data blocks are output as they are processed, without waiting for the entire query to finish running.
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								DISTINCT is not supported if SELECT has at least one array column.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fixes: remove double placeholders; add them where missing. (#3923)

* Doc fix: add spaces where missing

* Doc fixes: rm double spaces

* Doc fixes: edit spaces

* Doc fixes: rm double spaces in /fa

* Revert "Doc fixes: rm double spaces in /fa"

This reverts commit bb879a62ef5fa965d989fea3b1b2a693d2016a2d.

* Doc fix: resolve all problems with double spaces in /fa

* Doc fix: add spaces for readability

* Doc fix: add spaces

* Fix spaces

											
										
										
											2018-12-25 15:25:43 +00:00
+								`DISTINCT` works with [NULL](syntax.md) as if `NULL` were a specific value, and `NULL=NULL`. In other words, in the `DISTINCT` results, different combinations with `NULL` only occur once.
-												Translate select document to chinese (#3791)


											
										
										
											2018-12-08 19:42:49 +00:00
-												Capitalize most words in titles of docs/en/

											
										
										
											2018-08-07 16:49:40 +00:00
+								### LIMIT Clause
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: change logic of SAMPLE n

											
										
										
											2019-02-19 07:02:32 +00:00
+								`LIMIT m` allows you to select the first `m` rows from the result.
-												LIMIT OFFSET syntax

											
										
										
											2019-03-26 16:39:49 +00:00
 								`LIMIT n, m` allows you to select the first `m` rows from the result after skipping the first `n` rows. The `LIMIT m OFFSET n` syntax is also supported.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fix: change logic of SAMPLE n

											
										
										
											2019-02-19 07:02:32 +00:00
+								`n` and `m` must be non-negative integers.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												LIMIT OFFSET syntax

											
										
										
											2019-03-26 16:39:49 +00:00
+								If there isn't an `ORDER BY` clause that explicitly sorts results, the result may be arbitrary and nondeterministic.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								### UNION ALL Clause
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								You can use UNION ALL to combine any number of queries. Example:
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT CounterID, 1 AS table, toInt64(count()) AS c
 								    FROM test.hits
 								    GROUP BY CounterID
 								UNION ALL
 								SELECT CounterID, 2 AS table, sum(Sign) AS c
 								    FROM test.visits
 								    GROUP BY CounterID
 								    HAVING c > 0
 								```
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								Only UNION ALL is supported. The regular UNION (UNION DISTINCT) is not supported. If you need UNION DISTINCT, you can write SELECT DISTINCT from a subquery containing UNION ALL.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								Queries that are parts of UNION ALL can be run simultaneously, and their results can be mixed together.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
+								The structure of results (the number and type of columns) must match for the queries. But the column names can differ. In this case, the column names for the final result will be taken from the first query. Type casting is performed for unions. For example, if two queries being combined have the same field with non-`Nullable` and `Nullable` types from a compatible type, the resulting `UNION ALL` has a `Nullable` type field.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								Queries that are parts of UNION ALL can't be enclosed in brackets. ORDER BY and LIMIT are applied to separate queries, not to the final result. If you need to apply a conversion to the final result, you can put all the queries with UNION ALL in a subquery in the FROM clause.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Capitalize most words in titles of docs/en/

											
										
										
											2018-08-07 16:49:40 +00:00
+								### INTO OUTFILE Clause
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
 								Add the `INTO OUTFILE filename` clause (where filename is a string literal) to redirect query output to the specified file.
 								In contrast to MySQL, the file is created on the client side. The query will fail if a file with the same filename already exists.
 								This functionality is available in the command-line client and clickhouse-local (a query sent via HTTP interface will fail).
 								The default output format is TabSeparated (the same as in the command-line client batch mode).
-												Capitalize most words in titles of docs/en/

											
										
										
											2018-08-07 16:49:40 +00:00
+								### FORMAT Clause
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
 								Specify 'FORMAT format' to get data in any specified format.
 								You can use this for convenience, or for creating dumps.
 								For more information, see the section "Formats".
 								If the FORMAT clause is omitted, the default format is used, which depends on both the settings and the interface used for accessing the DB. For the HTTP interface and the command-line client in batch mode, the default format is TabSeparated. For the command-line client in interactive mode, the default format is PrettyCompact (it has attractive and compact tables).
 								When using the command-line client, data is passed to the client in an internal efficient format. The client independently interprets the FORMAT clause of the query and formats the data itself (thus relieving the network and the server from the load).
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												Doc fixes: remove all anchors <a> (#3897)

* Doc fixes: rm anchors <a>

* Doc fixes: rm anchors <a>

* Doc fixes: fix links

* Doc fixes: fix the links

											
										
										
											2018-12-21 19:23:55 +00:00
+								### IN Operators {#select-in-operators}
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
 								The `IN`, `NOT IN`, `GLOBAL IN`, and `GLOBAL NOT IN` operators are covered separately, since their functionality is quite rich.
 								The left side of the operator is either a single column or a tuple.
 								Examples:
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT UserID IN (123, 456) FROM ...
 								SELECT (CounterID, UserID) IN ((34, 123), (101500, 456)) FROM ...
 								```
 								If the left side is a single column that is in the index, and the right side is a set of constants, the system uses the index for processing the query.
 								Don't list too many values explicitly (i.e. millions). If a data set is large, put it in a temporary table (for example, see the section "External data for query processing"), then use a subquery.
 								The right side of the operator can be a set of constant expressions, a set of tuples with constant expressions (shown in the examples above), or the name of a database table or SELECT subquery in brackets.
 								If the right side of the operator is the name of a table (for example, `UserID IN users`), this is equivalent to the subquery `UserID IN (SELECT * FROM users)`. Use this when working with external data that is sent along with the query. For example, the query can be sent together with a set of user IDs loaded to the 'users' temporary table, which should be filtered.
 								If the right side of the operator is a table name that has the Set engine (a prepared data set that is always in RAM), the data set will not be created over again for each query.
 								The subquery may specify more than one column for filtering tuples.
 								Example:
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT (CounterID, UserID) IN (SELECT CounterID, UserID FROM ...) FROM ...
 								```
 								The columns to the left and right of the IN operator should have the same type.
 								The IN operator and subquery may occur in any part of the query, including in aggregate functions and lambda functions.
 								Example:
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT
 								    EventDate,
 								    avg(UserID IN
 								    (
 								        SELECT UserID
 								        FROM test.hits
 								        WHERE EventDate = toDate('2014-03-17')
 								    )) AS ratio
 								FROM test.hits
 								GROUP BY EventDate
 								ORDER BY EventDate ASC
 								```
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								```
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								┌──EventDate─┬────ratio─┐
 								│ 2014-03-17 │        1 │
 								│ 2014-03-18 │ 0.807696 │
 								│ 2014-03-19 │ 0.755406 │
 								│ 2014-03-20 │ 0.723218 │
 								│ 2014-03-21 │ 0.697021 │
 								│ 2014-03-22 │ 0.647851 │
 								│ 2014-03-23 │ 0.648416 │
 								└────────────┴──────────┘
 								```
 								For each day after March 17th, count the percentage of pageviews made by users who visited the site on March 17th.
 								A subquery in the IN clause is always run just one time on a single server. There are no dependent subqueries.
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
+								#### NULL processing
-												Doc fixes: remove double placeholders; add them where missing. (#3923)

* Doc fix: add spaces where missing

* Doc fixes: rm double spaces

* Doc fixes: edit spaces

* Doc fixes: rm double spaces in /fa

* Revert "Doc fixes: rm double spaces in /fa"

This reverts commit bb879a62ef5fa965d989fea3b1b2a693d2016a2d.

* Doc fix: resolve all problems with double spaces in /fa

* Doc fix: add spaces for readability

* Doc fix: add spaces

* Fix spaces

											
										
										
											2018-12-25 15:25:43 +00:00
+								During request processing, the IN operator assumes that the result of an operation with [NULL](syntax.md) is always equal to `0`, regardless of whether `NULL` is on the right or left side of the operator. `NULL` values are not included in any dataset, do not correspond to each other and cannot be compared.
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								Here is an example with the `t_null` table:
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								```
 								┌─x─┬────y─┐
 								│ 1 │ ᴺᵁᴸᴸ │
 								│ 2 │    3 │
 								└───┴──────┘
 								```
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								Running the query `SELECT x FROM t_null WHERE y IN (NULL,3)` gives you the following result:
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								```
 								┌─x─┐
 								│ 2 │
 								└───┘
 								```
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								You can see that the row in which `y = NULL` is thrown out of the query results. This is because ClickHouse can't decide whether `NULL` is included in the `(NULL,3)` set, returns `0` as the result of the operation, and `SELECT` excludes this row from the final output.
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								```
 								SELECT y IN (NULL, 3)
 								FROM t_null
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												 Update of JOIN docs (#3684)

* Update of english version of descriprion of the table function `file`.

* New syntax for ReplacingMergeTree.
Some improvements in text.

* Significantly change article about SummingMergeTree.
Article is restructured, text is changed in many places of the document. New syntax for table creation is described.

* Descriptions of AggregateFunction and AggregatingMergeTree are updated. Russian version.

* New syntax for new syntax of CREATE TABLE

* Added english docs on Aggregating, Replacing and SummingMergeTree.

* CollapsingMergeTree docs. English version.

* 1. Update of CollapsingMergeTree. 2. Minor changes in markup

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatefunction.md

* Update aggregatingmergetree.md

* GraphiteMergeTree docs update.
New syntax for creation of Replicated* tables.
Minor changes in *MergeTree tables creation syntax.

* Markup fix

* Markup and language fixes

* Clarification in the CollapsingMergeTree article

* DOCAPI-4821. Sync between ru and en versions of docs.

* Fixed the ambiguity in geo functions description.

* Example of JOIN in ru docs

* Deleted misinforming example.

* 1. Updated the JOIN clause description.
2. Added the new setting 'join_default_strictness' description.

* Minor fixes in docs.

* Deleted version of ClickHouse from setting. All info about new features are in changelog.

											
										
										
											2018-11-30 13:04:14 +00:00
+								┌─in(y, tuple(NULL, 3))─┐
 								│                     0 │
 								│                     1 │
 								└───────────────────────┘
 								```
-												re-apply useful patch parts to select.md

											
										
										
											2018-09-06 18:11:25 +00:00
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fixes: remove all anchors <a> (#3897)

* Doc fixes: rm anchors <a>

* Doc fixes: rm anchors <a>

* Doc fixes: fix links

* Doc fixes: fix the links

											
										
										
											2018-12-21 19:23:55 +00:00
+								#### Distributed Subqueries {#select-distributed-subqueries}
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												Doc fixes: remove double placeholders; add them where missing. (#3923)

* Doc fix: add spaces where missing

* Doc fixes: rm double spaces

* Doc fixes: edit spaces

* Doc fixes: rm double spaces in /fa

* Revert "Doc fixes: rm double spaces in /fa"

This reverts commit bb879a62ef5fa965d989fea3b1b2a693d2016a2d.

* Doc fix: resolve all problems with double spaces in /fa

* Doc fix: add spaces for readability

* Doc fix: add spaces

* Fix spaces

											
										
										
											2018-12-25 15:25:43 +00:00
+								There are two options for IN-s with subqueries (similar to JOINs): normal `IN` / `JOIN` and `GLOBAL IN` / `GLOBAL JOIN`. They differ in how they are run for distributed query processing.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
-												revert some more harmful patches

											
										
										
											2018-09-06 18:07:25 +00:00
+								!!! attention
-												WIP on docs (#3813)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

* Update some links on front page

* Remove some outdated comment

* Add twitter link to front page

* More front page links tuning

* Add Amsterdam meetup link

* Smaller font to avoid second line

* Add Amsterdam link to README.md

* Proper docs nav translation

* Back to 300 font-weight except Chinese

* fix docs build

* Update Amsterdam link

* remove symlinks

* more zh punctuation

* apply lost comment by @zhang2014

* Apply comments by @zhang2014 from #3417

* Remove Beijing link

* rm incorrect symlink

* restore content of docs/zh/operations/table_engines/index.md

* CLICKHOUSE-3751: stem terms while searching docs

* CLICKHOUSE-3751: use English stemmer in non-English docs too

* CLICKHOUSE-4135 fix

* Remove past meetup link

* Add blog link to top nav

* Add ContentSquare article link

* Add form link to front page + refactor some texts

* couple markup fixes

* minor

* Introduce basic ODBC driver page in docs

* More verbose 3rd party libs disclaimer

* Put third-party stuff into a separate folder

* Separate third-party stuff in ToC too

* Update links

* Move stuff that is not really (only) a client library into a separate page

* Add clickhouse-hdfs-loader link

* Some introduction for "interfaces" section

* Rewrite tcp.md

* http_interface.md -> http.md

* fix link

* Remove unconvenient error for now

* try to guess anchor instead of failing

* remove symlink

* Remove outdated info from introduction

* remove ru roadmap.md

* replace ru roadmap.md with symlink

* Update roadmap.md

* lost file

* Title case in toc_en.yml

* Sync "Functions" ToC section with en

* Remove reference to pretty old ClickHouse release from docs

* couple lost symlinks in fa

* Close quote in proper place

* Rewrite en/getting_started/index.md

* Sync en<>ru getting_started/index.md

* minor changes

* Some gui.md refactoring

* Translate DataGrip section to ru

* Translate DataGrip section to zh

* Translate DataGrip section to fa

* Translate DBeaver section to fa

* Translate DBeaver section to zh

* Split third-party GUI to open-source and commercial

* Mention some RDBMS integrations + ad-hoc translation fixes

* Add rel="external nofollow" to outgoing links from docs

* Lost blank lines

* Fix class name

* More rel="external nofollow"

* Apply suggestions by @sundy-li

* Mobile version of front page improvements

* test

* test 2

* test 3

* Update LICENSE

* minor docs fix

* Highlight current article as suggested by @sundy-li

* fix link destination

* Introduce backup.md (only "en" for now)

* Mention INSERT+SELECT in backup.md

* Some improvements for replication.md

* Add backup.md to toc

* Mention clickhouse-backup tool

* Mention LightHouse in third-party GUI list

* Introduce interfaces/third-party/proxy.md

* Add clickhouse-bulk to proxy.md

* Major extension of integrations.md contents

* fix link target

* remove unneeded file

* better toc item name

* fix markdown

* better ru punctuation

* Add yet another possible backup approach

* Simplify copying permalinks to headers

* Support non-eng link anchors in docs + update some deps

* Generate anchors for single-page mode automatically

* Remove anchors to top of pages

* Remove anchors that nobody links to

* build fixes

* fix few links

* restore css

* fix some links

* restore gifs

* fix lost words

* more docs fixes

* docs fixes

* NULL anchor

* update urllib3 dependency

* more fixes

											
										
										
											2018-12-12 17:28:00 +00:00
+								    Remember that the algorithms described below may work differently depending on the [settings](../operations/settings/settings.md) `distributed_product_mode` setting.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
 								When using the regular IN, the query is sent to remote servers, and each of them runs the subqueries in the `IN` or `JOIN` clause.
-												Doc fixes: remove double placeholders; add them where missing. (#3923)

* Doc fix: add spaces where missing

* Doc fixes: rm double spaces

* Doc fixes: edit spaces

* Doc fixes: rm double spaces in /fa

* Revert "Doc fixes: rm double spaces in /fa"

This reverts commit bb879a62ef5fa965d989fea3b1b2a693d2016a2d.

* Doc fix: resolve all problems with double spaces in /fa

* Doc fix: add spaces for readability

* Doc fix: add spaces

* Fix spaces

											
										
										
											2018-12-25 15:25:43 +00:00
+								When using `GLOBAL IN` / `GLOBAL JOINs`, first all the subqueries are run for `GLOBAL IN` / `GLOBAL JOINs`, and the results are collected in temporary tables. Then the temporary tables are sent to each remote server, where the queries are run using this temporary data.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
 								For a non-distributed query, use the regular `IN` / `JOIN`.
-												Doc fixes: remove double placeholders; add them where missing. (#3923)

* Doc fix: add spaces where missing

* Doc fixes: rm double spaces

* Doc fixes: edit spaces

* Doc fixes: rm double spaces in /fa

* Revert "Doc fixes: rm double spaces in /fa"

This reverts commit bb879a62ef5fa965d989fea3b1b2a693d2016a2d.

* Doc fix: resolve all problems with double spaces in /fa

* Doc fix: add spaces for readability

* Doc fix: add spaces

* Fix spaces

											
										
										
											2018-12-25 15:25:43 +00:00
+								Be careful when using subqueries in the `IN` / `JOIN` clauses for distributed query processing.
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
 								Let's look at some examples. Assume that each server in the cluster has a normal **local_table**. Each server also has a **distributed_table** table with the **Distributed** type, which looks at all the servers in the cluster.
 								For a query to the **distributed_table**, the query will be sent to all the remote servers and run on them using the **local_table**.
 								For example, the query
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT uniq(UserID) FROM distributed_table
 								```
 								will be sent to all remote servers as
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT uniq(UserID) FROM local_table
 								```
 								and run on each of them in parallel, until it reaches the stage where intermediate results can be combined. Then the intermediate results will be returned to the requestor server and merged on it, and the final result will be sent to the client.
 								Now let's examine a query with IN:
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT uniq(UserID) FROM distributed_table WHERE CounterID = 101500 AND UserID IN (SELECT UserID FROM local_table WHERE CounterID = 34)
 								```
 								- Calculation of the intersection of audiences of two sites.
 								This query will be sent to all remote servers as
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT uniq(UserID) FROM local_table WHERE CounterID = 101500 AND UserID IN (SELECT UserID FROM local_table WHERE CounterID = 34)
 								```
 								In other words, the data set in the IN clause will be collected on each server independently, only across the data that is stored locally on each of the servers.
 								This will work correctly and optimally if you are prepared for this case and have spread data across the cluster servers such that the data for a single UserID resides entirely on a single server. In this case, all the necessary data will be available locally on each server. Otherwise, the result will be inaccurate. We refer to this variation of the query as "local IN".
 								To correct how the query works when data is spread randomly across the cluster servers, you could specify **distributed_table** inside a subquery. The query would look like this:
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT uniq(UserID) FROM distributed_table WHERE CounterID = 101500 AND UserID IN (SELECT UserID FROM distributed_table WHERE CounterID = 34)
 								```
 								This query will be sent to all remote servers as
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT uniq(UserID) FROM local_table WHERE CounterID = 101500 AND UserID IN (SELECT UserID FROM distributed_table WHERE CounterID = 34)
 								```
 								The subquery will begin running on each remote server. Since the subquery uses a distributed table, the subquery that is on each remote server will be resent to every remote server as
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT UserID FROM local_table WHERE CounterID = 34
 								```
 								For example, if you have a cluster of 100 servers, executing the entire query will require 10,000 elementary requests, which is generally considered unacceptable.
 								In such cases, you should always use GLOBAL IN instead of IN. Let's look at how it works for the query
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT uniq(UserID) FROM distributed_table WHERE CounterID = 101500 AND UserID GLOBAL IN (SELECT UserID FROM distributed_table WHERE CounterID = 34)
 								```
 								The requestor server will run the subquery
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT UserID FROM distributed_table WHERE CounterID = 34
 								```
 								and the result will be put in a temporary table in RAM. Then the request will be sent to each remote server as
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
+								``` sql
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
+								SELECT uniq(UserID) FROM local_table WHERE CounterID = 101500 AND UserID GLOBAL IN _data1
 								```
-												English translation is updated.

											
										
										
											2018-04-23 06:20:21 +00:00
+								and the temporary table `_data1` will be sent to every remote server with the query (the name of the temporary table is implementation-defined).
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
 								This is more optimal than using the normal IN. However, keep the following points in mind:
 . When creating a temporary table, data is not made unique. To reduce the volume of data transmitted over the network, specify DISTINCT in the subquery. (You don't need to do this for a normal IN.)
 . The temporary table will be sent to all the remote servers. Transmission does not account for network topology. For example, if 10 remote servers reside in a datacenter that is very remote in relation to the requestor server, the data will be sent 10 times over the channel to the remote datacenter. Try to avoid large data sets when using GLOBAL IN.
 . When transmitting data to remote servers, restrictions on network bandwidth are not configurable. You might overload the network.
 . Try to distribute data across servers so that you don't need to use GLOBAL IN on a regular basis.
 . If you need to use GLOBAL IN often, plan the location of the ClickHouse cluster so that a single group of replicas resides in no more than one data center with a fast network between them, so that a query can be processed entirely within a single data center.
 								It also makes sense to specify a local table in the `GLOBAL IN` clause, in case this local table is only available on the requestor server and you want to use data from it on remote servers.
-												Capitalize most words in titles of docs/en/

											
										
										
											2018-08-07 16:49:40 +00:00
+								### Extreme Values
-												Sources for english documentation switched to Markdown.
Edit page link is fixed too for both language versions of documentation.

											
										
										
											2017-12-28 15:13:23 +00:00
 								In addition to results, you can also get minimum and maximum values for the results columns. To do this, set the **extremes** setting to 1. Minimums and maximums are calculated for numeric types, dates, and dates with times. For other columns, the default values are output.
 								An extra two rows are calculated – the minimums and maximums, respectively. These extra two rows are output in JSON\*, TabSeparated\*, and Pretty\* formats, separate from the other rows. They are not output for other formats.
 								In JSON\* formats, the extreme values are output in a separate 'extremes' field. In TabSeparated\* formats, the row comes after the main result, and after 'totals' if present. It is preceded by an empty row (after the other data). In Pretty\* formats, the row is output as a separate table after the main result, and after 'totals' if present.
 								Extreme values are calculated for rows that have passed through LIMIT. However, when using 'LIMIT offset, size', the rows before 'offset' are included in 'extremes'. In stream requests, the result may also include a small number of rows that passed through LIMIT.
 								### Notes
 								The `GROUP BY` and `ORDER BY` clauses do not support positional arguments. This contradicts MySQL, but conforms to standard SQL.
 								For example, `GROUP BY 1, 2` will be interpreted as grouping by constants (i.e. aggregation of all rows into one).
 								You can use synonyms (`AS` aliases) in any part of a query.
 								You can put an asterisk in any part of a query instead of an expression. When the query is analyzed, the asterisk is expanded to a list of all table columns (excluding the `MATERIALIZED` and `ALIAS` columns). There are only a few cases when using an asterisk is justified:
 								- When creating a table dump.
 								- For tables containing just a few columns, such as system tables.
 								- For getting information about what columns are in a table. In this case, set `LIMIT 1`. But it is better to use the `DESC TABLE` query.
 								- When there is strong filtration on a small number of columns using `PREWHERE`.
 								- In subqueries (since columns that aren't needed for the external query are excluded from subqueries).
 								In all other cases, we don't recommend using the asterisk, since it only gives you the drawbacks of a columnar DBMS instead of the advantages. In other words using the asterisk is not recommended.
-												WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

											
										
										
											2018-10-16 10:47:17 +00:00
 								[Original article](https://clickhouse.yandex/docs/en/query_language/select/) <!--hide-->