Merge branch 'master' of github.com:yandex/ClickHouse into pyos-llvm-jit
Commit: 6efcdc5a6c
@@ -361,10 +361,19 @@ private:
    switch (field.which)
    {
        case Types::Null:    f(field.template get<Null>()); return;

// gcc 7.3.0
#if !__clang__
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
#endif
        case Types::UInt64:  f(field.template get<UInt64>()); return;
        case Types::UInt128: f(field.template get<UInt128>()); return;
        case Types::Int64:   f(field.template get<Int64>()); return;
        case Types::Float64: f(field.template get<Float64>()); return;
#if !__clang__
#pragma GCC diagnostic pop
#endif
        case Types::String:  f(field.template get<String>()); return;
        case Types::Array:   f(field.template get<Array>()); return;
        case Types::Tuple:   f(field.template get<Tuple>()); return;

0  docs/en/agg_functions/combinators.md  (Executable file → Normal file)
0  docs/en/agg_functions/index.md  (Executable file → Normal file)
0  docs/en/agg_functions/parametric_functions.md  (Executable file → Normal file)
13  docs/en/agg_functions/reference.md  (Executable file → Normal file)
@@ -19,7 +19,7 @@ In some cases, you can rely on the order of execution. This applies to cases whe

When a `SELECT` query has the `GROUP BY` clause or at least one aggregate function, ClickHouse (in contrast to MySQL) requires that all expressions in the `SELECT`, `HAVING`, and `ORDER BY` clauses be calculated from keys or from aggregate functions. In other words, each column selected from the table must be used either in keys or inside aggregate functions. To get behavior like in MySQL, you can put the other columns in the `any` aggregate function.

## anyHeavy
## anyHeavy(x)

Selects a frequently occurring value using the [heavy hitters](http://www.cs.umd.edu/~samir/498/karp.pdf) algorithm. If there is a value that occurs more than in half the cases in each of the query's execution threads, this value is returned. Normally, the result is nondeterministic.

@@ -28,7 +28,6 @@ anyHeavy(column)
```

**Arguments**

- `column` – The column name.

**Example**

@@ -39,6 +38,7 @@ Take the [OnTime](../getting_started/example_datasets/ontime.md#example_datasets
SELECT anyHeavy(AirlineID) AS res
FROM ontime
```

```
┌───res─┐
│ 19690 │

@@ -169,7 +169,7 @@ In some cases, you can still rely on the order of execution. This applies to cas

<a name="agg_functions_groupArrayInsertAt"></a>

## groupArrayInsertAt
## groupArrayInsertAt(x)

Inserts a value into the array in the specified position.
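An illustrative sketch (not from the diff) of how the function described above can be called, assuming the usual `(value, position)` argument order with zero-based positions; the data is made up:

```sql
-- Each value is placed at the array position given by the second argument.
SELECT groupArrayInsertAt(toString(number), toUInt32(number)) AS filled
FROM (SELECT number FROM system.numbers LIMIT 5)
```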
@@ -256,7 +256,7 @@ The performance of the function is lower than for ` quantile`, ` quantileTiming`

The result depends on the order of running the query, and is nondeterministic.

## median
## median(x)

All the quantile functions have corresponding median functions: `median`, `medianDeterministic`, `medianTiming`, `medianTimingWeighted`, `medianExact`, `medianExactWeighted`, `medianTDigest`. They are synonyms and their behavior is identical.
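An illustrative sketch (not from the diff) of the synonym relationship described above: a `median*` function should give the same result as the matching `quantile*` function at level 0.5 (the data here is made up):

```sql
SELECT
    medianExact(number) AS m,
    quantileExact(0.5)(number) AS q  -- same function under another name, so m and q agree
FROM (SELECT number FROM system.numbers LIMIT 100)
```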
@@ -286,11 +286,11 @@ The result is equal to the square root of `varSamp(x)`.

The result is equal to the square root of `varPop(x)`.

## topK
## topK(N)(column)

Returns an array of the most frequent values in the specified column. The resulting array is sorted in descending order of frequency of values (not by the values themselves).

Implements the [ Filtered Space-Saving](http://www.l2f.inesc-id.pt/~fmmb/wiki/uploads/Work/misnis.ref0a.pdf) algorithm for analyzing TopK, based on the reduce-and-combine algorithm from [Parallel Space Saving](https://arxiv.org/pdf/1401.0702.pdf).
Implements the [Filtered Space-Saving](http://www.l2f.inesc-id.pt/~fmmb/wiki/uploads/Work/misnis.ref0a.pdf) algorithm for analyzing TopK, based on the reduce-and-combine algorithm from [Parallel Space Saving](https://arxiv.org/pdf/1401.0702.pdf).

```
topK(N)(column)

@@ -301,7 +301,6 @@ This function doesn't provide a guaranteed result. In certain situations, errors
We recommend using the `N < 10 ` value; performance is reduced with large `N` values. Maximum value of ` N = 65536`.

**Arguments**

- 'N' is the number of values.
- ' x ' – The column.
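An illustrative sketch (not from the diff) of the `topK(N)(column)` syntax shown above, reusing the OnTime dataset referenced earlier in this file (assumed to be loaded):

```sql
-- The ten most frequent airlines, most frequent first.
SELECT topK(10)(AirlineID) AS top_airlines
FROM ontime
```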
0  docs/en/data_types/array.md  (Executable file → Normal file)
0  docs/en/data_types/boolean.md  (Executable file → Normal file)
0  docs/en/data_types/date.md  (Executable file → Normal file)
0  docs/en/data_types/datetime.md  (Executable file → Normal file)
0  docs/en/data_types/enum.md  (Executable file → Normal file)
0  docs/en/data_types/fixedstring.md  (Executable file → Normal file)
3  docs/en/data_types/float.md  (Executable file → Normal file)

@@ -5,7 +5,7 @@

Types are equivalent to types of C:

- `Float32` - `float`
- `Float64` - ` double`
- `Float64` - `double`

We recommend that you store data in integer form whenever possible. For example, convert fixed precision numbers to integer values, such as monetary amounts or page load times in milliseconds.

@@ -16,7 +16,6 @@ We recommend that you store data in integer form whenever possible. For example,
```sql
SELECT 1 - 0.9
```

```
┌───────minus(1, 0.9)─┐
│ 0.09999999999999998 │
0  docs/en/data_types/index.md  (Executable file → Normal file)
0  docs/en/data_types/int_uint.md  (Executable file → Normal file)
0  docs/en/data_types/nested_data_structures/aggregatefunction.md  (Executable file → Normal file)
1  docs/en/data_types/nested_data_structures/index.md  (Executable file → Normal file)

@@ -1 +1,2 @@
# Nested data structures

0  docs/en/data_types/nested_data_structures/nested.md  (Executable file → Normal file)
0  docs/en/data_types/special_data_types/expression.md  (Executable file → Normal file)
0  docs/en/data_types/special_data_types/index.md  (Executable file → Normal file)
0  docs/en/data_types/special_data_types/set.md  (Executable file → Normal file)
0  docs/en/data_types/string.md  (Executable file → Normal file)
0  docs/en/data_types/tuple.md  (Executable file → Normal file)
1037  docs/en/development/style.md  (Executable file → Normal file)
File diff suppressed because it is too large.

15  docs/en/dicts/external_dicts.md  (Executable file → Normal file)
@@ -21,11 +21,12 @@ The dictionary config file has the following format:

    <!--Optional element. File name with substitutions-->
    <include_from>/etc/metrika.xml</include_from>

    <dictionary>
        <!-- Dictionary configuration -->
        <!-- Dictionary configuration -->
    </dictionary>

    ...

    <dictionary>

@@ -43,11 +44,3 @@ See also "[Functions for working with external dictionaries](../functions/ext_di
You can convert values for a small dictionary by describing it in a `SELECT` query (see the [transform](../functions/other_functions.md#other_functions-transform) function). This functionality is not related to external dictionaries.

</div>

```eval_rst
.. toctree::
    :glob:

    external_dicts_dict*
```
2  docs/en/dicts/external_dicts_dict.md  (Executable file → Normal file)

@@ -27,7 +27,7 @@ The dictionary configuration has the following structure:
```

- name – The identifier that can be used to access the dictionary. Use the characters `[a-zA-Z0-9_\-]`.
- [source](external_dicts_dict_sources.md/#dicts-external_dicts_dict_sources) — Source of the dictionary .
- [source](external_dicts_dict_sources.md#dicts-external_dicts_dict_sources) — Source of the dictionary.
- [layout](external_dicts_dict_layout.md#dicts-external_dicts_dict_layout) — Dictionary layout in memory.
- [structure](external_dicts_dict_structure.md#dicts-external_dicts_dict_structure) — Structure of the dictionary . A key and attributes that can be retrieved by this key.
- [lifetime](external_dicts_dict_lifetime.md#dicts-external_dicts_dict_lifetime) — Frequency of dictionary updates.

65  docs/en/dicts/external_dicts_dict_layout.md  (Executable file → Normal file)
@@ -2,11 +2,11 @@

# Storing dictionaries in memory

There are [many different ways](external_dicts_dict_layout#dicts-external_dicts_dict_layout-manner) to store dictionaries in memory.
There are a [variety of ways](#dicts-external_dicts_dict_layout-manner) to store dictionaries in memory.

We recommend [flat](external_dicts_dict_layout#dicts-external_dicts_dict_layout-flat), [hashed](external_dicts_dict_layout#dicts-external_dicts_dict_layout-hashed), and [complex_key_hashed](external_dicts_dict_layout#dicts-external_dicts_dict_layout-complex_key_hashed). which provide optimal processing speed.
We recommend [flat](#dicts-external_dicts_dict_layout-flat), [hashed](#dicts-external_dicts_dict_layout-hashed)and[complex_key_hashed](#dicts-external_dicts_dict_layout-complex_key_hashed). which provide optimal processing speed.

Caching is not recommended because of potentially poor performance and difficulties in selecting optimal parameters. Read more about this in the "[cache](external_dicts_dict_layout#dicts-external_dicts_dict_layout-cache)" section.
Caching is not recommended because of potentially poor performance and difficulties in selecting optimal parameters. Read more in the section "[cache](#dicts-external_dicts_dict_layout-cache)".

There are several ways to improve dictionary performance:

@@ -88,7 +88,7 @@ Configuration example:

### complex_key_hashed

This type of storage is designed for use with compound [keys](external_dicts_dict_structure#dicts-external_dicts_dict_structure). It is similar to hashed.
This type of storage is for use with composite [keys](external_dicts_dict_structure.md#dicts-external_dicts_dict_structure). Similar to `hashed`.

Configuration example:
@@ -109,18 +109,18 @@ This storage method works the same way as hashed and allows using date/time rang
Example: The table contains discounts for each advertiser in the format:

```
+---------------+---------------------+-------------------+--------+
| advertiser id | discount start date | discount end date | amount |
+===============+=====================+===================+========+
| 123           | 2015-01-01          | 2015-01-15        | 0.15   |
+---------------+---------------------+-------------------+--------+
| 123           | 2015-01-16          | 2015-01-31        | 0.25   |
+---------------+---------------------+-------------------+--------+
| 456           | 2015-01-01          | 2015-01-15        | 0.05   |
+---------------+---------------------+-------------------+--------+
+---------------+---------------------+-------------------+--------+
| advertiser id | discount start date | discount end date | amount |
+===============+=====================+===================+========+
| 123           | 2015-01-01          | 2015-01-15        | 0.15   |
+---------------+---------------------+-------------------+--------+
| 123           | 2015-01-16          | 2015-01-31        | 0.25   |
+---------------+---------------------+-------------------+--------+
| 456           | 2015-01-01          | 2015-01-15        | 0.05   |
+---------------+---------------------+-------------------+--------+
```

To use a sample for date ranges, define `range_min` and `range_max` in [structure](external_dicts_dict_structure#dicts-external_dicts_dict_structure).
To use a sample for date ranges, define the `range_min` and `range_max` elements in the [structure](external_dicts_dict_structure.md#dicts-external_dicts_dict_structure).

Example:
@@ -140,7 +140,9 @@ Example:

To work with these dictionaries, you need to pass an additional date argument to the `dictGetT` function:

dictGetT('dict_name', 'attr_name', id, date)
```
dictGetT('dict_name', 'attr_name', id, date)
```

This function returns the value for the specified `id`s and the date range that includes the passed date.
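An illustrative sketch (not from the diff) of the call pattern above against the advertiser/discount example table, assuming it has been loaded as a range_hashed dictionary named `discounts` with a Float64 attribute `amount` (both names are hypothetical):

```sql
-- Expected to return 0.15: advertiser 123, and 2015-01-10 falls inside the 2015-01-01..2015-01-15 range.
SELECT dictGetFloat64('discounts', 'amount', toUInt64(123), toDate('2015-01-10')) AS discount
```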
@@ -191,13 +193,13 @@ The dictionary is stored in a cache that has a fixed number of cells. These cell

When searching for a dictionary, the cache is searched first. For each block of data, all keys that are not found in the cache or are outdated are requested from the source using ` SELECT attrs... FROM db.table WHERE id IN (k1, k2, ...)`. The received data is then written to the cache.

For cache dictionaries, the expiration (lifetime <dicts-external_dicts_dict_lifetime>) of data in the cache can be set. If more time than `lifetime` has passed since loading the data in a cell, the cell's value is not used, and it is re-requested the next time it needs to be used.
For cache dictionaries, the expiration [lifetime](external_dicts_dict_lifetime.md#dicts-external_dicts_dict_lifetime) of data in the cache can be set. If more time than `lifetime` has passed since loading the data in a cell, the cell's value is not used, and it is re-requested the next time it needs to be used.

This is the least effective of all the ways to store dictionaries. The speed of the cache depends strongly on correct settings and the usage scenario. A cache type dictionary performs well only when the hit rates are high enough (recommended 99% and higher). You can view the average hit rate in the `system.dictionaries` table.

To improve cache performance, use a subquery with ` LIMIT`, and call the function with the dictionary externally.

Supported [sources](external_dicts_dict_sources#dicts-external_dicts_dict_sources): MySQL, ClickHouse, executable, HTTP.
Supported [sources](external_dicts_dict_sources.md#dicts-external_dicts_dict_sources): MySQL, ClickHouse, executable, HTTP.

Example of settings:

@@ -205,7 +207,7 @@ Example of settings:
<layout>
    <cache>
        <!-- The size of the cache, in number of cells. Rounded up to a power of two. -->
        <size_in_cells>1000000000</size_in_cells>
        <size_in_cells>1000000000</size_in_cells>
    </cache>
</layout>
```
@@ -227,16 +229,15 @@ Do not use ClickHouse as a source, because it is slow to process queries with ra

### complex_key_cache

This type of storage is designed for use with compound [keys](external_dicts_dict_structure#dicts-external_dicts_dict_structure). Similar to `cache`.
This type of storage is for use with composite [keys](external_dicts_dict_structure.md#dicts-external_dicts_dict_structure). Similar to `cache`.

<a name="dicts-external_dicts_dict_layout-ip_trie"></a>

### ip_trie

This type of storage is for mapping network prefixes (IP addresses) to metadata such as ASN.

The table stores IP prefixes for each key (IP address), which makes it possible to map IP addresses to metadata such as ASN or threat score.

Example: in the table there are prefixes matches to AS number and country:
Example: The table contains network prefixes and their corresponding AS number and country code:

```
+-----------------+-------+--------+

@@ -252,7 +253,7 @@ Example: in the table there are prefixes matches to AS number and country:
+-----------------+-------+--------+
```

When using such a layout, the structure should have the "key" element.
When using this type of layout, the structure must have a composite key.

Example:
@@ -277,16 +278,20 @@ Example:
...
```

These key must have only one attribute of type String, containing a valid IP prefix. Other types are not yet supported.
The key must have only one String type attribute that contains an allowed IP prefix. Other types are not supported yet.

For querying, same functions (dictGetT with tuple) as for complex key dictionaries have to be used:
For queries, you must use the same functions (`dictGetT` with a tuple) as for dictionaries with composite keys:

dictGetT('dict_name', 'attr_name', tuple(ip))
```
dictGetT('dict_name', 'attr_name', tuple(ip))
```

The function accepts either UInt32 for IPv4 address or FixedString(16) for IPv6 address in wire format:
The function takes either `UInt32` for IPv4, or `FixedString(16)` for IPv6:

dictGetString('prefix', 'asn', tuple(IPv6StringToNum('2001:db8::1')))
```
dictGetString('prefix', 'asn', tuple(IPv6StringToNum('2001:db8::1')))
```

No other type is supported. The function returns attribute for a prefix matching the given IP address. If there are overlapping prefixes, the most specific one is returned.
Other types are not supported yet. The function returns the attribute for the prefix that corresponds to this IP address. If there are overlapping prefixes, the most specific one is returned.

The data is stored currently in a bitwise trie, it has to fit in memory.
Data is stored in a `trie`. It must completely fit into RAM.

0  docs/en/dicts/external_dicts_dict_lifetime.md  (Executable file → Normal file)
10  docs/en/dicts/external_dicts_dict_sources.md  (Executable file → Normal file)
@@ -135,7 +135,7 @@ Installing unixODBC and the ODBC driver for PostgreSQL:
Configuring `/etc/odbc.ini` (or `~/.odbc.ini`):

```
[DEFAULT]
[DEFAULT]
Driver = myconnection

[myconnection]

@@ -159,9 +159,9 @@ The dictionary configuration in ClickHouse:
<dictionary>
    <name>table_name</name>
    <source>
        <odbc>
            <!-- You can specifiy the following parameters in connection_string: -->
            <!-- DSN=myconnection;UID=username;PWD=password;HOST=127.0.0.1;PORT=5432;DATABASE=my_db -->
        <odbc>
            <!-- You can specifiy the following parameters in connection_string: -->
            <!-- DSN=myconnection;UID=username;PWD=password;HOST=127.0.0.1;PORT=5432;DATABASE=my_db -->
            <connection_string>DSN=myconnection</connection_string>
            <table>postgresql_table</table>
        </odbc>
@@ -195,7 +195,7 @@ Ubuntu OS.
Installing the driver: :

```
sudo apt-get install tdsodbc freetds-bin sqsh
sudo apt-get install tdsodbc freetds-bin sqsh
```

Configuring the driver: :

1  docs/en/dicts/external_dicts_dict_structure.md  (Executable file → Normal file)

@@ -119,4 +119,3 @@ Configuration fields:
- `hierarchical` – Hierarchical support. Mirrored to the parent identifier. By default, ` false`.
- `injective` – Whether the `id -> attribute` image is injective. If ` true`, then you can optimize the ` GROUP BY` clause. By default, `false`.
- `is_object_id` – Whether the query is executed for a MongoDB document by `ObjectID`.
0  docs/en/dicts/index.md  (Executable file → Normal file)
0  docs/en/dicts/internal_dicts.md  (Executable file → Normal file)
0  docs/en/formats/capnproto.md  (Executable file → Normal file)
0  docs/en/formats/csv.md  (Executable file → Normal file)
0  docs/en/formats/csvwithnames.md  (Executable file → Normal file)
0  docs/en/formats/index.md  (Executable file → Normal file)
3  docs/en/formats/json.md  (Executable file → Normal file)
@@ -39,7 +39,7 @@ SELECT SearchPhrase, count() AS c FROM test.hits GROUP BY SearchPhrase WITH TOTA
        "c": "1549"
    },
    {
        "SearchPhrase": "freeform photo",
        "SearchPhrase": "freeform photos",
        "c": "1480"
    }
],

@@ -83,4 +83,3 @@ If the query contains GROUP BY, rows_before_limit_at_least is the exact number o

This format is only appropriate for outputting a query result, but not for parsing (retrieving data to insert in a table).
See also the JSONEachRow format.

2  docs/en/formats/jsoncompact.md  (Executable file → Normal file)
@@ -24,7 +24,7 @@ Example:
["bathroom interior design", "2166"],
["yandex", "1655"],
["spring 2014 fashion", "1549"],
["freeform photo", "1480"]
["freeform photos", "1480"]
],

"totals": ["","8873898"],

0  docs/en/formats/jsoneachrow.md  (Executable file → Normal file)
0  docs/en/formats/native.md  (Executable file → Normal file)
0  docs/en/formats/null.md  (Executable file → Normal file)
0  docs/en/formats/pretty.md  (Executable file → Normal file)
0  docs/en/formats/prettycompact.md  (Executable file → Normal file)
0  docs/en/formats/prettycompactmonoblock.md  (Executable file → Normal file)
0  docs/en/formats/prettynoescapes.md  (Executable file → Normal file)
0  docs/en/formats/prettyspace.md  (Executable file → Normal file)
0  docs/en/formats/rowbinary.md  (Executable file → Normal file)
0  docs/en/formats/tabseparated.md  (Executable file → Normal file)
0  docs/en/formats/tabseparatedraw.md  (Executable file → Normal file)
0  docs/en/formats/tabseparatedwithnames.md  (Executable file → Normal file)
0  docs/en/formats/tabseparatedwithnamesandtypes.md  (Executable file → Normal file)
0  docs/en/formats/tskv.md  (Executable file → Normal file)
0  docs/en/formats/values.md  (Executable file → Normal file)
0  docs/en/formats/vertical.md  (Executable file → Normal file)
@@ -1,9 +1,10 @@
# VerticalRaw

Differs from `Vertical` format in that the rows are written without escaping.
Differs from `Vertical` format in that the rows are not escaped.
This format is only appropriate for outputting a query result, but not for parsing (retrieving data to insert in a table).

Samples:
Examples:

```
:) SHOW CREATE TABLE geonames FORMAT VerticalRaw;
Row 1:

@@ -15,8 +16,11 @@ Row 1:
──────
test: string with 'quotes' and with some special
characters
```

-- the same in Vertical format:
Compare with the Vertical format:

```
:) SELECT 'string with \'quotes\' and \t with some special \n characters' AS test FORMAT Vertical;
Row 1:
──────

5  docs/en/formats/xml.md  (Executable file → Normal file)
@@ -35,7 +35,7 @@ XML format is suitable only for output, not for parsing. Example:
        <field>1549</field>
    </row>
    <row>
        <SearchPhrase>freeform photo</SearchPhrase>
        <SearchPhrase>freeform photos</SearchPhrase>
        <field>1480</field>
    </row>
    <row>

@@ -69,5 +69,6 @@ Just as for JSON, invalid UTF-8 sequences are changed to the replacement charact

In string values, the characters `<` and `&` are escaped as `&lt;` and `&amp;`.

Arrays are output as `<array><elem>Hello</elem><elem>World</elem>...</array>`,and tuples as `<tuple><elem>Hello</elem><elem>World</elem>...</tuple>`.
Arrays are output as `<array><elem>Hello</elem><elem>World</elem>...</array>`,
and tuples as `<tuple><elem>Hello</elem><elem>World</elem>...</tuple>`.

0  docs/en/functions/arithmetic_functions.md  (Executable file → Normal file)
3  docs/en/functions/array_functions.md  (Executable file → Normal file)
@@ -225,7 +225,6 @@ arrayPopFront(array)
```sql
SELECT arrayPopFront([1, 2, 3]) AS res
```

```
┌─res───┐
│ [2,3] │

@@ -250,6 +249,7 @@ arrayPushBack(array, single_value)
```sql
SELECT arrayPushBack(['a'], 'b') AS res
```

```
┌─res───────┐
│ ['a','b'] │

@@ -274,7 +274,6 @@ arrayPushFront(array, single_value)
```sql
SELECT arrayPushBack(['b'], 'a') AS res
```

```
┌─res───────┐
│ ['a','b'] │

1  docs/en/functions/array_join.md  (Executable file → Normal file)
@@ -28,3 +28,4 @@ SELECT arrayJoin([1, 2, 3] AS src) AS dst, 'Hello', src
│ 3 │ Hello │ [1,2,3] │
└─────┴───────────┴─────────┘
```

1  docs/en/functions/bit_functions.md  (Executable file → Normal file)
@@ -15,3 +15,4 @@ The result type is an integer with bits equal to the maximum bits of its argumen
## bitShiftLeft(a, b)

## bitShiftRight(a, b)
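An illustrative sketch (not from the diff) of the two shift functions named above, treated as plain left and right bit shifts:

```sql
SELECT bitShiftLeft(1, 3) AS l, bitShiftRight(8, 3) AS r  -- l = 8, r = 1
```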
0  docs/en/functions/comparison_functions.md  (Executable file → Normal file)
0  docs/en/functions/conditional_functions.md  (Executable file → Normal file)
2  docs/en/functions/date_time_functions.md  (Executable file → Normal file)

@@ -143,7 +143,7 @@ The same as 'today() - 1'.
## timeSlot

Rounds the time to the half hour.
This function is specific to Yandex.Metrica, since half an hour is the minimum amount of time for breaking a session into two sessions if a counter shows a single user's consecutive pageviews that differ in time by strictly more than this amount. This means that tuples (the counter number, user ID, and time slot) can be used to search for pageviews that are included in the corresponding session.
This function is specific to Yandex.Metrica, since half an hour is the minimum amount of time for breaking a session into two sessions if a tracking tag shows a single user's consecutive pageviews that differ in time by strictly more than this amount. This means that tuples (the tag ID, user ID, and time slot) can be used to search for pageviews that are included in the corresponding session.

## timeSlots(StartTime, Duration)

0  docs/en/functions/encoding_functions.md  (Executable file → Normal file)
5  docs/en/functions/ext_dict_functions.md  (Executable file → Normal file)
@@ -15,12 +15,9 @@ For information on connecting and configuring external dictionaries, see "[Exter
## dictGetUUID

## dictGetString

`dictGetT('dict_name', 'attr_name', id)`

- Get the value of the attr_name attribute from the dict_name dictionary using the 'id' key.
`dict_name` and `attr_name` are constant strings.
`id`must be UInt64.
- Get the value of the attr_name attribute from the dict_name dictionary using the 'id' key.`dict_name` and `attr_name` are constant strings.`id`must be UInt64.
If there is no `id` key in the dictionary, it returns the default value specified in the dictionary description.

## dictGetTOrDefault
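An illustrative sketch (not from the diff) of the `dictGetT` call pattern described above, with a hypothetical dictionary `my_dict` that has a String attribute `name`:

```sql
-- The key must be UInt64; the attribute type determines which dictGetT variant to use.
SELECT dictGetString('my_dict', 'name', toUInt64(42)) AS name
```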
0  docs/en/functions/hash_functions.md  (Executable file → Normal file)
0  docs/en/functions/higher_order_functions.md  (Executable file → Normal file)
0  docs/en/functions/in_functions.md  (Executable file → Normal file)
2  docs/en/functions/index.md  (Executable file → Normal file)

@@ -10,7 +10,7 @@ In this section we discuss regular functions. For aggregate functions, see the s

In contrast to standard SQL, ClickHouse has strong typing. In other words, it doesn't make implicit conversions between types. Each function works for a specific set of types. This means that sometimes you need to use type conversion functions.

## Сommon subexpression elimination
## Common subexpression elimination

All expressions in a query that have the same AST (the same record or same result of syntactic parsing) are considered to have identical values. Such expressions are concatenated and executed once. Identical subqueries are also eliminated this way.
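An illustrative sketch (not from the diff) of the rule above: both occurrences of `cityHash64(URL)` share the same AST, so they are expected to be evaluated once and reused (the table and column names are hypothetical):

```sql
SELECT
    cityHash64(URL) AS h,
    cityHash64(URL) % 16 AS bucket  -- same AST as above: computed once, then reused
FROM hits
```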
0  docs/en/functions/ip_address_functions.md  (Executable file → Normal file)
0  docs/en/functions/json_functions.md  (Executable file → Normal file)
1  docs/en/functions/logical_functions.md  (Executable file → Normal file)

@@ -11,3 +11,4 @@ Zero as an argument is considered "false," while any non-zero value is considere
## not, NOT operator

## xor
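An illustrative sketch (not from the diff) of the `xor` function added above, using the zero/non-zero convention described for logical functions:

```sql
SELECT xor(1, 0) AS a, xor(1, 1) AS b  -- a = 1, b = 0
```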
0  docs/en/functions/math_functions.md  (Executable file → Normal file)
10  docs/en/functions/other_functions.md  (Executable file → Normal file)

@@ -59,8 +59,13 @@ For elements in a nested data structure, the function checks for the existence o

Allows building a unicode-art diagram.

`bar (x, min, max, width)` – Draws a band with a width proportional to (x - min) and equal to 'width' characters when x == max.
`min, max` – Integer constants. The value must fit in Int64.`width` – Constant, positive number, may be a fraction.
`bar (x, min, max, width)` draws a band with a width proportional to `(x - min)` and equal to `width` characters when `x = max`.

Parameters:

- `x` – Value to display.
- `min, max` – Integer constants. The value must fit in Int64.
- `width` – Constant, positive number, may be a fraction.

The band is drawn with accuracy to one eighth of a symbol.
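An illustrative sketch (not from the diff) of the `bar(x, min, max, width)` signature described above, scaled over the values 0..9:

```sql
-- At x = 9 the band is 18 characters wide; smaller values get proportionally shorter bands.
SELECT number AS x, bar(x, 0, 9, 18) AS chart
FROM system.numbers
LIMIT 10
```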
@@ -278,4 +283,3 @@ The inverse function of MACNumToString. If the MAC address has an invalid format
## MACStringToOUI(s)

Accepts a MAC address in the format AA:BB:CC:DD:EE:FF (colon-separated numbers in hexadecimal form). Returns the first three octets as a UInt64 number. If the MAC address has an invalid format, it returns 0.

0  docs/en/functions/random_functions.md  (Executable file → Normal file)
0  docs/en/functions/rounding_functions.md  (Executable file → Normal file)
0  docs/en/functions/splitting_merging_functions.md  (Executable file → Normal file)
0  docs/en/functions/string_functions.md  (Executable file → Normal file)
1  docs/en/functions/string_replace_functions.md  (Executable file → Normal file)

@@ -76,3 +76,4 @@ SELECT replaceRegexpAll('Hello, World!', '^', 'here: ') AS res
│ here: Hello, World! │
└─────────────────────┘
```

11  docs/en/functions/string_search_functions.md  (Executable file → Normal file)
@@ -5,14 +5,16 @@ The search substring or regular expression must be a constant in all these funct

## position(haystack, needle)

Search for the 'needle' substring in the 'haystack' string.
Search for the `needle` substring in the `haystack` string.
Returns the position (in bytes) of the found substring, starting from 1, or returns 0 if the substring was not found.
It has also chimpanzees.

For case-insensitive search use `positionCaseInsensitive` function.

## positionUTF8(haystack, needle)

The same as 'position', but the position is returned in Unicode code points. Works under the assumption that the string contains a set of bytes representing a UTF-8 encoded text. If this assumption is not met, it returns some result (it doesn't throw an exception).
There is also a positionCaseInsensitiveUTF8 function.
The same as `position`, but the position is returned in Unicode code points. Works under the assumption that the string contains a set of bytes representing a UTF-8 encoded text. If this assumption is not met, it returns some result (it doesn't throw an exception).

For case-insensitive search use `positionCaseInsensitiveUTF8` function.

## match(haystack, pattern)
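An illustrative sketch (not from the diff) of `position` and the case-insensitive variant mentioned above (byte positions start at 1):

```sql
SELECT
    position('Hello, world!', 'world') AS p,                 -- 8
    positionCaseInsensitive('Hello, world!', 'WORLD') AS ci  -- 8
```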
@@ -49,4 +51,3 @@ For other regular expressions, the code is the same as for the 'match' function.
## notLike(haystack, pattern), haystack NOT LIKE pattern operator

The same thing as 'like', but negative.

0  docs/en/functions/type_conversion_functions.md  (Executable file → Normal file)
0  docs/en/functions/url_functions.md  (Executable file → Normal file)
0  docs/en/functions/ym_dict_functions.md  (Executable file → Normal file)
1  docs/en/getting_started/example_datasets/amplab_benchmark.md  (Executable file → Normal file)

@@ -118,4 +118,3 @@ GROUP BY sourceIP
ORDER BY totalRevenue DESC
LIMIT 1
```

0  docs/en/getting_started/example_datasets/criteo.md  (Executable file → Normal file)
10  docs/en/getting_started/example_datasets/nyc_taxi.md  (Executable file → Normal file)
@@ -301,14 +301,19 @@ SELECT passenger_count, toYear(pickup_date) AS year, count(*) FROM trips_mergetr
Q4:

```sql
SELECT passenger_count, toYear(pickup_date) AS year, round(trip_distance) AS distance, count(*)FROM trips_mergetreeGROUP BY passenger_count, year, distanceORDER BY year, count(*) DESC
SELECT passenger_count, toYear(pickup_date) AS year, round(trip_distance) AS distance, count(*)
FROM trips_mergetree
GROUP BY passenger_count, year, distance
ORDER BY year, count(*) DESC
```

3.593 seconds.

The following server was used:

Two Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz, 16 physical kernels total,128 GiB RAM,8x6 TB HD on hardware RAID-5
Two Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz, 16 physical kernels total,
128 GiB RAM,
8x6 TB HD on hardware RAID-5

Execution time is the best of three runsBut starting from the second run, queries read data from the file system cache. No further caching occurs: the data is read out and processed in each run.

@@ -361,4 +366,3 @@ nodes Q1 Q2 Q3 Q4
3 0.212 0.438 0.733 1.241
140 0.028 0.043 0.051 0.072
```

1  docs/en/getting_started/example_datasets/ontime.md  (Executable file → Normal file)

@@ -316,4 +316,3 @@ SELECT OriginCityName, DestCityName, count() AS c FROM ontime GROUP BY OriginCit

SELECT OriginCityName, count() AS c FROM ontime GROUP BY OriginCityName ORDER BY c DESC LIMIT 10;
```

1  docs/en/getting_started/example_datasets/star_schema.md  (Executable file → Normal file)

@@ -82,4 +82,3 @@ Downloading data (change 'customer' to 'customerd' in the distributed version):
cat customer.tbl | sed 's/$/2000-01-01/' | clickhouse-client --query "INSERT INTO customer FORMAT CSV"
cat lineorder.tbl | clickhouse-client --query "INSERT INTO lineorder FORMAT CSV"
```

1  docs/en/getting_started/example_datasets/wikistat.md  (Executable file → Normal file)

@@ -24,4 +24,3 @@ for i in {2007..2016}; do for j in {01..12}; do echo $i-$j >&2; curl -sSL "http:
cat links.txt | while read link; do wget http://dumps.wikimedia.org/other/pagecounts-raw/$(echo $link | sed -r 's/pagecounts-([0-9]{4})([0-9]{2})[0-9]{2}-[0-9]+\.gz/\1/')/$(echo $link | sed -r 's/pagecounts-([0-9]{4})([0-9]{2})[0-9]{2}-[0-9]+\.gz/\1-\2/')/$link; done
ls -1 /opt/wikistat/ | grep gz | while read i; do echo $i; gzip -cd /opt/wikistat/$i | ./wikistat-loader --time="$(echo -n $i | sed -r 's/pagecounts-([0-9]{4})([0-9]{2})([0-9]{2})-([0-9]{2})([0-9]{2})([0-9]{2})\.gz/\1-\2-\3 \4-00-00/')" | clickhouse-client --query="INSERT INTO wikistat FORMAT TabSeparated"; done
```

9  docs/en/getting_started/index.md  (Executable file → Normal file)
@@ -16,7 +16,7 @@ The terminal must use UTF-8 encoding (the default in Ubuntu).

For testing and development, the system can be installed on a single server or on a desktop computer.

### Installing from packages Debian/Ubuntu
### Installing from packages for Debian/Ubuntu

In `/etc/apt/sources.list` (or in a separate `/etc/apt/sources.list.d/clickhouse.list` file), add the repository:

@@ -34,8 +34,7 @@ sudo apt-get update
sudo apt-get install clickhouse-client clickhouse-server
```

You can also download and install packages manually from here:
<https://repo.yandex.ru/clickhouse/deb/stable/main/>
You can also download and install packages manually from here: <https://repo.yandex.ru/clickhouse/deb/stable/main/>.

ClickHouse contains access restriction settings. They are located in the 'users.xml' file (next to 'config.xml').
By default, access is allowed from anywhere for the 'default' user, without a password. See 'user/default/networks'.

@@ -101,8 +100,7 @@ clickhouse-client
```

The default parameters indicate connecting with localhost:9000 on behalf of the user 'default' without a password.
The client can be used for connecting to a remote server.
Example:
The client can be used for connecting to a remote server. Example:

```bash
clickhouse-client --host=example.com

@@ -134,3 +132,4 @@ SELECT 1
**Congratulations, the system works!**

To continue experimenting, you can try to download from the test data sets.

3  docs/en/index.md  (Executable file → Normal file)
@@ -39,7 +39,7 @@ We'll say that the following is true for the OLAP (online analytical processing)
- Data is updated in fairly large batches (> 1000 rows), not by single rows; or it is not updated at all.
- Data is added to the DB but is not modified.
- For reads, quite a large number of rows are extracted from the DB, but only a small subset of columns.
- Tables are "wide", meaning they contain a large number of columns.
- Tables are "wide," meaning they contain a large number of columns.
- Queries are relatively rare (usually hundreds of queries per server or less per second).
- For simple queries, latencies around 50 ms are allowed.
- Column values are fairly small: numbers and short strings (for example, 60 bytes per URL).

@@ -120,3 +120,4 @@ There are two ways to do this:
This is not done in "normal" databases, because it doesn't make sense when running simple queries. However, there are exceptions. For example, MemSQL uses code generation to reduce latency when processing SQL queries. (For comparison, analytical DBMSs require optimization of throughput, not latency.)

Note that for CPU efficiency, the query language must be declarative (SQL or MDX), or at least a vector (J, K). The query should only contain implicit loops, allowing for optimization.

1  docs/en/interfaces/cli.md  (Executable file → Normal file)

@@ -31,6 +31,7 @@ _EOF

cat file.csv | clickhouse-client --database=test --query="INSERT INTO test FORMAT CSV";
```
In batch mode, the default data format is TabSeparated. You can set the format in the FORMAT clause of the query.

By default, you can only process a single query in batch mode. To make multiple queries from a "script," use the --multiquery parameter. This works for all queries except INSERT. Query results are output consecutively without additional separators.
Similarly, to process a large number of queries, you can run 'clickhouse-client' for each query. Note that it may take tens of milliseconds to launch the 'clickhouse-client' program.

17  docs/en/interfaces/http_interface.md  (Executable file → Normal file)
@@ -130,14 +130,13 @@ POST 'http://localhost:8123/?query=DROP TABLE t'

For successful requests that don't return a data table, an empty response body is returned.

You can use compression when transmitting data.
You can use the internal ClickHouse compression format when transmitting data. The compressed data has a non-standard format, and you will need to use the special clickhouse-compressor program to work with it (it is installed with the clickhouse-client package).

For using ClickHouse internal compression format, and you will need to use the special compressor program to work with it (sudo apt-get install compressor-metrika-yandex).
If you specified 'compress=1' in the URL, the server will compress the data it sends you.
If you specified 'decompress=1' in the URL, the server will decompress the same data that you pass in the POST method.

Also standard gzip-based HTTP compression can be used. To send gzip compressed POST data just add `Content-Encoding: gzip` to request headers, and gzip POST body.
To get response compressed, you need to add `Accept-Encoding: gzip` to request headers, and turn on ClickHouse setting called `enable_http_compression`.
It is also possible to use the standard gzip-based HTTP compression. To send a POST request compressed using gzip, append the request header `Content-Encoding: gzip`.
In order for ClickHouse to compress the response using gzip, you must append `Accept-Encoding: gzip` to the request headers, and enable the ClickHouse setting `enable_http_compression`.

You can use this to reduce network traffic when transmitting a large amount of data, or for creating dumps that are immediately compressed.

@@ -174,7 +173,8 @@ echo 'SELECT 1' | curl 'http://localhost:8123/?user=user&password=password' -d @
```

If the user name is not indicated, the username 'default' is used. If the password is not indicated, an empty password is used.
You can also use the URL parameters to specify any settings for processing a single query, or entire profiles of settings. Example:http://localhost:8123/?profile=web&max_rows_to_read=1000000000&query=SELECT+1
You can also use the URL parameters to specify any settings for processing a single query, or entire profiles of settings. Example:
http://localhost:8123/?profile=web&max_rows_to_read=1000000000&query=SELECT+1

For more information, see the section "Settings".

@@ -194,11 +194,11 @@ $ echo 'SELECT number FROM system.numbers LIMIT 10' | curl 'http://localhost:812

For information about other parameters, see the section "SET".

You can use ClickHouse sessions in the HTTP protocol. To do this, you need to specify the `session_id` GET parameter in HTTP request. You can use any alphanumeric string as a session_id. By default session will be timed out after 60 seconds of inactivity. You can change that by setting `default_session_timeout` in server config file, or by adding GET parameter `session_timeout`. You can also check the status of the session by using GET parameter `session_check=1`. When using sessions you can't run 2 queries with the same session_id simultaneously.
Similarly, you can use ClickHouse sessions in the HTTP protocol. To do this, you need to add the `session_id` GET parameter to the request. You can use any string as the session ID. By default, the session is terminated after 60 seconds of inactivity. To change this timeout, modify the `default_session_timeout` setting in the server configuration, or add the `session_timeout` GET parameter to the request. To check the session status, use the `session_check=1` parameter. Only one query at a time can be executed within a single session.

You can get the progress of query execution in X-ClickHouse-Progress headers, by enabling setting send_progress_in_http_headers.
You have the option to receive information about the progress of query execution in X-ClickHouse-Progress headers. To do this, enable the setting send_progress_in_http_headers.

Running query are not aborted automatically after closing HTTP connection. Parsing and data formatting are performed on the server side, and using the network might be ineffective.
Running requests don't stop automatically if the HTTP connection is lost. Parsing and data formatting are performed on the server side, and using the network might be ineffective.
The optional 'query_id' parameter can be passed as the query ID (any string). For more information, see the section "Settings, replace_running_query".

The optional 'quota_key' parameter can be passed as the quota key (any string). For more information, see the section "Quotas".

@@ -220,3 +220,4 @@ curl -sS 'http://localhost:8123/?max_result_bytes=4000000&buffer_size=3000000&wa
```

Use buffering to avoid situations where a query processing error occurred after the response code and HTTP headers were sent to the client. In this situation, an error message is written at the end of the response body, and on the client side, the error can only be detected at the parsing stage.
0  docs/en/interfaces/index.md  (Executable file → Normal file)
0  docs/en/interfaces/jdbc.md  (Executable file → Normal file)
0  docs/en/interfaces/tcp.md  (Executable file → Normal file)
51  docs/en/interfaces/third-party_client_libraries.md  (Executable file → Normal file)
@@ -3,41 +3,40 @@
There are libraries for working with ClickHouse for:

- Python
    - [infi.clickhouse_orm](https://github.com/Infinidat/infi.clickhouse_orm)
    - [sqlalchemy-clickhouse](https://github.com/cloudflare/sqlalchemy-clickhouse)
    - [clickhouse-driver](https://github.com/mymarilyn/clickhouse-driver)
    - [clickhouse-client](https://github.com/yurial/clickhouse-client)
    - [infi.clickhouse_orm](https://github.com/Infinidat/infi.clickhouse_orm)
    - [sqlalchemy-clickhouse](https://github.com/cloudflare/sqlalchemy-clickhouse)
    - [clickhouse-driver](https://github.com/mymarilyn/clickhouse-driver)
    - [clickhouse-client](https://github.com/yurial/clickhouse-client)
- PHP
    - [clickhouse-php-client](https://github.com/8bitov/clickhouse-php-client)
    - [PhpClickHouseClient](https://github.com/SevaCode/PhpClickHouseClient)
    - [phpClickHouse](https://github.com/smi2/phpClickHouse)
    - [clickhouse-client](https://github.com/bozerkins/clickhouse-client)
    - [clickhouse-php-client](https://github.com/8bitov/clickhouse-php-client)
    - [PhpClickHouseClient](https://github.com/SevaCode/PhpClickHouseClient)
    - [phpClickHouse](https://github.com/smi2/phpClickHouse)
    - [clickhouse-client](https://github.com/bozerkins/clickhouse-client)
- Go
    - [clickhouse](https://github.com/kshvakov/clickhouse/)
    - [go-clickhouse](https://github.com/roistat/go-clickhouse)
    - [mailrugo-clickhouse](https://github.com/mailru/go-clickhouse)
    - [golang-clickhouse](https://github.com/leprosus/golang-clickhouse)
    - [clickhouse](https://github.com/kshvakov/clickhouse/)
    - [go-clickhouse](https://github.com/roistat/go-clickhouse)
    - [mailrugo-clickhouse](https://github.com/mailru/go-clickhouse)
    - [golang-clickhouse](https://github.com/leprosus/golang-clickhouse)
- NodeJs
    - [clickhouse (NodeJs)](https://github.com/TimonKK/clickhouse)
    - [node-clickhouse](https://github.com/apla/node-clickhouse)
    - [clickhouse (NodeJs)](https://github.com/TimonKK/clickhouse)
    - [node-clickhouse](https://github.com/apla/node-clickhouse)
- Perl
    - [perl-DBD-ClickHouse](https://github.com/elcamlost/perl-DBD-ClickHouse)
    - [HTTP-ClickHouse](https://metacpan.org/release/HTTP-ClickHouse)
    - [AnyEvent-ClickHouse](https://metacpan.org/release/AnyEvent-ClickHouse)
    - [perl-DBD-ClickHouse](https://github.com/elcamlost/perl-DBD-ClickHouse)
    - [HTTP-ClickHouse](https://metacpan.org/release/HTTP-ClickHouse)
    - [AnyEvent-ClickHouse](https://metacpan.org/release/AnyEvent-ClickHouse)
- Ruby
    - [clickhouse (Ruby)](https://github.com/archan937/clickhouse)
    - [clickhouse (Ruby)](https://github.com/archan937/clickhouse)
- R
    - [clickhouse-r](https://github.com/hannesmuehleisen/clickhouse-r)
    - [RClickhouse](https://github.com/IMSMWU/RClickhouse)
    - [clickhouse-r](https://github.com/hannesmuehleisen/clickhouse-r)
    - [RClickhouse](https://github.com/IMSMWU/RClickhouse)
- .NET
    - [ClickHouse-Net](https://github.com/killwort/ClickHouse-Net)
    - [ClickHouse-Net](https://github.com/killwort/ClickHouse-Net)
- C++
    - [clickhouse-cpp](https://github.com/artpaul/clickhouse-cpp/)
    - [clickhouse-cpp](https://github.com/artpaul/clickhouse-cpp/)
- Elixir
    - [clickhousex](https://github.com/appodeal/clickhousex/)
    - [clickhouse_ecto](https://github.com/appodeal/clickhouse_ecto)
    - [clickhousex](https://github.com/appodeal/clickhousex/)
    - [clickhouse_ecto](https://github.com/appodeal/clickhouse_ecto)
- Java
    - [clickhouse-client-java](https://github.com/VirtusAI/clickhouse-client-java)
    - [clickhouse-client-java](https://github.com/VirtusAI/clickhouse-client-java)

We have not tested these libraries. They are listed in random order.
0  docs/en/interfaces/third-party_gui.md  (Executable file → Normal file)
0  docs/en/introduction/distinctive_features.md  (Executable file → Normal file)
0  docs/en/introduction/features_considered_disadvantages.md  (Executable file → Normal file)
0  docs/en/introduction/index.md  (Executable file → Normal file)
Some files were not shown because too many files have changed in this diff.