Merge pull request #2265 from BayoNet/master

Update of English translation.
This commit is contained in:
alexey-milovidov 2018-05-08 18:40:11 +03:00 committed by GitHub
commit 378c06120b
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
190 changed files with 1914 additions and 1724 deletions

0 docs/en/agg_functions/combinators.md Executable file → Normal file
0 docs/en/agg_functions/index.md Executable file → Normal file
0 docs/en/agg_functions/parametric_functions.md Executable file → Normal file
13 docs/en/agg_functions/reference.md Executable file → Normal file

@@ -19,7 +19,7 @@ In some cases, you can rely on the order of execution. This applies to cases whe
When a `SELECT` query has the `GROUP BY` clause or at least one aggregate function, ClickHouse (in contrast to MySQL) requires that all expressions in the `SELECT`, `HAVING`, and `ORDER BY` clauses be calculated from keys or from aggregate functions. In other words, each column selected from the table must be used either in keys or inside aggregate functions. To get behavior like in MySQL, you can put the other columns in the `any` aggregate function.
## anyHeavy
## anyHeavy(x)
Selects a frequently occurring value using the [heavy hitters](http://www.cs.umd.edu/~samir/498/karp.pdf) algorithm. If there is a value that occurs in more than half the cases in each of the query's execution threads, this value is returned. Normally, the result is nondeterministic.
@@ -28,7 +28,6 @@ anyHeavy(column)
```
**Arguments**
- `column` The column name.
**Example**
@@ -39,6 +38,7 @@ Take the [OnTime](../getting_started/example_datasets/ontime.md#example_datasets
SELECT anyHeavy(AirlineID) AS res
FROM ontime
```
```
┌───res─┐
│ 19690 │
@@ -169,7 +169,7 @@ In some cases, you can still rely on the order of execution. This applies to cas
<a name="agg_functions_groupArrayInsertAt"></a>
## groupArrayInsertAt
## groupArrayInsertAt(x)
Inserts a value into the array in the specified position.
@@ -256,7 +256,7 @@ The performance of the function is lower than for `quantile`, `quantileTiming`
The result depends on the order of running the query, and is nondeterministic.
## median
## median(x)
All the quantile functions have corresponding median functions: `median`, `medianDeterministic`, `medianTiming`, `medianTimingWeighted`, `medianExact`, `medianExactWeighted`, `medianTDigest`. They are synonyms and their behavior is identical.
@@ -286,11 +286,11 @@ The result is equal to the square root of `varSamp(x)`.
The result is equal to the square root of `varPop(x)`.
## topK
## topK(N)(column)
Returns an array of the most frequent values in the specified column. The resulting array is sorted in descending order of frequency of values (not by the values themselves).
Implements the [ Filtered Space-Saving](http://www.l2f.inesc-id.pt/~fmmb/wiki/uploads/Work/misnis.ref0a.pdf) algorithm for analyzing TopK, based on the reduce-and-combine algorithm from [Parallel Space Saving](https://arxiv.org/pdf/1401.0702.pdf).
Implements the [Filtered Space-Saving](http://www.l2f.inesc-id.pt/~fmmb/wiki/uploads/Work/misnis.ref0a.pdf) algorithm for analyzing TopK, based on the reduce-and-combine algorithm from [Parallel Space Saving](https://arxiv.org/pdf/1401.0702.pdf).
```
topK(N)(column)
@@ -301,7 +301,6 @@ This function doesn't provide a guaranteed result. In certain situations, errors
We recommend using the `N < 10` value; performance is reduced with large `N` values. Maximum value of `N = 65536`.
**Arguments**
- 'N' is the number of values.
- ' x ' The column.
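The `anyHeavy` and `topK` entries above both build on the heavy-hitters / Space-Saving family of algorithms. A minimal single-threaded Python sketch of the core Space-Saving idea (not ClickHouse's actual implementation, which runs per-thread and merges states):

```python
def space_saving_topk(stream, k):
    """Approximate top-k frequent items (Space-Saving sketch):
    keep at most k counters; when the table is full, evict the
    smallest counter and let the newcomer inherit its count
    (which bounds the overestimate)."""
    counts = {}
    for item in stream:
        if item in counts:
            counts[item] += 1
        elif len(counts) < k:
            counts[item] = 1
        else:
            victim = min(counts, key=counts.get)
            counts[item] = counts.pop(victim) + 1
    # Most frequent first, as topK's result array is ordered.
    return sorted(counts, key=counts.get, reverse=True)
```

As the docs note, the result is approximate: an element arriving late can inherit a large evicted count, which is why `N < 10` is recommended and the output is not guaranteed.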

0 docs/en/data_types/array.md Executable file → Normal file
0 docs/en/data_types/boolean.md Executable file → Normal file
0 docs/en/data_types/date.md Executable file → Normal file
0 docs/en/data_types/datetime.md Executable file → Normal file
0 docs/en/data_types/enum.md Executable file → Normal file
0 docs/en/data_types/fixedstring.md Executable file → Normal file
3 docs/en/data_types/float.md Executable file → Normal file

@@ -5,7 +5,7 @@
The types are equivalent to the C types:
- `Float32` - `float`
- `Float64` - ` double`
- `Float64` - `double`
We recommend that you store data in integer form whenever possible. For example, convert fixed precision numbers to integer values, such as monetary amounts or page load times in milliseconds.
@@ -16,7 +16,6 @@ We recommend that you store data in integer form whenever possible. For example,
```sql
SELECT 1 - 0.9
```
```
┌───────minus(1, 0.9)─┐
│ 0.09999999999999998 │
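The `0.09999999999999998` result in the diff above is standard IEEE 754 behavior, and it is exactly why the section recommends storing fixed-precision data as integers. A short Python illustration of the same arithmetic:

```python
# 0.9 has no exact binary representation, so 1 - 0.9 is not exactly 0.1,
# matching the ClickHouse output shown above.
diff = 1 - 0.9
print(diff)  # 0.09999999999999998
assert diff != 0.1

# Scaling to integer units (e.g. cents, milliseconds) keeps arithmetic exact:
price_cents = 100 - 90
assert price_cents == 10
```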

0 docs/en/data_types/index.md Executable file → Normal file
0 docs/en/data_types/int_uint.md Executable file → Normal file
1 docs/en/data_types/nested_data_structures/index.md Executable file → Normal file

@@ -1 +1,2 @@
# Nested data structures

0 docs/en/data_types/nested_data_structures/nested.md Executable file → Normal file
0 docs/en/data_types/special_data_types/expression.md Executable file → Normal file
0 docs/en/data_types/special_data_types/index.md Executable file → Normal file
0 docs/en/data_types/special_data_types/set.md Executable file → Normal file
0 docs/en/data_types/string.md Executable file → Normal file
0 docs/en/data_types/tuple.md Executable file → Normal file

1037 docs/en/development/style.md Executable file → Normal file

File diff suppressed because it is too large

15 docs/en/dicts/external_dicts.md Executable file → Normal file

@@ -21,11 +21,12 @@ The dictionary config file has the following format:
<!--Optional element. File name with substitutions-->
<include_from>/etc/metrika.xml</include_from>
<dictionary>
<!-- Dictionary configuration -->
<!-- Dictionary configuration -->
</dictionary>
...
<dictionary>
@@ -43,11 +44,3 @@ See also "[Functions for working with external dictionaries](../functions/ext_di
You can convert values for a small dictionary by describing it in a `SELECT` query (see the [transform](../functions/other_functions.md#other_functions-transform) function). This functionality is not related to external dictionaries.
</div>
```eval_rst
.. toctree::
:glob:
external_dicts_dict*
```

2 docs/en/dicts/external_dicts_dict.md Executable file → Normal file

@@ -27,7 +27,7 @@ The dictionary configuration has the following structure:
```
- name The identifier that can be used to access the dictionary. Use the characters `[a-zA-Z0-9_\-]`.
- [source](external_dicts_dict_sources.md/#dicts-external_dicts_dict_sources) — Source of the dictionary .
- [source](external_dicts_dict_sources.md#dicts-external_dicts_dict_sources) — Source of the dictionary.
- [layout](external_dicts_dict_layout.md#dicts-external_dicts_dict_layout) — Dictionary layout in memory.
- [structure](external_dicts_dict_structure.md#dicts-external_dicts_dict_structure) — Structure of the dictionary. A key and attributes that can be retrieved by this key.
- [lifetime](external_dicts_dict_lifetime.md#dicts-external_dicts_dict_lifetime) — Frequency of dictionary updates.

65 docs/en/dicts/external_dicts_dict_layout.md Executable file → Normal file

@@ -2,11 +2,11 @@
# Storing dictionaries in memory
There are [many different ways](external_dicts_dict_layout#dicts-external_dicts_dict_layout-manner) to store dictionaries in memory.
There are a [variety of ways](#dicts-external_dicts_dict_layout-manner) to store dictionaries in memory.
We recommend [flat](external_dicts_dict_layout#dicts-external_dicts_dict_layout-flat), [hashed](external_dicts_dict_layout#dicts-external_dicts_dict_layout-hashed), and [complex_key_hashed](external_dicts_dict_layout#dicts-external_dicts_dict_layout-complex_key_hashed). which provide optimal processing speed.
We recommend [flat](#dicts-external_dicts_dict_layout-flat), [hashed](#dicts-external_dicts_dict_layout-hashed) and [complex_key_hashed](#dicts-external_dicts_dict_layout-complex_key_hashed), which provide optimal processing speed.
Caching is not recommended because of potentially poor performance and difficulties in selecting optimal parameters. Read more about this in the "[cache](external_dicts_dict_layout#dicts-external_dicts_dict_layout-cache)" section.
Caching is not recommended because of potentially poor performance and difficulties in selecting optimal parameters. Read more in the section "[cache](#dicts-external_dicts_dict_layout-cache)".
There are several ways to improve dictionary performance:
@@ -88,7 +88,7 @@ Configuration example:
### complex_key_hashed
This type of storage is designed for use with compound [keys](external_dicts_dict_structure#dicts-external_dicts_dict_structure). It is similar to hashed.
This type of storage is for use with composite [keys](external_dicts_dict_structure.md#dicts-external_dicts_dict_structure). Similar to `hashed`.
Configuration example:
@@ -109,18 +109,18 @@ This storage method works the same way as hashed and allows using date/time rang
Example: The table contains discounts for each advertiser in the format:
```
+---------------+---------------------+-------------------+--------+
| advertiser id | discount start date | discount end date | amount |
+===============+=====================+===================+========+
| 123 | 2015-01-01 | 2015-01-15 | 0.15 |
+---------------+---------------------+-------------------+--------+
| 123 | 2015-01-16 | 2015-01-31 | 0.25 |
+---------------+---------------------+-------------------+--------+
| 456 | 2015-01-01 | 2015-01-15 | 0.05 |
+---------------+---------------------+-------------------+--------+
+---------------+---------------------+-------------------+--------+
| advertiser id | discount start date | discount end date | amount |
+===============+=====================+===================+========+
| 123 | 2015-01-01 | 2015-01-15 | 0.15 |
+---------------+---------------------+-------------------+--------+
| 123 | 2015-01-16 | 2015-01-31 | 0.25 |
+---------------+---------------------+-------------------+--------+
| 456 | 2015-01-01 | 2015-01-15 | 0.05 |
+---------------+---------------------+-------------------+--------+
```
To use a sample for date ranges, define `range_min` and `range_max` in [structure](external_dicts_dict_structure#dicts-external_dicts_dict_structure).
To use a sample for date ranges, define the `range_min` and `range_max` elements in the [structure](external_dicts_dict_structure.md#dicts-external_dicts_dict_structure).
Example:
@@ -140,7 +140,9 @@ Example:
To work with these dictionaries, you need to pass an additional date argument to the `dictGetT` function:
dictGetT('dict_name', 'attr_name', id, date)
```
dictGetT('dict_name', 'attr_name', id, date)
```
This function returns the value for the specified `id`s and the date range that includes the passed date.
@@ -191,13 +193,13 @@ The dictionary is stored in a cache that has a fixed number of cells. These cell
When searching for a dictionary, the cache is searched first. For each block of data, all keys that are not found in the cache or are outdated are requested from the source using `SELECT attrs... FROM db.table WHERE id IN (k1, k2, ...)`. The received data is then written to the cache.
For cache dictionaries, the expiration (lifetime &lt;dicts-external_dicts_dict_lifetime&gt;) of data in the cache can be set. If more time than `lifetime` has passed since loading the data in a cell, the cell's value is not used, and it is re-requested the next time it needs to be used.
For cache dictionaries, the expiration [lifetime](external_dicts_dict_lifetime.md#dicts-external_dicts_dict_lifetime) of data in the cache can be set. If more time than `lifetime` has passed since loading the data in a cell, the cell's value is not used, and it is re-requested the next time it needs to be used.
This is the least effective of all the ways to store dictionaries. The speed of the cache depends strongly on correct settings and the usage scenario. A cache type dictionary performs well only when the hit rates are high enough (recommended 99% and higher). You can view the average hit rate in the `system.dictionaries` table.
To improve cache performance, use a subquery with `LIMIT`, and call the function with the dictionary externally.
Supported [sources](external_dicts_dict_sources#dicts-external_dicts_dict_sources): MySQL, ClickHouse, executable, HTTP.
Supported [sources](external_dicts_dict_sources.md#dicts-external_dicts_dict_sources): MySQL, ClickHouse, executable, HTTP.
Example of settings:
@@ -205,7 +207,7 @@ Example of settings:
<layout>
<cache>
<!-- The size of the cache, in number of cells. Rounded up to a power of two. -->
<size_in_cells>1000000000</size_in_cells>
<size_in_cells>1000000000</size_in_cells>
</cache>
</layout>
```
@@ -227,16 +229,15 @@ Do not use ClickHouse as a source, because it is slow to process queries with ra
### complex_key_cache
This type of storage is designed for use with compound [keys](external_dicts_dict_structure#dicts-external_dicts_dict_structure). Similar to `cache`.
This type of storage is for use with composite [keys](external_dicts_dict_structure.md#dicts-external_dicts_dict_structure). Similar to `cache`.
<a name="dicts-external_dicts_dict_layout-ip_trie"></a>
### ip_trie
This type of storage is for mapping network prefixes (IP addresses) to metadata such as ASN.
The table stores IP prefixes for each key (IP address), which makes it possible to map IP addresses to metadata such as ASN or threat score.
Example: in the table there are prefixes matches to AS number and country:
Example: The table contains network prefixes and their corresponding AS number and country code:
```
+-----------------+-------+--------+
@@ -252,7 +253,7 @@ Example: in the table there are prefixes matches to AS number and country:
+-----------------+-------+--------+
```
When using such a layout, the structure should have the "key" element.
When using this type of layout, the structure must have a composite key.
Example:
@@ -277,16 +278,20 @@ Example:
...
```
These key must have only one attribute of type String, containing a valid IP prefix. Other types are not yet supported.
The key must have only one String type attribute that contains an allowed IP prefix. Other types are not supported yet.
For querying, same functions (dictGetT with tuple) as for complex key dictionaries have to be used:
For queries, you must use the same functions (`dictGetT` with a tuple) as for dictionaries with composite keys:
dictGetT('dict_name', 'attr_name', tuple(ip))
```
dictGetT('dict_name', 'attr_name', tuple(ip))
```
The function accepts either UInt32 for IPv4 address or FixedString(16) for IPv6 address in wire format:
The function takes either `UInt32` for IPv4, or `FixedString(16)` for IPv6:
dictGetString('prefix', 'asn', tuple(IPv6StringToNum('2001:db8::1')))
```
dictGetString('prefix', 'asn', tuple(IPv6StringToNum('2001:db8::1')))
```
No other type is supported. The function returns attribute for a prefix matching the given IP address. If there are overlapping prefixes, the most specific one is returned.
Other types are not supported yet. The function returns the attribute for the prefix that corresponds to this IP address. If there are overlapping prefixes, the most specific one is returned.
The data is stored currently in a bitwise trie, it has to fit in memory.
Data is stored in a `trie`. It must completely fit into RAM.
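The `ip_trie` behavior described in this diff, returning the attribute of the most specific matching prefix, can be sketched in a few lines of Python. The prefix table and its ASN/country values here are hypothetical illustrations, not the section's elided example rows:

```python
import ipaddress

# Hypothetical prefix -> (ASN, country) table, in the spirit of the
# ip_trie example. A real trie gives O(prefix length) lookups; this
# linear scan only demonstrates the matching rule.
PREFIXES = {
    ipaddress.ip_network('203.0.113.0/24'): (64500, 'US'),
    ipaddress.ip_network('203.0.113.0/28'): (64501, 'GB'),  # more specific
    ipaddress.ip_network('2001:db8::/32'): (64502, 'RU'),
}

def lookup(ip):
    """Return metadata for the most specific prefix containing `ip`,
    or None when no prefix matches."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in PREFIXES
               if addr.version == net.version and addr in net]
    if not matches:
        return None
    # Overlapping prefixes: the longest (most specific) one wins.
    return PREFIXES[max(matches, key=lambda net: net.prefixlen)]
```

For example, an address inside the `/28` resolves to the `/28` entry even though the `/24` also contains it, matching the "most specific one is returned" rule.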

0 docs/en/dicts/external_dicts_dict_lifetime.md Executable file → Normal file

10 docs/en/dicts/external_dicts_dict_sources.md Executable file → Normal file

@@ -135,7 +135,7 @@ Installing unixODBC and the ODBC driver for PostgreSQL:
Configuring `/etc/odbc.ini` (or `~/.odbc.ini`):
```
[DEFAULT]
[DEFAULT]
Driver = myconnection
[myconnection]
@@ -159,9 +159,9 @@ The dictionary configuration in ClickHouse:
<dictionary>
<name>table_name</name>
<source>
<odbc>
<!-- You can specify the following parameters in connection_string: -->
<!-- DSN=myconnection;UID=username;PWD=password;HOST=127.0.0.1;PORT=5432;DATABASE=my_db -->
<odbc>
<!-- You can specify the following parameters in connection_string: -->
<!-- DSN=myconnection;UID=username;PWD=password;HOST=127.0.0.1;PORT=5432;DATABASE=my_db -->
<connection_string>DSN=myconnection</connection_string>
<table>postgresql_table</table>
</odbc>
@@ -195,7 +195,7 @@ Ubuntu OS.
Installing the driver:
```
sudo apt-get install tdsodbc freetds-bin sqsh
sudo apt-get install tdsodbc freetds-bin sqsh
```
Configuring the driver:

1 docs/en/dicts/external_dicts_dict_structure.md Executable file → Normal file

@@ -119,4 +119,3 @@ Configuration fields:
- `hierarchical` Hierarchical support. Mirrored to the parent identifier. By default, `false`.
- `injective` Whether the `id -> attribute` image is injective. If `true`, then you can optimize the `GROUP BY` clause. By default, `false`.
- `is_object_id` Whether the query is executed for a MongoDB document by `ObjectID`.

0 docs/en/dicts/index.md Executable file → Normal file
0 docs/en/dicts/internal_dicts.md Executable file → Normal file
0 docs/en/formats/capnproto.md Executable file → Normal file
0 docs/en/formats/csv.md Executable file → Normal file
0 docs/en/formats/csvwithnames.md Executable file → Normal file
0 docs/en/formats/index.md Executable file → Normal file
3 docs/en/formats/json.md Executable file → Normal file

@@ -39,7 +39,7 @@ SELECT SearchPhrase, count() AS c FROM test.hits GROUP BY SearchPhrase WITH TOTA
"c": "1549"
},
{
"SearchPhrase": "freeform photo",
"SearchPhrase": "freeform photos",
"c": "1480"
}
],
@@ -83,4 +83,3 @@ If the query contains GROUP BY, rows_before_limit_at_least is the exact number o
This format is only appropriate for outputting a query result, but not for parsing (retrieving data to insert in a table).
See also the JSONEachRow format.

2 docs/en/formats/jsoncompact.md Executable file → Normal file

@@ -24,7 +24,7 @@ Example:
["bathroom interior design", "2166"],
["yandex", "1655"],
["spring 2014 fashion", "1549"],
["freeform photo", "1480"]
["freeform photos", "1480"]
],
"totals": ["","8873898"],

0 docs/en/formats/jsoneachrow.md Executable file → Normal file
0 docs/en/formats/native.md Executable file → Normal file
0 docs/en/formats/null.md Executable file → Normal file
0 docs/en/formats/pretty.md Executable file → Normal file
0 docs/en/formats/prettycompact.md Executable file → Normal file
0 docs/en/formats/prettycompactmonoblock.md Executable file → Normal file
0 docs/en/formats/prettynoescapes.md Executable file → Normal file
0 docs/en/formats/prettyspace.md Executable file → Normal file
0 docs/en/formats/rowbinary.md Executable file → Normal file
0 docs/en/formats/tabseparated.md Executable file → Normal file
0 docs/en/formats/tabseparatedraw.md Executable file → Normal file
0 docs/en/formats/tabseparatedwithnames.md Executable file → Normal file
0 docs/en/formats/tabseparatedwithnamesandtypes.md Executable file → Normal file
0 docs/en/formats/tskv.md Executable file → Normal file
0 docs/en/formats/values.md Executable file → Normal file
0 docs/en/formats/vertical.md Executable file → Normal file


@@ -1,9 +1,10 @@
# VerticalRaw
Differs from `Vertical` format in that the rows are written without escaping.
Differs from `Vertical` format in that the rows are not escaped.
This format is only appropriate for outputting a query result, but not for parsing (retrieving data to insert in a table).
Samples:
Examples:
```
:) SHOW CREATE TABLE geonames FORMAT VerticalRaw;
Row 1:
@@ -15,8 +16,11 @@ Row 1:
──────
test: string with 'quotes' and with some special
characters
```
-- the same in Vertical format:
Compare with the Vertical format:
```
:) SELECT 'string with \'quotes\' and \t with some special \n characters' AS test FORMAT Vertical;
Row 1:
──────

5 docs/en/formats/xml.md Executable file → Normal file

@@ -35,7 +35,7 @@ XML format is suitable only for output, not for parsing. Example:
<field>1549</field>
</row>
<row>
<SearchPhrase>freeform photo</SearchPhrase>
<SearchPhrase>freeform photos</SearchPhrase>
<field>1480</field>
</row>
<row>
@@ -69,5 +69,6 @@ Just as for JSON, invalid UTF-8 sequences are changed to the replacement charact
In string values, the characters `<` and `&` are escaped as `<` and `&`.
Arrays are output as `<array><elem>Hello</elem><elem>World</elem>...</array>`,and tuples as `<tuple><elem>Hello</elem><elem>World</elem>...</tuple>`.
Arrays are output as `<array><elem>Hello</elem><elem>World</elem>...</array>`,
and tuples as `<tuple><elem>Hello</elem><elem>World</elem>...</tuple>`.

0 docs/en/functions/arithmetic_functions.md Executable file → Normal file

3 docs/en/functions/array_functions.md Executable file → Normal file

@@ -225,7 +225,6 @@ arrayPopFront(array)
```sql
SELECT arrayPopFront([1, 2, 3]) AS res
```
```
┌─res───┐
│ [2,3] │
@@ -250,6 +249,7 @@ arrayPushBack(array, single_value)
```sql
SELECT arrayPushBack(['a'], 'b') AS res
```
```
┌─res───────┐
│ ['a','b'] │
@@ -274,7 +274,6 @@ arrayPushFront(array, single_value)
```sql
SELECT arrayPushBack(['b'], 'a') AS res
```
```
┌─res───────┐
│ ['a','b'] │

1 docs/en/functions/array_join.md Executable file → Normal file

@@ -28,3 +28,4 @@ SELECT arrayJoin([1, 2, 3] AS src) AS dst, 'Hello', src
│ 3 │ Hello │ [1,2,3] │
└─────┴───────────┴─────────┘
```

1 docs/en/functions/bit_functions.md Executable file → Normal file

@@ -15,3 +15,4 @@ The result type is an integer with bits equal to the maximum bits of its argumen
## bitShiftLeft(a, b)
## bitShiftRight(a, b)

0 docs/en/functions/comparison_functions.md Executable file → Normal file
0 docs/en/functions/conditional_functions.md Executable file → Normal file
2 docs/en/functions/date_time_functions.md Executable file → Normal file

@@ -143,7 +143,7 @@ The same as 'today() - 1'.
## timeSlot
Rounds the time to the half hour.
This function is specific to Yandex.Metrica, since half an hour is the minimum amount of time for breaking a session into two sessions if a counter shows a single user's consecutive pageviews that differ in time by strictly more than this amount. This means that tuples (the counter number, user ID, and time slot) can be used to search for pageviews that are included in the corresponding session.
This function is specific to Yandex.Metrica, since half an hour is the minimum amount of time for breaking a session into two sessions if a tracking tag shows a single user's consecutive pageviews that differ in time by strictly more than this amount. This means that tuples (the tag ID, user ID, and time slot) can be used to search for pageviews that are included in the corresponding session.
## timeSlots(StartTime, Duration)
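The documented half-hour rounding of `timeSlot`, and the slot enumeration of `timeSlots`, can be sketched in Python. This is a sketch of the documented behavior, not ClickHouse's implementation:

```python
from datetime import datetime, timedelta

def time_slot(t):
    """Round a datetime down to the half hour, like the documented timeSlot."""
    return t.replace(minute=(t.minute // 30) * 30, second=0, microsecond=0)

def time_slots(start, duration_seconds):
    """Half-hour slots covering [start, start + duration], like timeSlots
    (assumed semantics based on the function's signature and description)."""
    slot = time_slot(start)
    end = start + timedelta(seconds=duration_seconds)
    out = []
    while slot <= end:
        out.append(slot)
        slot += timedelta(minutes=30)
    return out
```

Two pageviews then belong to the same session candidate when their `(tag ID, user ID, time slot)` tuples coincide, which is the lookup described above.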

0 docs/en/functions/encoding_functions.md Executable file → Normal file

5 docs/en/functions/ext_dict_functions.md Executable file → Normal file

@@ -15,12 +15,9 @@ For information on connecting and configuring external dictionaries, see "[Exter
## dictGetUUID
## dictGetString
`dictGetT('dict_name', 'attr_name', id)`
- Get the value of the attr_name attribute from the dict_name dictionary using the 'id' key.
`dict_name` and `attr_name` are constant strings.
`id`must be UInt64.
- Get the value of the attr_name attribute from the dict_name dictionary using the 'id' key. `dict_name` and `attr_name` are constant strings. `id` must be UInt64.
If there is no `id` key in the dictionary, it returns the default value specified in the dictionary description.
## dictGetTOrDefault

0 docs/en/functions/hash_functions.md Executable file → Normal file
0 docs/en/functions/higher_order_functions.md Executable file → Normal file
0 docs/en/functions/in_functions.md Executable file → Normal file
2 docs/en/functions/index.md Executable file → Normal file

@@ -10,7 +10,7 @@ In this section we discuss regular functions. For aggregate functions, see the s
In contrast to standard SQL, ClickHouse has strong typing. In other words, it doesn't make implicit conversions between types. Each function works for a specific set of types. This means that sometimes you need to use type conversion functions.
## Сommon subexpression elimination
## Common subexpression elimination
All expressions in a query that have the same AST (the same record or same result of syntactic parsing) are considered to have identical values. Such expressions are concatenated and executed once. Identical subqueries are also eliminated this way.

0 docs/en/functions/ip_address_functions.md Executable file → Normal file
0 docs/en/functions/json_functions.md Executable file → Normal file
1 docs/en/functions/logical_functions.md Executable file → Normal file

@@ -11,3 +11,4 @@ Zero as an argument is considered "false," while any non-zero value is considere
## not, NOT operator
## xor

0 docs/en/functions/math_functions.md Executable file → Normal file

10 docs/en/functions/other_functions.md Executable file → Normal file

@@ -59,8 +59,13 @@ For elements in a nested data structure, the function checks for the existence o
Allows building a unicode-art diagram.
`bar (x, min, max, width)` Draws a band with a width proportional to (x - min) and equal to 'width' characters when x == max.
`min, max` Integer constants. The value must fit in Int64.`width` Constant, positive number, may be a fraction.
`bar (x, min, max, width)` draws a band with a width proportional to `(x - min)` and equal to `width` characters when `x = max`.
Parameters:
- `x` Value to display.
- `min, max` Integer constants. The value must fit in Int64.
- `width` Constant, positive number, may be a fraction.
The band is drawn with accuracy to one eighth of a symbol.
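The eighth-of-a-character accuracy mentioned here comes from the Unicode partial-block characters. A minimal Python sketch of the documented `bar(x, min, max, width)` behavior (an illustration, not the ClickHouse code):

```python
# Space plus the seven left partial blocks, indexed by eighths (0..7).
BLOCKS = ' ▏▎▍▌▋▊▉'

def bar(x, lo, hi, width):
    """Unicode band proportional to (x - lo), exactly `width` full
    characters when x == hi, drawn to one eighth of a character."""
    eighths = round((x - lo) / (hi - lo) * width * 8)
    full, rem = divmod(eighths, 8)
    return '█' * full + (BLOCKS[rem] if rem else '')
```

For example, `bar(5, 0, 10, 4)` yields two full blocks, half the four-character width.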
@@ -278,4 +283,3 @@ The inverse function of MACNumToString. If the MAC address has an invalid format
## MACStringToOUI(s)
Accepts a MAC address in the format AA:BB:CC:DD:EE:FF (colon-separated numbers in hexadecimal form). Returns the first three octets as a UInt64 number. If the MAC address has an invalid format, it returns 0.

0 docs/en/functions/random_functions.md Executable file → Normal file
0 docs/en/functions/rounding_functions.md Executable file → Normal file
0 docs/en/functions/splitting_merging_functions.md Executable file → Normal file
0 docs/en/functions/string_functions.md Executable file → Normal file
1 docs/en/functions/string_replace_functions.md Executable file → Normal file

@@ -76,3 +76,4 @@ SELECT replaceRegexpAll('Hello, World!', '^', 'here: ') AS res
│ here: Hello, World! │
└─────────────────────┘
```

11 docs/en/functions/string_search_functions.md Executable file → Normal file

@@ -5,14 +5,16 @@ The search substring or regular expression must be a constant in all these funct
## position(haystack, needle)
Search for the 'needle' substring in the 'haystack' string.
Search for the `needle` substring in the `haystack` string.
Returns the position (in bytes) of the found substring, starting from 1, or returns 0 if the substring was not found.
It has also chimpanzees.
For case-insensitive search, use the `positionCaseInsensitive` function.
## positionUTF8(haystack, needle)
The same as 'position', but the position is returned in Unicode code points. Works under the assumption that the string contains a set of bytes representing a UTF-8 encoded text. If this assumption is not met, it returns some result (it doesn't throw an exception).
There is also a positionCaseInsensitiveUTF8 function.
The same as `position`, but the position is returned in Unicode code points. Works under the assumption that the string contains a set of bytes representing a UTF-8 encoded text. If this assumption is not met, it returns some result (it doesn't throw an exception).
For case-insensitive search, use the `positionCaseInsensitiveUTF8` function.
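The byte-versus-code-point distinction between `position` and `positionUTF8` described above can be sketched in Python (1-based result, 0 when not found):

```python
def position(haystack: str, needle: str) -> int:
    """1-based *byte* position of needle in haystack, 0 if absent,
    mirroring the documented semantics of position()."""
    return haystack.encode('utf-8').find(needle.encode('utf-8')) + 1

def position_utf8(haystack: str, needle: str) -> int:
    """Same, but counted in Unicode code points, like positionUTF8()."""
    return haystack.find(needle) + 1
```

The two only differ on multi-byte text: for a Cyrillic string the byte position is roughly twice the code-point position.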
## match(haystack, pattern)
@@ -49,4 +51,3 @@ For other regular expressions, the code is the same as for the 'match' function.
## notLike(haystack, pattern), haystack NOT LIKE pattern operator
The same thing as 'like', but negative.

0 docs/en/functions/type_conversion_functions.md Executable file → Normal file
0 docs/en/functions/url_functions.md Executable file → Normal file
0 docs/en/functions/ym_dict_functions.md Executable file → Normal file


@@ -118,4 +118,3 @@ GROUP BY sourceIP
ORDER BY totalRevenue DESC
LIMIT 1
```

0 docs/en/getting_started/example_datasets/criteo.md Executable file → Normal file

10 docs/en/getting_started/example_datasets/nyc_taxi.md Executable file → Normal file

@@ -301,14 +301,19 @@ SELECT passenger_count, toYear(pickup_date) AS year, count(*) FROM trips_mergetr
Q4:
```sql
SELECT passenger_count, toYear(pickup_date) AS year, round(trip_distance) AS distance, count(*)FROM trips_mergetreeGROUP BY passenger_count, year, distanceORDER BY year, count(*) DESC
SELECT passenger_count, toYear(pickup_date) AS year, round(trip_distance) AS distance, count(*)
FROM trips_mergetree
GROUP BY passenger_count, year, distance
ORDER BY year, count(*) DESC
```
3.593 seconds.
The following server was used:
Two Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz, 16 physical kernels total,128 GiB RAM,8x6 TB HD on hardware RAID-5
Two Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz, 16 physical cores total,
128 GiB RAM,
8x6 TB HD on hardware RAID-5
Execution time is the best of three runs. But starting from the second run, queries read data from the file system cache. No further caching occurs: the data is read out and processed in each run.
@@ -361,4 +366,3 @@ nodes Q1 Q2 Q3 Q4
3 0.212 0.438 0.733 1.241
140 0.028 0.043 0.051 0.072
```

1 docs/en/getting_started/example_datasets/ontime.md Executable file → Normal file

@@ -316,4 +316,3 @@ SELECT OriginCityName, DestCityName, count() AS c FROM ontime GROUP BY OriginCit
SELECT OriginCityName, count() AS c FROM ontime GROUP BY OriginCityName ORDER BY c DESC LIMIT 10;
```


@@ -82,4 +82,3 @@ Downloading data (change 'customer' to 'customerd' in the distributed version):
cat customer.tbl | sed 's/$/2000-01-01/' | clickhouse-client --query "INSERT INTO customer FORMAT CSV"
cat lineorder.tbl | clickhouse-client --query "INSERT INTO lineorder FORMAT CSV"
```

1 docs/en/getting_started/example_datasets/wikistat.md Executable file → Normal file

@@ -24,4 +24,3 @@ for i in {2007..2016}; do for j in {01..12}; do echo $i-$j >&2; curl -sSL "http:
cat links.txt | while read link; do wget http://dumps.wikimedia.org/other/pagecounts-raw/$(echo $link | sed -r 's/pagecounts-([0-9]{4})([0-9]{2})[0-9]{2}-[0-9]+\.gz/\1/')/$(echo $link | sed -r 's/pagecounts-([0-9]{4})([0-9]{2})[0-9]{2}-[0-9]+\.gz/\1-\2/')/$link; done
ls -1 /opt/wikistat/ | grep gz | while read i; do echo $i; gzip -cd /opt/wikistat/$i | ./wikistat-loader --time="$(echo -n $i | sed -r 's/pagecounts-([0-9]{4})([0-9]{2})([0-9]{2})-([0-9]{2})([0-9]{2})([0-9]{2})\.gz/\1-\2-\3 \4-00-00/')" | clickhouse-client --query="INSERT INTO wikistat FORMAT TabSeparated"; done
```

9 docs/en/getting_started/index.md Executable file → Normal file

@@ -16,7 +16,7 @@ The terminal must use UTF-8 encoding (the default in Ubuntu).
For testing and development, the system can be installed on a single server or on a desktop computer.
### Installing from packages Debian/Ubuntu
### Installing from packages for Debian/Ubuntu
In `/etc/apt/sources.list` (or in a separate `/etc/apt/sources.list.d/clickhouse.list` file), add the repository:
@@ -34,8 +34,7 @@ sudo apt-get update
sudo apt-get install clickhouse-client clickhouse-server
```
You can also download and install packages manually from here:
<https://repo.yandex.ru/clickhouse/deb/stable/main/>
You can also download and install packages manually from here: <https://repo.yandex.ru/clickhouse/deb/stable/main/>.
ClickHouse contains access restriction settings. They are located in the 'users.xml' file (next to 'config.xml').
By default, access is allowed from anywhere for the 'default' user, without a password. See 'user/default/networks'.
@ -101,8 +100,7 @@ clickhouse-client
```
The default parameters indicate connecting with localhost:9000 on behalf of the user 'default' without a password.
The client can be used for connecting to a remote server.
Example:
The client can be used for connecting to a remote server. Example:
```bash
clickhouse-client --host=example.com
@ -134,3 +132,4 @@ SELECT 1
**Congratulations, the system works!**
To continue experimenting, you can try downloading one of the test data sets.

3
docs/en/index.md Executable file → Normal file

@ -39,7 +39,7 @@ We'll say that the following is true for the OLAP (online analytical processing)
- Data is updated in fairly large batches (> 1000 rows), not by single rows; or it is not updated at all.
- Data is added to the DB but is not modified.
- For reads, quite a large number of rows are extracted from the DB, but only a small subset of columns.
- Tables are "wide", meaning they contain a large number of columns.
- Tables are "wide," meaning they contain a large number of columns.
- Queries are relatively rare (usually hundreds of queries per server or less per second).
- For simple queries, latencies around 50 ms are allowed.
- Column values are fairly small: numbers and short strings (for example, 60 bytes per URL).
@ -120,3 +120,4 @@ There are two ways to do this:
This is not done in "normal" databases, because it doesn't make sense when running simple queries. However, there are exceptions. For example, MemSQL uses code generation to reduce latency when processing SQL queries. (For comparison, analytical DBMSs require optimization of throughput, not latency.)
Note that for CPU efficiency, the query language must be declarative (SQL or MDX), or at least a vector (J, K). The query should only contain implicit loops, allowing for optimization.

1
docs/en/interfaces/cli.md Executable file → Normal file

@ -31,6 +31,7 @@ _EOF
cat file.csv | clickhouse-client --database=test --query="INSERT INTO test FORMAT CSV";
```
In batch mode, the default data format is TabSeparated. You can set the format in the FORMAT clause of the query.
By default, you can only process a single query in batch mode. To make multiple queries from a "script," use the --multiquery parameter. This works for all queries except INSERT. Query results are output consecutively without additional separators.
Similarly, to process a large number of queries, you can run 'clickhouse-client' for each query. Note that it may take tens of milliseconds to launch the 'clickhouse-client' program.
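The `--multiquery` behavior described above can be sketched as follows. This is illustrative only: the script path is made up, and the `clickhouse-client` invocation (which assumes a locally running server with default settings) is shown in a comment rather than executed.

```shell
# Write two queries into one script file (the path is illustrative).
cat > /tmp/queries.sql <<'EOF'
SELECT 1;
SELECT 2;
EOF

# With a ClickHouse server running locally, both results would be printed
# consecutively, without additional separators:
#   clickhouse-client --multiquery < /tmp/queries.sql

# Show the script that would be fed to the client:
cat /tmp/queries.sql
```

Note that `--multiquery` covers every query type except INSERT, as stated above.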

17
docs/en/interfaces/http_interface.md Executable file → Normal file

@ -130,14 +130,13 @@ POST 'http://localhost:8123/?query=DROP TABLE t'
For successful requests that don't return a data table, an empty response body is returned.
You can use compression when transmitting data.
You can use the internal ClickHouse compression format when transmitting data. The compressed data has a non-standard format, and you will need to use the special clickhouse-compressor program to work with it (it is installed with the clickhouse-client package).
For using ClickHouse internal compression format, and you will need to use the special compressor program to work with it (sudo apt-get install compressor-metrika-yandex).
If you specified 'compress=1' in the URL, the server will compress the data it sends you.
If you specified 'decompress=1' in the URL, the server will decompress the same data that you pass in the POST method.
Also standard gzip-based HTTP compression can be used. To send gzip compressed POST data just add `Content-Encoding: gzip` to request headers, and gzip POST body.
To get response compressed, you need to add `Accept-Encoding: gzip` to request headers, and turn on ClickHouse setting called `enable_http_compression`.
It is also possible to use the standard gzip-based HTTP compression. To send a POST request compressed using gzip, append the request header `Content-Encoding: gzip`.
In order for ClickHouse to compress the response using gzip, you must append `Accept-Encoding: gzip` to the request headers, and enable the ClickHouse setting `enable_http_compression`.
You can use this to reduce network traffic when transmitting a large amount of data, or for creating dumps that are immediately compressed.
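A minimal sketch of the gzip exchange described above. The server address and the `enable_http_compression` setting are assumptions; the `curl` call is shown in a comment, and only the local gzip round-trip is actually executed here.

```shell
# With a server at localhost:8123 and enable_http_compression turned on,
# a gzip-compressed query body could be sent (Content-Encoding: gzip) and
# a gzip-compressed response requested (Accept-Encoding: gzip) like this
# (illustrative, not executed here):
#   echo 'SELECT 1' | gzip | curl -sS \
#     -H 'Content-Encoding: gzip' -H 'Accept-Encoding: gzip' \
#     --data-binary @- \
#     'http://localhost:8123/?enable_http_compression=1' | gunzip

# The gzip framing itself round-trips locally:
echo 'SELECT 1' | gzip | gunzip
```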
@ -174,7 +173,8 @@ echo 'SELECT 1' | curl 'http://localhost:8123/?user=user&password=password' -d @
```
If the user name is not indicated, the username 'default' is used. If the password is not indicated, an empty password is used.
You can also use the URL parameters to specify any settings for processing a single query, or entire profiles of settings. Example:http://localhost:8123/?profile=web&max_rows_to_read=1000000000&query=SELECT+1
You can also use the URL parameters to specify any settings for processing a single query, or entire profiles of settings. Example:
http://localhost:8123/?profile=web&max_rows_to_read=1000000000&query=SELECT+1
For more information, see the section "Settings".
@ -194,11 +194,11 @@ $ echo 'SELECT number FROM system.numbers LIMIT 10' | curl 'http://localhost:812
For information about other parameters, see the section "SET".
You can use ClickHouse sessions in the HTTP protocol. To do this, you need to specify the `session_id` GET parameter in HTTP request. You can use any alphanumeric string as a session_id. By default session will be timed out after 60 seconds of inactivity. You can change that by setting `default_session_timeout` in server config file, or by adding GET parameter `session_timeout`. You can also check the status of the session by using GET parameter `session_check=1`. When using sessions you can't run 2 queries with the same session_id simultaneously.
Similarly, you can use ClickHouse sessions in the HTTP protocol. To do this, you need to add the `session_id` GET parameter to the request. You can use any string as the session ID. By default, the session is terminated after 60 seconds of inactivity. To change this timeout, modify the `default_session_timeout` setting in the server configuration, or add the `session_timeout` GET parameter to the request. To check the session status, use the `session_check=1` parameter. Only one query at a time can be executed within a single session.
You can get the progress of query execution in X-ClickHouse-Progress headers, by enabling setting send_progress_in_http_headers.
You have the option to receive information about the progress of query execution in X-ClickHouse-Progress headers. To do this, enable the setting send_progress_in_http_headers.
Running query are not aborted automatically after closing HTTP connection. Parsing and data formatting are performed on the server side, and using the network might be ineffective.
Running requests don't stop automatically if the HTTP connection is lost. Parsing and data formatting are performed on the server side, and using the network might be ineffective.
The optional 'query_id' parameter can be passed as the query ID (any string). For more information, see the section "Settings, replace_running_query".
The optional 'quota_key' parameter can be passed as the quota key (any string). For more information, see the section "Quotas".
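The session mechanism described above can be sketched as follows. The host and the session id value are illustrative, and the `curl` requests (which assume a running server) are shown in comments rather than executed.

```shell
# Any alphanumeric string can serve as the session id (illustrative value):
SESSION_ID="docs_example_session_1"

# With a server at localhost:8123, these requests would share one session,
# so the SET from the first query stays visible to the second; session_check=1
# verifies the session status (shown, not executed here):
#   curl -sS "http://localhost:8123/?session_id=${SESSION_ID}&query=SET+max_rows_to_read=100"
#   curl -sS "http://localhost:8123/?session_id=${SESSION_ID}&session_check=1&query=SELECT+1"

echo "$SESSION_ID"
```

Remember that only one query at a time can run within a single session.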
@ -220,3 +220,4 @@ curl -sS 'http://localhost:8123/?max_result_bytes=4000000&buffer_size=3000000&wa
```
Use buffering to avoid situations where a query processing error occurred after the response code and HTTP headers were sent to the client. In this situation, an error message is written at the end of the response body, and on the client side, the error can only be detected at the parsing stage.

0
docs/en/interfaces/index.md Executable file → Normal file

0
docs/en/interfaces/jdbc.md Executable file → Normal file

0
docs/en/interfaces/tcp.md Executable file → Normal file

51
docs/en/interfaces/third-party_client_libraries.md Executable file → Normal file

@ -3,41 +3,40 @@
There are libraries for working with ClickHouse for:
- Python
    - [infi.clickhouse_orm](https://github.com/Infinidat/infi.clickhouse_orm)
    - [sqlalchemy-clickhouse](https://github.com/cloudflare/sqlalchemy-clickhouse)
    - [clickhouse-driver](https://github.com/mymarilyn/clickhouse-driver)
    - [clickhouse-client](https://github.com/yurial/clickhouse-client)
- PHP
    - [clickhouse-php-client](https://github.com/8bitov/clickhouse-php-client)
    - [PhpClickHouseClient](https://github.com/SevaCode/PhpClickHouseClient)
    - [phpClickHouse](https://github.com/smi2/phpClickHouse)
    - [clickhouse-client](https://github.com/bozerkins/clickhouse-client)
- Go
    - [clickhouse](https://github.com/kshvakov/clickhouse/)
    - [go-clickhouse](https://github.com/roistat/go-clickhouse)
    - [mailrugo-clickhouse](https://github.com/mailru/go-clickhouse)
    - [golang-clickhouse](https://github.com/leprosus/golang-clickhouse)
- NodeJs
    - [clickhouse (NodeJs)](https://github.com/TimonKK/clickhouse)
    - [node-clickhouse](https://github.com/apla/node-clickhouse)
- Perl
    - [perl-DBD-ClickHouse](https://github.com/elcamlost/perl-DBD-ClickHouse)
    - [HTTP-ClickHouse](https://metacpan.org/release/HTTP-ClickHouse)
    - [AnyEvent-ClickHouse](https://metacpan.org/release/AnyEvent-ClickHouse)
- Ruby
    - [clickhouse (Ruby)](https://github.com/archan937/clickhouse)
- R
    - [clickhouse-r](https://github.com/hannesmuehleisen/clickhouse-r)
    - [RClickhouse](https://github.com/IMSMWU/RClickhouse)
- .NET
    - [ClickHouse-Net](https://github.com/killwort/ClickHouse-Net)
- C++
    - [clickhouse-cpp](https://github.com/artpaul/clickhouse-cpp/)
- Elixir
    - [clickhousex](https://github.com/appodeal/clickhousex/)
    - [clickhouse_ecto](https://github.com/appodeal/clickhouse_ecto)
- Java
    - [clickhouse-client-java](https://github.com/VirtusAI/clickhouse-client-java)
We have not tested these libraries. They are listed in random order.

0
docs/en/interfaces/third-party_gui.md Executable file → Normal file

0
docs/en/introduction/distinctive_features.md Executable file → Normal file


0
docs/en/introduction/index.md Executable file → Normal file

0
docs/en/introduction/performance.md Executable file → Normal file

Some files were not shown because too many files have changed in this diff.