WIP on docs/website (#3383)

* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add a few HTTP headers

* Remove copy-paste introduced in #3392

* Hopefully better Chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/```text/```/g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/```sql/``` sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages
Ivan Blinkov 2018-10-16 13:47:17 +03:00 committed by GitHub
parent 3359ba06c3
commit 8623cb232c
348 changed files with 2000 additions and 1470 deletions

View File

@ -48,7 +48,7 @@ Some additional configuration has to be done to actually make new language live
* Inline piece of code is <code>&#96;in backticks&#96;</code>.
* Multiline code blocks are <code>&#96;&#96;&#96;in triple backtick quotes &#96;&#96;&#96;</code>.
* A brightly highlighted block of text starts with `!!! info "Header"`, followed on the next line by 4 spaces and the content. `warning` can be used instead of `info`.
* Hidden block that opens on click: `<details> <summary>Header</summary> hidden content</details>`.
* Hidden block that opens on click: `<details markdown="1"> <summary>Header</summary> hidden content</details>`.
* Colored text: `<span style="color: red;">text</span>`.
* Additional anchor to link to: `<a name="my_anchor"></a>`; for headers fully in English, anchors are created automatically, like `"FoO Bar" -> "foo-bar"`.
* Table:

View File

@ -83,3 +83,5 @@ Code: 386. DB::Exception: Received from localhost:9000, 127.0.0.1. DB::Exception
0 rows in set. Elapsed: 0.246 sec.
```
[Original article](https://clickhouse.yandex/docs/en/data_types/array/) <!--hide-->

View File

@ -2,3 +2,5 @@
There isn't a separate type for boolean values. They use the UInt8 type, restricted to the values 0 or 1.
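As a minimal illustrative sketch, comparison operators already return UInt8 values of 0 or 1, so their results can be stored and used as booleans directly:

``` sql
SELECT 1 = 1 AS is_true, toTypeName(1 = 1) AS type
-- is_true = 1, type = 'UInt8'
```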
[Original article](https://clickhouse.yandex/docs/en/data_types/boolean/) <!--hide-->

View File

@ -5,3 +5,5 @@ The minimum value is output as 0000-00-00.
The date is stored without the time zone.
[Original article](https://clickhouse.yandex/docs/en/data_types/date/) <!--hide-->

View File

@ -13,3 +13,5 @@ By default, the client switches to the timezone of the server when it connects.
So when working with a textual date (for example, when saving text dumps), keep in mind that there may be ambiguity during changes for daylight savings time, and there may be problems matching data if the time zone changed.
[Original article](https://clickhouse.yandex/docs/en/data_types/datetime/) <!--hide-->

View File

@ -95,3 +95,5 @@ SELECT toDecimal32(1, 8) < 100
```
DB::Exception: Can't compare.
```
[Original article](https://clickhouse.yandex/docs/en/data_types/decimal/) <!--hide-->

View File

@ -113,3 +113,5 @@ The Enum type can be changed without cost using ALTER, if only the set of values
Using ALTER, it is possible to change an Enum8 to an Enum16 or vice versa, just like changing an Int8 to Int16.
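A hedged sketch of such a change, using hypothetical table and column names:

``` sql
-- Widen an Enum8 column to Enum16 while extending its value set
-- (the table `hits` and column `device` are hypothetical).
ALTER TABLE hits MODIFY COLUMN device Enum16('desktop' = 1, 'mobile' = 2, 'tablet' = 3)
```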
[Original article](https://clickhouse.yandex/docs/en/data_types/enum/) <!--hide-->

View File

@ -8,3 +8,5 @@ Note that this behavior differs from MySQL behavior for the CHAR type (where str
Fewer functions can work with the FixedString(N) type than with String, so it is less convenient to use.
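As a small sketch of the padding behavior, values shorter than N are padded with null bytes, and `length` reports the fixed size in bytes:

``` sql
SELECT length(toFixedString('ab', 4)) AS bytes
-- bytes = 4: 'ab' plus two null bytes
```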
[Original article](https://clickhouse.yandex/docs/en/data_types/fixedstring/) <!--hide-->

View File

@ -13,7 +13,7 @@ We recommend that you store data in integer form whenever possible. For example,
- Computations with floating-point numbers might produce a rounding error.
```sql
``` sql
SELECT 1 - 0.9
```
@ -33,7 +33,7 @@ In contrast to standard SQL, ClickHouse supports the following categories of flo
- `Inf` Infinity.
```sql
``` sql
SELECT 0.5 / 0
```
@ -45,7 +45,7 @@ SELECT 0.5 / 0
- `-Inf` Negative infinity.
```sql
``` sql
SELECT -0.5 / 0
```
@ -69,3 +69,5 @@ SELECT 0 / 0
See the rules for `NaN` sorting in the section [ORDER BY clause](../query_language/select.md#query_language-queries-order_by).
[Original article](https://clickhouse.yandex/docs/en/data_types/float/) <!--hide-->

View File

@ -6,3 +6,5 @@ ClickHouse can store various types of data in table cells.
This section describes the supported data types and special considerations when using and/or implementing them, if any.
[Original article](https://clickhouse.yandex/docs/en/data_types/) <!--hide-->

View File

@ -18,3 +18,5 @@ Fixed-length integers, with or without a sign.
- UInt32 - [0 : 4294967295]
- UInt64 - [0 : 18446744073709551615]
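As a quick illustration, ClickHouse chooses the smallest type that can hold a literal:

``` sql
SELECT toTypeName(255) AS a, toTypeName(256) AS b, toTypeName(-1) AS c
-- a = 'UInt8', b = 'UInt16', c = 'Int8'
```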
[Original article](https://clickhouse.yandex/docs/en/data_types/int_uint/) <!--hide-->

View File

@ -2,3 +2,5 @@
The intermediate state of an aggregate function. To get it, use aggregate functions with the '-State' suffix. For more information, see "AggregatingMergeTree".
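A minimal sketch of the idea: produce an intermediate state with a `-State` function, then finish the calculation with the matching `-Merge` function:

``` sql
SELECT uniqMerge(state) AS u
FROM (SELECT uniqState(number) AS state FROM system.numbers LIMIT 1000)
-- u = 1000 (approximately, since uniq is an approximate function)
```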
[Original article](https://clickhouse.yandex/docs/en/data_types/nested_data_structures/aggregatefunction/) <!--hide-->

View File

@ -1,2 +1,4 @@
# Nested Data Structures
[Original article](https://clickhouse.yandex/docs/en/data_types/nested_data_structures/) <!--hide-->

View File

@ -4,7 +4,7 @@ A nested data structure is like a nested table. The parameters of a nested data
Example:
```sql
``` sql
CREATE TABLE test.visits
(
CounterID UInt32,
@ -35,7 +35,7 @@ In most cases, when working with a nested data structure, its individual columns
Example:
```sql
``` sql
SELECT
Goals.ID,
Goals.EventTime
@ -44,7 +44,7 @@ WHERE CounterID = 101500 AND length(Goals.ID) < 5
LIMIT 10
```
```text
```
┌─Goals.ID───────────────────────┬─Goals.EventTime───────────────────────────────────────────────────────────────────────────┐
│ [1073752,591325,591325] │ ['2014-03-17 16:38:10','2014-03-17 16:38:48','2014-03-17 16:42:27'] │
│ [1073752] │ ['2014-03-17 00:28:25'] │
@ -63,7 +63,7 @@ It is easiest to think of a nested data structure as a set of multiple column ar
The only place where a SELECT query can specify the name of an entire nested data structure instead of individual columns is the ARRAY JOIN clause. For more information, see "ARRAY JOIN clause". Example:
```sql
``` sql
SELECT
Goal.ID,
Goal.EventTime
@ -73,7 +73,7 @@ WHERE CounterID = 101500 AND length(Goals.ID) < 5
LIMIT 10
```
```text
```
┌─Goal.ID─┬──────Goal.EventTime─┐
│ 1073752 │ 2014-03-17 16:38:10 │
│ 591325 │ 2014-03-17 16:38:48 │
@ -96,3 +96,5 @@ For a DESCRIBE query, the columns in a nested data structure are listed separate
The ALTER query is very limited for elements in a nested data structure.
[Original article](https://clickhouse.yandex/docs/en/data_types/nested_data_structures/nested/) <!--hide-->

View File

@ -53,3 +53,5 @@ FROM t_null
2 rows in set. Elapsed: 0.144 sec.
```
[Original article](https://clickhouse.yandex/docs/en/data_types/nullable/) <!--hide-->

View File

@ -2,3 +2,5 @@
Used for representing lambda expressions in higher-order functions.
[Original article](https://clickhouse.yandex/docs/en/data_types/special_data_types/expression/) <!--hide-->

View File

@ -2,3 +2,5 @@
Special data type values can't be saved to a table or output in results, but are used as the intermediate result of running a query.
[Original article](https://clickhouse.yandex/docs/en/data_types/special_data_types/) <!--hide-->

View File

@ -20,3 +20,5 @@ SELECT toTypeName([])
1 rows in set. Elapsed: 0.062 sec.
```
[Original article](https://clickhouse.yandex/docs/en/data_types/special_data_types/nothing/) <!--hide-->

View File

@ -2,3 +2,5 @@
Used for the right half of an IN expression.
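For illustration, the literal list on the right side of `IN` is converted to such a set internally:

``` sql
SELECT number IN (3, 5, 7) AS hit
FROM system.numbers
LIMIT 10
```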
[Original article](https://clickhouse.yandex/docs/en/data_types/special_data_types/set/) <!--hide-->

View File

@ -12,3 +12,5 @@ If you need to store texts, we recommend using UTF-8 encoding. At the very least
Similarly, certain functions for working with strings have separate variations that work under the assumption that the string contains a set of bytes representing a UTF-8 encoded text.
For example, the 'length' function calculates the string length in bytes, while the 'lengthUTF8' function calculates the string length in Unicode code points, assuming that the value is UTF-8 encoded.
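A short sketch of the difference:

``` sql
SELECT length('привет') AS bytes, lengthUTF8('привет') AS code_points
-- bytes = 12, code_points = 6
```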
[Original article](https://clickhouse.yandex/docs/en/data_types/string/) <!--hide-->

View File

@ -52,3 +52,5 @@ SELECT
1 rows in set. Elapsed: 0.002 sec.
```
[Original article](https://clickhouse.yandex/docs/en/data_types/tuple/) <!--hide-->

View File

@ -193,3 +193,5 @@ In addition, each replica stores its state in ZooKeeper as the set of parts and
> The ClickHouse cluster consists of independent shards, and each shard consists of replicas. The cluster is not elastic, so after adding a new shard, data is not rebalanced between shards automatically. Instead, the cluster load will be uneven. This implementation gives you more control, and it is fine for relatively small clusters such as tens of nodes. But for clusters with hundreds of nodes that we are using in production, this approach becomes a significant drawback. We should implement a table engine that will span its data across the cluster with dynamically replicated regions that could be split and balanced between clusters automatically.
[Original article](https://clickhouse.yandex/docs/en/development/architecture/) <!--hide-->

View File

@ -95,3 +95,5 @@ cd ..
To create an executable, run `ninja clickhouse`.
This will create the `dbms/programs/clickhouse` executable, which can be used with `client` or `server` arguments.
[Original article](https://clickhouse.yandex/docs/en/development/build/) <!--hide-->

View File

@ -79,3 +79,5 @@ Reboot.
To check if it's working, you can use `ulimit -n` command.
[Original article](https://clickhouse.yandex/docs/en/development/build_osx/) <!--hide-->

View File

@ -1,2 +1,4 @@
# ClickHouse Development
[Original article](https://clickhouse.yandex/docs/en/development/) <!--hide-->

View File

@ -834,3 +834,5 @@ function(
const & RangesInDataParts ranges,
size_t limit)
```
[Original article](https://clickhouse.yandex/docs/en/development/style/) <!--hide-->

View File

@ -249,3 +249,5 @@ In Travis CI due to limit on time and computational power we can afford only sub
In Jenkins we run functional tests for each commit and for each pull request from trusted users; the same under ASan; we also run quorum tests, dictionary tests, and Metrica B2B tests. We use Jenkins to prepare and publish releases. It is worth noting that we are not happy with Jenkins at all.
One of our goals is to provide a reliable testing infrastructure that will be available to the community.
[Original article](https://clickhouse.yandex/docs/en/development/tests/) <!--hide-->

View File

@ -11,3 +11,5 @@ Distributed sorting is one of the main causes of reduced performance when runnin
Most MapReduce implementations allow you to execute arbitrary code on a cluster. But a declarative query language is better suited to OLAP in order to run experiments quickly. For example, Hadoop has Hive and Pig. Also consider Cloudera Impala or Shark (outdated) for Spark, as well as Spark SQL, Presto, and Apache Drill. Performance when running such tasks is highly sub-optimal compared to specialized systems, but relatively high latency makes it unrealistic to use these systems as the backend for a web interface.
[Original article](https://clickhouse.yandex/docs/en/faq/general/) <!--hide-->

View File

@ -21,7 +21,7 @@ cd ..
Run the following ClickHouse queries:
```sql
``` sql
CREATE TABLE rankings_tiny
(
pageURL String,
@ -96,7 +96,7 @@ for i in 5nodes/uservisits/*.deflate; do echo $i; zlib-flate -uncompress < $i |
Queries for obtaining data samples:
```sql
``` sql
SELECT pageURL, pageRank FROM rankings_1node WHERE pageRank > 1000
SELECT substring(sourceIP, 1, 8), sum(adRevenue) FROM uservisits_1node GROUP BY substring(sourceIP, 1, 8)
@ -119,3 +119,5 @@ ORDER BY totalRevenue DESC
LIMIT 1
```
[Original article](https://clickhouse.yandex/docs/en/getting_started/example_datasets/amplab_benchmark/) <!--hide-->

View File

@ -4,7 +4,7 @@ Download the data from <http://labs.criteo.com/downloads/download-terabyte-click
Create a table to import the log to:
```sql
``` sql
CREATE TABLE criteo_log (date Date, clicked UInt8, int1 Int32, int2 Int32, int3 Int32, int4 Int32, int5 Int32, int6 Int32, int7 Int32, int8 Int32, int9 Int32, int10 Int32, int11 Int32, int12 Int32, int13 Int32, cat1 String, cat2 String, cat3 String, cat4 String, cat5 String, cat6 String, cat7 String, cat8 String, cat9 String, cat10 String, cat11 String, cat12 String, cat13 String, cat14 String, cat15 String, cat16 String, cat17 String, cat18 String, cat19 String, cat20 String, cat21 String, cat22 String, cat23 String, cat24 String, cat25 String, cat26 String) ENGINE = Log
```
@ -16,7 +16,7 @@ for i in {00..23}; do echo $i; zcat datasets/criteo/day_${i#0}.gz | sed -r 's/^/
Create a table for the converted data:
```sql
``` sql
CREATE TABLE criteo
(
date Date,
@ -65,9 +65,11 @@ CREATE TABLE criteo
Transform data from the raw log and put it in the second table:
```sql
``` sql
INSERT INTO criteo SELECT date, clicked, int1, int2, int3, int4, int5, int6, int7, int8, int9, int10, int11, int12, int13, reinterpretAsUInt32(unhex(cat1)) AS icat1, reinterpretAsUInt32(unhex(cat2)) AS icat2, reinterpretAsUInt32(unhex(cat3)) AS icat3, reinterpretAsUInt32(unhex(cat4)) AS icat4, reinterpretAsUInt32(unhex(cat5)) AS icat5, reinterpretAsUInt32(unhex(cat6)) AS icat6, reinterpretAsUInt32(unhex(cat7)) AS icat7, reinterpretAsUInt32(unhex(cat8)) AS icat8, reinterpretAsUInt32(unhex(cat9)) AS icat9, reinterpretAsUInt32(unhex(cat10)) AS icat10, reinterpretAsUInt32(unhex(cat11)) AS icat11, reinterpretAsUInt32(unhex(cat12)) AS icat12, reinterpretAsUInt32(unhex(cat13)) AS icat13, reinterpretAsUInt32(unhex(cat14)) AS icat14, reinterpretAsUInt32(unhex(cat15)) AS icat15, reinterpretAsUInt32(unhex(cat16)) AS icat16, reinterpretAsUInt32(unhex(cat17)) AS icat17, reinterpretAsUInt32(unhex(cat18)) AS icat18, reinterpretAsUInt32(unhex(cat19)) AS icat19, reinterpretAsUInt32(unhex(cat20)) AS icat20, reinterpretAsUInt32(unhex(cat21)) AS icat21, reinterpretAsUInt32(unhex(cat22)) AS icat22, reinterpretAsUInt32(unhex(cat23)) AS icat23, reinterpretAsUInt32(unhex(cat24)) AS icat24, reinterpretAsUInt32(unhex(cat25)) AS icat25, reinterpretAsUInt32(unhex(cat26)) AS icat26 FROM criteo_log;
DROP TABLE criteo_log;
```
[Original article](https://clickhouse.yandex/docs/en/getting_started/example_datasets/criteo/) <!--hide-->

File diff suppressed because one or more lines are too long

View File

@ -18,7 +18,7 @@ done
Creating a table:
```sql
``` sql
CREATE TABLE `ontime` (
`Year` UInt16,
`Quarter` UInt8,
@ -142,37 +142,37 @@ Queries:
Q0.
```sql
``` sql
select avg(c1) from (select Year, Month, count(*) as c1 from ontime group by Year, Month);
```
Q1. The number of flights per day from the year 2000 to 2008
```sql
``` sql
SELECT DayOfWeek, count(*) AS c FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC;
```
Q2. The number of flights delayed by more than 10 minutes, grouped by the day of the week, for 2000-2008
```sql
``` sql
SELECT DayOfWeek, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC
```
Q3. The number of delays by airport for 2000-2008
```sql
``` sql
SELECT Origin, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY Origin ORDER BY c DESC LIMIT 10
```
Q4. The number of delays by carrier for 2007
```sql
``` sql
SELECT Carrier, count(*) FROM ontime WHERE DepDelay>10 AND Year = 2007 GROUP BY Carrier ORDER BY count(*) DESC
```
Q5. The percentage of delays by carrier for 2007
```sql
``` sql
SELECT Carrier, c, c2, c*1000/c2 as c3
FROM
(
@ -198,13 +198,13 @@ ORDER BY c3 DESC;
Better version of the same query:
```sql
``` sql
SELECT Carrier, avg(DepDelay > 10) * 1000 AS c3 FROM ontime WHERE Year = 2007 GROUP BY Carrier ORDER BY Carrier
```
Q6. The previous request for a broader range of years, 2000-2008
```sql
``` sql
SELECT Carrier, c, c2, c*1000/c2 as c3
FROM
(
@ -230,13 +230,13 @@ ORDER BY c3 DESC;
Better version of the same query:
```sql
``` sql
SELECT Carrier, avg(DepDelay > 10) * 1000 AS c3 FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY Carrier ORDER BY Carrier
```
Q7. Percentage of flights delayed for more than 10 minutes, by year
```sql
``` sql
SELECT Year, c1/c2
FROM
(
@ -260,25 +260,25 @@ ORDER BY Year
Better version of the same query:
```sql
``` sql
SELECT Year, avg(DepDelay > 10) FROM ontime GROUP BY Year ORDER BY Year
```
Q8. The most popular destinations by the number of directly connected cities for various year ranges
```sql
``` sql
SELECT DestCityName, uniqExact(OriginCityName) AS u FROM ontime WHERE Year >= 2000 and Year <= 2010 GROUP BY DestCityName ORDER BY u DESC LIMIT 10;
```
Q9.
```sql
``` sql
select Year, count(*) as c1 from ontime group by Year;
```
Q10.
```sql
``` sql
select
min(Year), max(Year), Carrier, count(*) as cnt,
sum(ArrDelayMinutes>30) as flights_delayed,
@ -296,7 +296,7 @@ LIMIT 1000;
Bonus:
```sql
``` sql
SELECT avg(cnt) FROM (SELECT Year,Month,count(*) AS cnt FROM ontime WHERE DepDel15=1 GROUP BY Year,Month)
select avg(c1) from (select Year,Month,count(*) as c1 from ontime group by Year,Month)
@ -317,3 +317,5 @@ This performance test was created by Vadim Tkachenko. See:
- <https://www.percona.com/blog/2016/01/07/apache-spark-with-air-ontime-performance-data/>
- <http://nickmakos.blogspot.ru/2012/08/analyzing-air-traffic-performance-with.html>
[Original article](https://clickhouse.yandex/docs/en/getting_started/example_datasets/ontime/) <!--hide-->

View File

@ -21,7 +21,7 @@ Generating data:
Creating tables in ClickHouse:
```sql
``` sql
CREATE TABLE lineorder (
LO_ORDERKEY UInt32,
LO_LINENUMBER UInt8,
@ -83,3 +83,5 @@ cat customer.tbl | sed 's/$/2000-01-01/' | clickhouse-client --query "INSERT INT
cat lineorder.tbl | clickhouse-client --query "INSERT INTO lineorder FORMAT CSV"
```
[Original article](https://clickhouse.yandex/docs/en/getting_started/example_datasets/star_schema/) <!--hide-->

View File

@ -4,7 +4,7 @@ See: <http://dumps.wikimedia.org/other/pagecounts-raw/>
Creating a table:
```sql
``` sql
CREATE TABLE wikistat
(
date Date,
@ -25,3 +25,5 @@ cat links.txt | while read link; do wget http://dumps.wikimedia.org/other/pageco
ls -1 /opt/wikistat/ | grep gz | while read i; do echo $i; gzip -cd /opt/wikistat/$i | ./wikistat-loader --time="$(echo -n $i | sed -r 's/pagecounts-([0-9]{4})([0-9]{2})([0-9]{2})-([0-9]{2})([0-9]{2})([0-9]{2})\.gz/\1-\2-\3 \4-00-00/')" | clickhouse-client --query="INSERT INTO wikistat FORMAT TabSeparated"; done
```
[Original article](https://clickhouse.yandex/docs/en/getting_started/example_datasets/wikistat/) <!--hide-->

View File

@ -24,7 +24,7 @@ For testing and development, the system can be installed on a single server or o
In `/etc/apt/sources.list` (or in a separate `/etc/apt/sources.list.d/clickhouse.list` file), add the repository:
```text
```
deb http://repo.yandex.ru/clickhouse/deb/stable/ main/
```
@ -51,14 +51,14 @@ To compile, follow the instructions: build.md
You can compile packages and install them.
You can also use programs without installing packages.
```text
```
Client: dbms/programs/clickhouse-client
Server: dbms/programs/clickhouse-server
```
For the server, create a catalog with data, such as:
```text
```
/opt/clickhouse/data/default/
/opt/clickhouse/metadata/default/
```
@ -137,3 +137,5 @@ SELECT 1
To continue experimenting, you can try to download from the test data sets.
[Original article](https://clickhouse.yandex/docs/en/getting_started/) <!--hide-->

View File

@ -77,9 +77,8 @@ See the difference?
For example, the query "count the number of records for each advertising platform" requires reading one "advertising platform ID" column, which takes up 1 byte uncompressed. If most of the traffic was not from advertising platforms, you can expect at least 10-fold compression of this column. When using a quick compression algorithm, data decompression is possible at a speed of at least several gigabytes of uncompressed data per second. In other words, this query can be processed at a speed of approximately several billion rows per second on a single server. This speed is actually achieved in practice.
<details><summary>Example</summary>
<p>
<pre>
<details markdown="1"><summary>Example</summary>
```
$ clickhouse-client
ClickHouse client version 0.0.52053.
Connecting to localhost:9000.
@ -120,9 +119,10 @@ LIMIT 20
20 rows in set. Elapsed: 0.153 sec. Processed 1.00 billion rows, 4.00 GB (6.53 billion rows/s., 26.10 GB/s.)
:)</pre>
:)
```
</p></details>
</details>
### CPU
@ -138,3 +138,5 @@ There are two ways to do this:
This is not done in "normal" databases, because it doesn't make sense when running simple queries. However, there are exceptions. For example, MemSQL uses code generation to reduce latency when processing SQL queries. (For comparison, analytical DBMSs require optimization of throughput, not latency.)
Note that for CPU efficiency, the query language must be declarative (SQL or MDX), or at least a vector (J, K). The query should only contain implicit loops, allowing for optimization.
[Original article](https://clickhouse.yandex/docs/en/) <!--hide-->

View File

@ -113,3 +113,5 @@ Example of a config file:
</config>
```
[Original article](https://clickhouse.yandex/docs/en/interfaces/cli/) <!--hide-->

View File

@ -32,31 +32,131 @@ The table below lists supported formats and how they can be used in `INSERT` and
| [XML](#xml) | ✗ | ✔ |
| [CapnProto](#capnproto) | ✔ | ✔ |
<a name="format_capnproto"></a>
<a name="tabseparated"></a>
## CapnProto
## TabSeparated
Cap'n Proto is a binary message format similar to Protocol Buffers and Thrift, but not like JSON or MessagePack.
In TabSeparated format, data is written by row. Each row contains values separated by tabs. Each value is followed by a tab, except the last value in the row, which is followed by a line feed. Strictly Unix line feeds are assumed everywhere. The last row also must contain a line feed at the end. Values are written in text format, without enclosing quotation marks, and with special characters escaped.
Cap'n Proto messages are strictly typed and not self-describing, meaning they need an external schema description. The schema is applied on the fly and cached for each query.
This format is also available under the name `TSV`.
```sql
SELECT SearchPhrase, count() AS c FROM test.hits
GROUP BY SearchPhrase FORMAT CapnProto SETTINGS schema = 'schema:Message'
The `TabSeparated` format is convenient for processing data using custom programs and scripts. It is used by default in the HTTP interface, and in the command-line client's batch mode. This format also allows transferring data between different DBMSs. For example, you can get a dump from MySQL and upload it to ClickHouse, or vice versa.
The `TabSeparated` format supports outputting total values (when using WITH TOTALS) and extreme values (when 'extremes' is set to 1). In these cases, the total values and extremes are output after the main data. The main result, total values, and extremes are separated from each other by an empty line. Example:
``` sql
SELECT EventDate, count() AS c FROM test.hits GROUP BY EventDate WITH TOTALS ORDER BY EventDate FORMAT TabSeparated
```
Where `schema.capnp` looks like this:
```
struct Message {
SearchPhrase @0 :Text;
c @1 :Uint64;
}
2014-03-17 1406958
2014-03-18 1383658
2014-03-19 1405797
2014-03-20 1353623
2014-03-21 1245779
2014-03-22 1031592
2014-03-23 1046491
0000-00-00 8873898
2014-03-17 1031592
2014-03-23 1406958
```
Schema files are located in the directory specified by the [format_schema_path](../operations/server_settings/settings.md#server_settings-format_schema_path) setting in the server configuration.
### Data formatting
Integer numbers are written in decimal form. Numbers can contain an extra "+" character at the beginning (ignored when parsing, and not recorded when formatting). Non-negative numbers can't contain the negative sign. When reading, it is allowed to parse an empty string as a zero, or (for signed types) a string consisting of just a minus sign as a zero. Numbers that do not fit into the corresponding data type may be parsed as a different number, without an error message.
Floating-point numbers are written in decimal form. The dot is used as the decimal separator. Exponential entries are supported, as are 'inf', '+inf', '-inf', and 'nan'. An entry of floating-point numbers may begin or end with a decimal point.
During formatting, accuracy may be lost on floating-point numbers.
During parsing, it is not strictly required to read the nearest machine-representable number.
Dates are written in YYYY-MM-DD format and parsed in the same format, but with any characters as separators.
Dates with times are written in the format YYYY-MM-DD hh:mm:ss and parsed in the same format, but with any characters as separators.
This all occurs in the system time zone at the time the client or server starts (depending on which one formats data). For dates with times, daylight saving time is not specified. So if a dump has times during daylight saving time, the dump does not unequivocally match the data, and parsing will select one of the two times.
During a read operation, incorrect dates and dates with times can be parsed with natural overflow or as null dates and times, without an error message.
As an exception, parsing dates with times is also supported in Unix timestamp format, if it consists of exactly 10 decimal digits. The result is not time zone-dependent. The formats YYYY-MM-DD hh:mm:ss and NNNNNNNNNN are differentiated automatically.
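A hedged sketch of this exception (the table name is hypothetical and the two values are arbitrary): both input lines below are accepted for a DateTime column in TabSeparated format.

``` sql
CREATE TABLE ts_demo (t DateTime) ENGINE = Memory;
INSERT INTO ts_demo FORMAT TabSeparated
2014-03-17 16:38:10
1395067090
```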
Strings are output with backslash-escaped special characters. The following escape sequences are used for output: `\b`, `\f`, `\r`, `\n`, `\t`, `\0`, `\'`, `\\`. Parsing also supports the sequences `\a`, `\v`, and `\xHH` (hex escape sequences) and any `\c` sequences, where `c` is any character (these sequences are converted to `c`). Thus, reading data supports formats where a line feed can be written as `\n` or `\`, or as a line feed. For example, the string `Hello world` with a line feed between the words instead of a space can be parsed in any of the following variations:
```
Hello\nworld
Hello\
world
```
The second variant is supported because MySQL uses it when writing tab-separated dumps.
The minimum set of characters that you need to escape when passing data in TabSeparated format: tab, line feed (LF) and backslash.
Only a small set of symbols are escaped. You can easily stumble onto a string value that your terminal will ruin in output.
Arrays are written as a list of comma-separated values in square brackets. Numeric items in the array are formatted as normal, but dates, dates with times, and strings are written in single quotes, with the same escaping rules as above.
[NULL](../query_language/syntax.md#null-literal) is formatted as `\N`.
<a name="tabseparatedraw"></a>
## TabSeparatedRaw
Differs from `TabSeparated` format in that the rows are written without escaping.
This format is only appropriate for outputting a query result, but not for parsing (retrieving data to insert in a table).
This format is also available under the name `TSVRaw`.
<a name="tabseparatedwithnames"></a>
## TabSeparatedWithNames
Differs from the `TabSeparated` format in that the column names are written in the first row.
During parsing, the first row is completely ignored. You can't use column names to determine their position or to check their correctness.
(Support for parsing the header row may be added in the future.)
This format is also available under the name `TSVWithNames`.
<a name="tabseparatedwithnamesandtypes"></a>
## TabSeparatedWithNamesAndTypes
Differs from the `TabSeparated` format in that the column names are written to the first row, while the column types are in the second row.
During parsing, the first and second rows are completely ignored.
This format is also available under the name `TSVWithNamesAndTypes`.
<a name="tskv"></a>
## TSKV
Similar to TabSeparated, but outputs a value in name=value format. Names are escaped the same way as in TabSeparated format, and the = symbol is also escaped.
```
SearchPhrase= count()=8267016
SearchPhrase=bathroom interior design count()=2166
SearchPhrase=yandex count()=1655
SearchPhrase=2014 spring fashion count()=1549
SearchPhrase=freeform photos count()=1480
SearchPhrase=angelina jolie count()=1245
SearchPhrase=omsk count()=1112
SearchPhrase=photos of dog breeds count()=1091
SearchPhrase=curtain designs count()=1064
SearchPhrase=baku count()=1000
```
[NULL](../query_language/syntax.md#null-literal) is formatted as `\N`.
``` sql
SELECT * FROM t_null FORMAT TSKV
```
```
x=1 y=\N
```
When there is a large number of small columns, this format is inefficient, and there is generally no reason to use it. It is used in some departments of Yandex.
Both data output and parsing are supported in this format. For parsing, any order is supported for the values of different columns. It is acceptable for some values to be omitted; they are treated as equal to their default values. In this case, zeros and blank rows are used as default values. Complex values that could be specified in the table are not supported as defaults.
Parsing allows the presence of the additional field `tskv` without the equal sign or a value. This field is ignored.
Deserialization is efficient and usually doesn't increase the system load.
<a name="csv"></a>
## CSV
@ -86,7 +186,7 @@ Also prints the header row, similar to `TabSeparatedWithNames`.
Outputs data in JSON format. Besides data tables, it also outputs column names and types, along with some additional information: the total number of output rows, and the number of rows that could have been output if there weren't a LIMIT. Example:
```sql
``` sql
SELECT SearchPhrase, count() AS c FROM test.hits GROUP BY SearchPhrase WITH TOTALS ORDER BY c DESC LIMIT 5 FORMAT JSON
```
@ -263,7 +363,7 @@ Each result block is output as a separate table. This is necessary so that block
[NULL](../query_language/syntax.md#null-literal) is output as `ᴺᵁᴸᴸ`.
```sql
``` sql
SELECT * FROM t_null
```
@ -278,11 +378,11 @@ This format is only appropriate for outputting a query result, but not for parsi
The Pretty format supports outputting total values (when using WITH TOTALS) and extremes (when 'extremes' is set to 1). In these cases, total values and extreme values are output after the main data, in separate tables. Example (shown for the PrettyCompact format):
```sql
``` sql
SELECT EventDate, count() AS c FROM test.hits GROUP BY EventDate WITH TOTALS ORDER BY EventDate FORMAT PrettyCompact
```
```text
```
┌──EventDate─┬───────c─┐
│ 2014-03-17 │ 1406958 │
│ 2014-03-18 │ 1383658 │
@ -359,131 +459,6 @@ Array is represented as a varint length (unsigned [LEB128](https://en.wikipedia.
For [NULL](../query_language/syntax.md#null-literal) support, an additional byte containing 1 or 0 is added before each [Nullable](../data_types/nullable.md#data_type-nullable) value. If 1, then the value is `NULL` and this byte is interpreted as a separate value. If 0, the value after the byte is not `NULL`.
<a name="tabseparated"></a>
## TabSeparated
In TabSeparated format, data is written by row. Each row contains values separated by tabs. Each value is followed by a tab, except the last value in the row, which is followed by a line feed. Strictly Unix line feeds are assumed everywhere. The last row also must contain a line feed at the end. Values are written in text format, without enclosing quotation marks, and with special characters escaped.
This format is also available under the name `TSV`.
The `TabSeparated` format is convenient for processing data using custom programs and scripts. It is used by default in the HTTP interface, and in the command-line client's batch mode. This format also allows transferring data between different DBMSs. For example, you can get a dump from MySQL and upload it to ClickHouse, or vice versa.
The `TabSeparated` format supports outputting total values (when using WITH TOTALS) and extreme values (when 'extremes' is set to 1). In these cases, the total values and extremes are output after the main data. The main result, total values, and extremes are separated from each other by an empty line. Example:
```sql
SELECT EventDate, count() AS c FROM test.hits GROUP BY EventDate WITH TOTALS ORDER BY EventDate FORMAT TabSeparated
```
```text
2014-03-17 1406958
2014-03-18 1383658
2014-03-19 1405797
2014-03-20 1353623
2014-03-21 1245779
2014-03-22 1031592
2014-03-23 1046491
0000-00-00 8873898
2014-03-17 1031592
2014-03-23 1406958
```
## Data formatting
Integer numbers are written in decimal form. Numbers can contain an extra "+" character at the beginning (ignored when parsing, and not recorded when formatting). Non-negative numbers can't contain the negative sign. When reading, it is allowed to parse an empty string as a zero, or (for signed types) a string consisting of just a minus sign as a zero. Numbers that do not fit into the corresponding data type may be parsed as a different number, without an error message.
Floating-point numbers are written in decimal form. The dot is used as the decimal separator. Exponential entries are supported, as are 'inf', '+inf', '-inf', and 'nan'. An entry of floating-point numbers may begin or end with a decimal point.
During formatting, accuracy may be lost on floating-point numbers.
During parsing, it is not strictly required to read the nearest machine-representable number.
Dates are written in YYYY-MM-DD format and parsed in the same format, but with any characters as separators.
Dates with times are written in the format YYYY-MM-DD hh:mm:ss and parsed in the same format, but with any characters as separators.
This all occurs in the system time zone at the time the client or server starts (depending on which one formats data). For dates with times, daylight saving time is not specified. So if a dump has times during daylight saving time, the dump does not unequivocally match the data, and parsing will select one of the two times.
During a read operation, incorrect dates and dates with times can be parsed with natural overflow or as null dates and times, without an error message.
As an exception, parsing dates with times is also supported in Unix timestamp format, if it consists of exactly 10 decimal digits. The result is not time zone-dependent. The formats YYYY-MM-DD hh:mm:ss and NNNNNNNNNN are differentiated automatically.
Strings are output with backslash-escaped special characters. The following escape sequences are used for output: `\b`, `\f`, `\r`, `\n`, `\t`, `\0`, `\'`, `\\`. Parsing also supports the sequences `\a`, `\v`, and `\xHH` (hex escape sequences) and any `\c` sequences, where `c` is any character (these sequences are converted to `c`). Thus, reading data supports formats where a line feed can be written as `\n` or `\`, or as a line feed. For example, the string `Hello world` with a line feed between the words instead of a space can be parsed in any of the following variations:
```text
Hello\nworld
Hello\
world
```
The second variant is supported because MySQL uses it when writing tab-separated dumps.
The minimum set of characters that you need to escape when passing data in TabSeparated format: tab, line feed (LF) and backslash.
Only a small set of symbols are escaped. You can easily stumble onto a string value that your terminal will ruin in output.
Arrays are written as a list of comma-separated values in square brackets. Numeric items in the array are formatted as normal, but dates, dates with times, and strings are written in single quotes, with the same escaping rules as above.
[NULL](../query_language/syntax.md#null-literal) is formatted as `\N`.
<a name="tabseparatedraw"></a>
## TabSeparatedRaw
Differs from `TabSeparated` format in that the rows are written without escaping.
This format is only appropriate for outputting a query result, but not for parsing (retrieving data to insert in a table).
This format is also available under the name `TSVRaw`.
<a name="tabseparatedwithnames"></a>
## TabSeparatedWithNames
Differs from the `TabSeparated` format in that the column names are written in the first row.
During parsing, the first row is completely ignored. You can't use column names to determine their position or to check their correctness.
(Support for parsing the header row may be added in the future.)
This format is also available under the name `TSVWithNames`.
<a name="tabseparatedwithnamesandtypes"></a>
## TabSeparatedWithNamesAndTypes
Differs from the `TabSeparated` format in that the column names are written to the first row, while the column types are in the second row.
During parsing, the first and second rows are completely ignored.
This format is also available under the name `TSVWithNamesAndTypes`.
<a name="tskv"></a>
## TSKV
Similar to TabSeparated, but outputs a value in name=value format. Names are escaped the same way as in TabSeparated format, and the = symbol is also escaped.
```text
SearchPhrase= count()=8267016
SearchPhrase=bathroom interior design count()=2166
SearchPhrase=yandex count()=1655
SearchPhrase=2014 spring fashion count()=1549
SearchPhrase=freeform photos count()=1480
SearchPhrase=angelina jolie count()=1245
SearchPhrase=omsk count()=1112
SearchPhrase=photos of dog breeds count()=1091
SearchPhrase=curtain designs count()=1064
SearchPhrase=baku count()=1000
```
[NULL](../query_language/syntax.md#null-literal) is formatted as `\N`.
```sql
SELECT * FROM t_null FORMAT TSKV
```
```
x=1 y=\N
```
When there is a large number of small columns, this format is inefficient, and there is generally no reason to use it. It is used in some departments of Yandex.
Both data output and parsing are supported in this format. For parsing, any order is supported for the values of different columns. It is acceptable for some values to be omitted; they are treated as equal to their default values. In this case, zeros and blank rows are used as default values. Complex values that could be specified in the table are not supported as defaults.
Parsing allows the presence of the additional field `tskv` without the equal sign or a value. This field is ignored.
## Values
Prints every row in brackets. Rows are separated by commas. There is no comma after the last row. The values inside the brackets are also comma-separated. Numbers are output in decimal format without quotes. Arrays are output in square brackets. Strings, dates, and dates with times are output in quotes. Escaping rules and parsing are similar to the [TabSeparated](#tabseparated) format. During formatting, extra spaces aren't inserted, but during parsing, they are allowed and skipped (except for spaces inside array values, which are not allowed). [NULL](../query_language/syntax.md#null-literal) is represented as `NULL`.
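As a sketch, reusing the `t_null` table from the examples in this document:

``` sql
SELECT * FROM t_null FORMAT Values
```

```
(1,NULL),(2,3)
```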
@ -502,7 +477,7 @@ Prints each value on a separate line with the column name specified. This format
Example:
```sql
``` sql
SELECT * FROM t_null FORMAT Vertical
```
@ -620,3 +595,31 @@ Just as for JSON, invalid UTF-8 sequences are changed to the replacement charact
In string values, the characters `<` and `&` are escaped as `&lt;` and `&amp;`.
Arrays are output as `<array><elem>Hello</elem><elem>World</elem>...</array>`,and tuples as `<tuple><elem>Hello</elem><elem>World</elem>...</tuple>`.
<a name="format_capnproto"></a>
## CapnProto
Cap'n Proto is a binary message format similar to Protocol Buffers and Thrift, but not like JSON or MessagePack.
Cap'n Proto messages are strictly typed and not self-describing, meaning they need an external schema description. The schema is applied on the fly and cached for each query.
``` sql
SELECT SearchPhrase, count() AS c FROM test.hits
GROUP BY SearchPhrase FORMAT CapnProto SETTINGS schema = 'schema:Message'
```
Where `schema.capnp` looks like this:
```
struct Message {
SearchPhrase @0 :Text;
c @1 :Uint64;
}
```
Schema files are located in the directory specified by the [format_schema_path](../operations/server_settings/settings.md#server_settings-format_schema_path) setting in the server configuration.
Deserialization is efficient and usually doesn't increase the system load.
[Original article](https://clickhouse.yandex/docs/en/interfaces/formats/) <!--hide-->

View File

@ -218,3 +218,5 @@ curl -sS 'http://localhost:8123/?max_result_bytes=4000000&buffer_size=3000000&wa
Use buffering to avoid situations where a query processing error occurred after the response code and HTTP headers were sent to the client. In this situation, an error message is written at the end of the response body, and on the client side, the error can only be detected at the parsing stage.
[Original article](https://clickhouse.yandex/docs/en/interfaces/http_interface/) <!--hide-->

View File

@ -4,3 +4,5 @@
To explore the system's capabilities, download data to tables, or make manual queries, use the clickhouse-client program.
[Original article](https://clickhouse.yandex/docs/en/interfaces/) <!--hide-->

View File

@ -3,3 +3,5 @@
- [Official driver](https://github.com/yandex/clickhouse-jdbc).
- Third-party driver from [ClickHouse-Native-JDBC](https://github.com/housepower/ClickHouse-Native-JDBC).
[Original article](https://clickhouse.yandex/docs/en/interfaces/jdbc/) <!--hide-->

View File

@ -2,3 +2,5 @@
The native interface is used in the "clickhouse-client" command-line client, for communication between servers during distributed query processing, and also in C++ programs. We will only cover the command-line client.
[Original article](https://clickhouse.yandex/docs/en/interfaces/tcp/) <!--hide-->

View File

@ -48,3 +48,5 @@ We have not tested the libraries listed below.
- Nim
- [nim-clickhouse](https://github.com/leonardoce/nim-clickhouse)
[Original article](https://clickhouse.yandex/docs/en/interfaces/third-party_client_libraries/) <!--hide-->

View File

@ -45,3 +45,5 @@ Key features:
- Query development with syntax highlight.
- Table preview.
- Autocompletion.
[Original article](https://clickhouse.yandex/docs/en/interfaces/third-party_gui/) <!--hide-->

View File

@ -60,3 +60,5 @@ ClickHouse provides various ways to trade accuracy for performance:
Uses asynchronous multimaster replication. After being written to any available replica, data is distributed to all the remaining replicas in the background. The system maintains identical data on different replicas. Recovery after most failures is performed automatically, and in complex cases — semi-automatically.
For more information, see the section [Data replication](../operations/table_engines/replication.md#table_engines-replication).
[Original article](https://clickhouse.yandex/docs/en/introduction/distinctive_features/) <!--hide-->

View File

@ -3,3 +3,5 @@
1. No full-fledged transactions.
2. Lack of ability to modify or delete already-inserted data at a high rate and with low latency. There are batch deletes and updates available to clean up or modify data, for example to comply with [GDPR](https://gdpr-info.eu).
3. The sparse index makes ClickHouse not really suitable for point queries retrieving single rows by their keys.
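As an example of the batch deletes mentioned in point 2, a mutation can remove a user's rows asynchronously; this rewrites whole data parts, so it is not meant for frequent, low-latency use (a sketch with hypothetical table and column names):

``` sql
ALTER TABLE visits DELETE WHERE UserID = 12345
```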
[Original article](https://clickhouse.yandex/docs/en/introduction/features_considered_disadvantages/) <!--hide-->

View File

@ -21,3 +21,5 @@ Under the same conditions, ClickHouse can handle several hundred queries per sec
## Performance When Inserting Data
We recommend inserting data in packets of at least 1000 rows, or no more than a single request per second. When inserting to a MergeTree table from a tab-separated dump, the insertion speed will be from 50 to 200 MB/s. If the inserted rows are around 1 KB in size, the speed will be from 50,000 to 200,000 rows per second. If the rows are small, the performance will be higher in rows per second (on Banner System data, > 500,000 rows per second; on Graphite data, > 1,000,000 rows per second). To improve performance, you can make multiple INSERT queries in parallel, and performance will increase linearly.
[Original article](https://clickhouse.yandex/docs/en/introduction/performance/) <!--hide-->

View File

@ -46,3 +46,5 @@ OLAPServer worked well for non-aggregated data, but it had many restrictions tha
To remove the limitations of OLAPServer and solve the problem of working with non-aggregated data for all reports, we developed the ClickHouse DBMS.
[Original article](https://clickhouse.yandex/docs/en/introduction/ya_metrika_task/) <!--hide-->

View File

@ -99,3 +99,5 @@ The user can get a list of all databases and tables in them by using `SHOW` quer
Database access is not related to the [readonly](settings/query_complexity.md#query_complexity_readonly) setting. You can't grant full access to one database and `readonly` access to another one.
[Original article](https://clickhouse.yandex/docs/en/operations/access_rights/) <!--hide-->

View File

@ -40,3 +40,5 @@ $ cat /etc/clickhouse-server/users.d/alice.xml
For each config file, the server also generates `file-preprocessed.xml` files when starting. These files contain all the completed substitutions and overrides, and they are intended for informational use. If ZooKeeper substitutions were used in the config files but ZooKeeper is not available on the server start, the server loads the configuration from the preprocessed file.
The server tracks changes in config files, as well as files and ZooKeeper nodes that were used when performing substitutions and overrides, and reloads the settings for users and clusters on the fly. This means that you can modify the cluster, users, and their settings without restarting the server.
[Original article](https://clickhouse.yandex/docs/en/operations/configuration_files/) <!--hide-->

View File

@ -1,2 +1,4 @@
# Operations
[Original article](https://clickhouse.yandex/docs/en/operations/) <!--hide-->

View File

@ -104,3 +104,5 @@ For distributed query processing, the accumulated amounts are stored on the requ
When the server is restarted, quotas are reset.
[Original article](https://clickhouse.yandex/docs/en/operations/quotas/) <!--hide-->

View File

@ -10,3 +10,5 @@ Other settings are described in the "[Settings](../settings/index.md#settings)"
Before studying the settings, read the [Configuration files](../configuration_files.md#configuration_files) section and note the use of substitutions (the `incl` and `optional` attributes).
[Original article](https://clickhouse.yandex/docs/en/operations/server_settings/) <!--hide-->

View File

@ -717,3 +717,5 @@ For more information, see the section "[Replication](../../operations/table_engi
<zookeeper incl="zookeeper-servers" optional="true" />
```
[Original article](https://clickhouse.yandex/docs/en/operations/server_settings/settings/) <!--hide-->

View File

@ -22,3 +22,5 @@ Similarly, you can use ClickHouse sessions in the HTTP protocol. To do this, you
Settings that can only be made in the server config file are not covered in this section.
[Original article](https://clickhouse.yandex/docs/en/operations/settings/) <!--hide-->

View File

@ -193,3 +193,5 @@ Maximum number of bytes (uncompressed data) that can be passed to a remote serve
## transfer_overflow_mode
What to do when the amount of data exceeds one of the limits: 'throw' or 'break'. By default, throw.
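For example (a minimal sketch), the transfer limits and the overflow mode can be set for a session:

``` sql
SET max_rows_to_transfer = 1000000, transfer_overflow_mode = 'break'
```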
[Original article](https://clickhouse.yandex/docs/en/operations/settings/query_complexity/) <!--hide-->

View File

@ -417,3 +417,5 @@ See also the following parameters:
- [insert_quorum](#setting-insert_quorum)
- [insert_quorum_timeout](#setting-insert_quorum_timeout)
[Original article](https://clickhouse.yandex/docs/en/operations/settings/settings/) <!--hide-->

View File

@ -9,7 +9,7 @@ Example:
Install the `web` profile.
```sql
``` sql
SET profile = 'web'
```
@ -63,3 +63,5 @@ The example specifies two profiles: `default` and `web`. The `default` profile
Settings profiles can inherit from each other. To use inheritance, indicate the `profile` setting before the other settings that are listed in the profile.
[Original article](https://clickhouse.yandex/docs/en/operations/settings/settings_profiles/) <!--hide-->

View File

@ -18,7 +18,7 @@ Example: The number of SELECT queries currently running; the amount of memory in
Contains information about clusters available in the config file and the servers in them.
Columns:
```text
```
cluster String — The cluster name.
shard_num UInt32 — The shard number in the cluster, starting from 1.
shard_weight UInt32 — The relative weight of the shard when writing data.
@ -34,7 +34,7 @@ user String — The name of the user for connecting to the server.
Contains information about the columns in all tables.
You can use this table to get information similar to `DESCRIBE TABLE`, but for multiple tables at once.
```text
```
database String — The name of the database the table is in.
table String — Table name.
name String — Column name.
@ -183,7 +183,7 @@ Formats:
This system table is used for implementing the `SHOW PROCESSLIST` query.
Columns:
```text
```
user String — The name of the user who made the request. For distributed query processing, this is the user who helped the requestor server send the query to this server, not the user who made the distributed request on the requestor server.
address String — The IP address the request was made from. The same for distributed processing.
@ -210,14 +210,14 @@ This table can be used for monitoring. The table contains a row for every Replic
Example:
```sql
``` sql
SELECT *
FROM system.replicas
WHERE table = 'visits'
FORMAT Vertical
```
```text
```
Row 1:
──────
database: merge
@ -243,7 +243,7 @@ active_replicas: 2
Columns:
```text
```
database: Database name
table: Table name
engine: Table engine name
@ -296,7 +296,7 @@ If you don't request the last 4 columns (log_max_index, log_pointer, total_repli
For example, you can check that everything is working correctly like this:
```sql
``` sql
SELECT
database,
table,
@ -335,7 +335,7 @@ I.e. used for executing the query you are using to read from the system.settings
Columns:
```text
```
name String — Setting name.
value String — Setting value.
changed UInt8 — Whether the setting was explicitly defined in the config or explicitly changed.
@ -343,13 +343,13 @@ changed UInt8 — Whether the setting was explicitly defined in the config or ex
Example:
```sql
``` sql
SELECT *
FROM system.settings
WHERE changed
```
```text
```
┌─name───────────────────┬─value───────┬─changed─┐
│ max_threads │ 8 │ 1 │
│ use_uncompressed_cache │ 0 │ 1 │
@ -393,14 +393,14 @@ Columns:
Example:
```sql
``` sql
SELECT *
FROM system.zookeeper
WHERE path = '/clickhouse/tables/01-08/visits/replicas'
FORMAT Vertical
```
```text
```
Row 1:
──────
name: example01-08-1.yandex.ru
@ -435,3 +435,5 @@ numChildren: 7
pzxid: 987021252247
path: /clickhouse/tables/01-08/visits/replicas
```
[Original article](https://clickhouse.yandex/docs/en/operations/system_tables/) <!--hide-->

View File

@ -8,7 +8,7 @@ There is an `AggregateFunction` data type. It is a parametric data type. As para
Examples:
```sql
``` sql
CREATE TABLE t
(
column1 AggregateFunction(uniq, UInt64),
@ -33,7 +33,7 @@ Example: `uniqMerge(UserIDState)`, where `UserIDState` has the `AggregateFunctio
In other words, an aggregate function with the 'Merge' suffix takes a set of states, combines them, and returns the result.
As an example, these two queries return the same result:
```sql
``` sql
SELECT uniq(UserID) FROM table
SELECT uniqMerge(state) FROM (SELECT uniqState(UserID) AS state FROM table GROUP BY RegionID)
@ -51,7 +51,7 @@ Example:
Create an `AggregatingMergeTree` materialized view that watches the `test.visits` table:
```sql
``` sql
CREATE MATERIALIZED VIEW test.basic
ENGINE = AggregatingMergeTree(StartDate, (CounterID, StartDate), 8192)
AS SELECT
@ -65,13 +65,13 @@ GROUP BY CounterID, StartDate;
Insert data in the `test.visits` table. Data will also be inserted in the view, where it will be aggregated:
```sql
``` sql
INSERT INTO test.visits ...
```
Perform `SELECT` from the view using `GROUP BY` in order to complete data aggregation:
```sql
``` sql
SELECT
StartDate,
sumMerge(Visits) AS Visits,
@ -85,3 +85,5 @@ You can create a materialized view like this and assign a normal view to it that
Note that in most cases, using `AggregatingMergeTree` is not justified, since queries can be run efficiently enough on non-aggregated data.
[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/aggregatingmergetree/) <!--hide-->

View File

@ -2,7 +2,7 @@
Buffers the data to write in RAM, periodically flushing it to another table. During the read operation, data is read from the buffer and the other table simultaneously.
```text
```
Buffer(database, table, num_layers, min_time, max_time, min_rows, max_rows, min_bytes, max_bytes)
```
@ -16,7 +16,7 @@ The conditions for flushing the data are calculated separately for each of the '
Example:
```sql
``` sql
CREATE TABLE merge.hits_buffer AS merge.hits ENGINE = Buffer(merge, hits, 16, 10, 100, 10000, 1000000, 10000000, 100000000)
```
@ -52,3 +52,5 @@ A Buffer table is used when too many INSERTs are received from a large number of
Note that it doesn't make sense to insert data one row at a time, even for Buffer tables. This will only produce a speed of a few thousand rows per second, while inserting larger blocks of data can produce over a million rows per second (see the section "Performance").
[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/buffer/) <!--hide-->

View File

@ -14,7 +14,7 @@ This is the main concept that allows Yandex.Metrica to work in real time.
CollapsingMergeTree accepts an additional parameter - the name of an Int8-type column that contains the row's "sign". Example:
```sql
``` sql
CollapsingMergeTree(EventDate, (CounterID, EventDate, intHash32(UniqID), VisitID), 8192, Sign)
```
@ -36,3 +36,5 @@ There are several ways to get completely "collapsed" data from a `CollapsingMerg
1. Write a query with GROUP BY and aggregate functions that accounts for the sign. For example, to calculate quantity, write 'sum(Sign)' instead of 'count()'. To calculate the sum of something, write 'sum(Sign * x)' instead of 'sum(x)', and so on, and also add 'HAVING sum(Sign) > 0'. Not all amounts can be calculated this way. For example, the aggregate functions 'min' and 'max' can't be rewritten.
2. If you must extract data without aggregation (for example, to check whether rows are present whose newest values match certain conditions), you can use the FINAL modifier for the FROM clause. This approach is significantly less efficient.
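A sketch of the first approach, with hypothetical table and column names:

``` sql
SELECT
    CounterID,
    sum(Sign) AS visits,
    sum(Sign * Duration) AS total_duration
FROM visits_collapsing
GROUP BY CounterID
HAVING sum(Sign) > 0
```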
[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/collapsingmergetree/) <!--hide-->

View File

@ -12,7 +12,7 @@ ENGINE [=] Name(...) [PARTITION BY expr] [ORDER BY expr] [SAMPLE BY expr] [SETTI
For MergeTree tables, the partition expression is specified after `PARTITION BY`, the primary key after `ORDER BY`, the sampling key after `SAMPLE BY`, and `SETTINGS` can specify `index_granularity` (optional; the default value is 8192), as well as other settings from [MergeTreeSettings.h](https://github.com/yandex/ClickHouse/blob/master/dbms/src/Storages/MergeTree/MergeTreeSettings.h). The other engine parameters are specified in parentheses after the engine name, as previously. Example:
```sql
``` sql
ENGINE = ReplicatedCollapsingMergeTree('/clickhouse/tables/name', 'replica1', Sign)
PARTITION BY (toMonday(StartDate), EventType)
ORDER BY (CounterID, StartDate, intHash32(UserID))
@ -27,7 +27,7 @@ After this table is created, merge will only work for data parts that have the s
To specify a partition in ALTER PARTITION commands, specify the value of the partition expression (or a tuple). Constants and constant expressions are supported. Example:
```sql
``` sql
ALTER TABLE table DROP PARTITION (toMonday(today()), 1)
```
@ -45,3 +45,5 @@ The partition ID is its string identifier (human-readable, if possible) that is
For more examples, see the tests [`00502_custom_partitioning_local`](https://github.com/yandex/ClickHouse/blob/master/dbms/tests/queries/0_stateless/00502_custom_partitioning_local.sql) and [`00502_custom_partitioning_replicated_zookeeper`](https://github.com/yandex/ClickHouse/blob/master/dbms/tests/queries/0_stateless/00502_custom_partitioning_replicated_zookeeper.sql).
[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/custom_partitioning_key/) <!--hide-->

@@ -39,7 +39,7 @@ As an example, consider a dictionary of `products` with the following configurat
Query the dictionary data:
-```sql
+``` sql
select name, type, key, attribute.names, attribute.types, bytes_allocated, element_count,source from system.dictionaries where name = 'products';
SELECT
@@ -73,7 +73,7 @@ CREATE TABLE %table_name% (%fields%) engine = Dictionary(%dictionary_name%)`
Usage example:
-```sql
+``` sql
create table products (product_id UInt64, title String) Engine = Dictionary(products);
CREATE TABLE products
@@ -92,7 +92,7 @@ Ok.
Take a look at what's in the table.
-```sql
+``` sql
select * from products limit 1;
SELECT *
@@ -108,3 +108,5 @@ LIMIT 1
1 rows in set. Elapsed: 0.006 sec.
```
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/dictionary/) <!--hide-->

@@ -7,7 +7,7 @@ Reading is automatically parallelized. During a read, the table indexes on remot
The Distributed engine accepts parameters: the cluster name in the server's config file, the name of a remote database, the name of a remote table, and (optionally) a sharding key.
Example:
-```text
+```
Distributed(logs, default, hits[, sharding_key])
```
@@ -122,3 +122,5 @@ If the server ceased to exist or had a rough restart (for example, after a devic
When the max_parallel_replicas option is enabled, query processing is parallelized across all replicas within a single shard. For more information, see the section "Settings, max_parallel_replicas".
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/distributed/) <!--hide-->
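For illustration, a hypothetical table built on this engine, assuming a `logs` cluster in the server config, an existing `default.hits` table on every shard, and `rand()` as the sharding key:

``` sql
CREATE TABLE hits_all AS default.hits
ENGINE = Distributed(logs, default, hits, rand())
```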

@@ -60,3 +60,5 @@ curl -F 'passwd=@passwd.tsv;' 'http://localhost:8123/?query=SELECT+shell,+count(
For distributed query processing, the temporary tables are sent to all the remote servers.
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/external_data/) <!--hide-->

@@ -31,7 +31,7 @@ You may manually create this subfolder and file in server filesystem and then [A
**1.** Set up the `file_engine_table` table:
-```sql
+``` sql
CREATE TABLE file_engine_table (name String, value UInt32) ENGINE=File(TabSeparated)
```
@@ -47,11 +47,11 @@ two 2
**3.** Query the data:
-```sql
+``` sql
SELECT * FROM file_engine_table
```
-```text
+```
┌─name─┬─value─┐
│ one  │     1 │
│ two  │     2 │
@@ -76,3 +76,5 @@ $ echo -e "1,2\n3,4" | clickhouse-local -q "CREATE TABLE table (a Int64, b Int64
- `SELECT ... SAMPLE`
- Indices
- Replication
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/file/) <!--hide-->

@@ -27,7 +27,7 @@ The Graphite data table must contain the following fields at minimum:
Rollup pattern:
-```text
+```
pattern
    regexp
    function
@@ -84,3 +84,5 @@ Example of settings:
</graphite_rollup>
```
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/graphitemergetree/) <!--hide-->

@@ -14,3 +14,5 @@ The table engine (type of table) determines:
When reading, the engine is only required to output the requested columns, but in some cases the engine can partially process data when responding to the request.
For most serious tasks, you should use engines from the `MergeTree` family.
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/) <!--hide-->

@@ -2,7 +2,7 @@
A prepared data structure for JOIN that is always located in RAM.
-```text
+```
Join(ANY|ALL, LEFT|INNER, k1[, k2, ...])
```
@@ -15,3 +15,5 @@ You can use INSERT to add data to the table, similar to the Set engine. For ANY,
Storing data on the disk is the same as for the Set engine.
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/join/) <!--hide-->
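A minimal end-to-end sketch (hypothetical table and column names); the join kind and key columns in the query must match the ones the table was created with:

``` sql
CREATE TABLE id_to_name (id UInt32, name String) ENGINE = Join(ANY, LEFT, id)

INSERT INTO id_to_name VALUES (1, 'one'), (2, 'two')

SELECT id, name
FROM (SELECT toUInt32(number) AS id FROM system.numbers LIMIT 3)
ANY LEFT JOIN id_to_name USING id
```

Rows without a match receive default values (an empty string for `name`).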

@@ -44,7 +44,7 @@ Optional parameters:
Examples:
-```sql
+``` sql
CREATE TABLE queue (
    timestamp UInt64,
    level String,
@@ -86,7 +86,7 @@ When the `MATERIALIZED VIEW` joins the engine, it starts collecting data in the
Example:
-```sql
+``` sql
CREATE TABLE queue (
    timestamp UInt64,
    level String,
@@ -136,3 +136,5 @@ Similar to GraphiteMergeTree, the Kafka engine supports extended configuration u
```
For a list of possible configuration options, see the [librdkafka configuration reference](https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md). Use the underscore (`_`) instead of a dot in the ClickHouse configuration. For example, `check.crcs=true` will be `<check_crcs>true</check_crcs>`.
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/kafka/) <!--hide-->

@@ -4,3 +4,5 @@ Log differs from TinyLog in that a small file of "marks" resides with the column
For concurrent data access, the read operations can be performed simultaneously, while write operations block reads and each other.
The Log engine does not support indexes. Similarly, if writing to a table failed, the table is broken, and reading from it returns an error. The Log engine is appropriate for temporary data, write-once tables, and for testing or demonstration purposes.
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/log/) <!--hide-->

@@ -2,3 +2,5 @@
Used for implementing materialized views (for more information, see [CREATE TABLE](../../query_language/create.md#query_language-queries-create_table)). For storing data, it uses a different engine that was specified when creating the view. When reading from a table, it just uses this engine.
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/materializedview/) <!--hide-->

@@ -9,3 +9,5 @@ Normally, using this table engine is not justified. However, it can be used for
The Memory engine is used by the system for temporary tables with external query data (see the section "External data for processing a query"), and for implementing GLOBAL IN (see the section "IN operators").
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/memory/) <!--hide-->

@@ -65,3 +65,5 @@ The `Merge` type table contains a virtual `_table` column of the `String` type.
If the `WHERE/PREWHERE` clause contains conditions for the `_table` column that do not depend on other table columns (as one of the conjunction elements, or as an entire expression), these conditions are used as an index. The conditions are performed on a data set of table names to read data from, and the read operation will be performed from only those tables that the condition was triggered on.
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/merge/) <!--hide-->
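For illustration, a sketch of the `_table` virtual column, assuming several hypothetical `WatchLog_*` tables in the current database:

``` sql
CREATE TABLE watch_log_all (date Date, UserID UInt64) ENGINE = Merge(currentDatabase(), '^WatchLog')

-- The condition on _table acts as an index: only the matching table is read.
SELECT count()
FROM watch_log_all
WHERE _table = 'WatchLog_2018_10'
```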

@@ -156,7 +156,7 @@ ENGINE MergeTree() PARTITION BY toYYYYMM(EventDate) ORDER BY (CounterID, EventDa
In this case, in queries:
-```sql
+``` sql
SELECT count() FROM table WHERE EventDate = toDate(now()) AND CounterID = 34
SELECT count() FROM table WHERE EventDate = toDate(now()) AND (CounterID = 34 OR CounterID = 42)
SELECT count() FROM table WHERE ((EventDate >= toDate('2014-01-01') AND EventDate <= toDate('2014-01-31')) OR EventDate = toDate('2014-05-01')) AND CounterID IN (101500, 731962, 160656) AND (CounterID = 101500 OR EventDate != toDate('2014-05-01'))
@@ -168,7 +168,7 @@ The queries above show that the index is used even for complex expressions. Read
In the example below, the index can't be used.
-```sql
+``` sql
SELECT count() FROM table WHERE CounterID = 34 OR URL LIKE '%upyachka%'
```
@@ -182,3 +182,5 @@ For concurrent table access, we use multi-versioning. In other words, when a tab
Reading from a table is automatically parallelized.
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/mergetree/) <!--hide-->

@@ -26,3 +26,5 @@ The rest of the conditions and the `LIMIT` sampling constraint are executed in C
The `MySQL` engine does not support the [Nullable](../../data_types/nullable.md#data_type-nullable) data type, so when reading data from MySQL tables, `NULL` is converted to default values for the specified column type (usually 0 or an empty string).
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/mysql/) <!--hide-->

@@ -4,3 +4,5 @@ When writing to a Null table, data is ignored. When reading from a Null table, t
However, you can create a materialized view on a Null table. So the data written to the table will end up in the view.
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/null/) <!--hide-->
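A minimal sketch of that pattern (hypothetical names): the Null table discards the inserted rows, but the materialized view captures them on the way through:

``` sql
CREATE TABLE events_sink (ts DateTime, msg String) ENGINE = Null

CREATE MATERIALIZED VIEW error_log
ENGINE = MergeTree() ORDER BY ts
AS SELECT ts, msg FROM events_sink WHERE msg LIKE '%error%'

-- The row is dropped by events_sink itself, but lands in error_log.
INSERT INTO events_sink VALUES (now(), 'disk error')
```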

@@ -6,7 +6,7 @@ The last optional parameter for the table engine is the version column. When mer
The version column must have a type from the `UInt` family, `Date`, or `DateTime`.
-```sql
+``` sql
ReplacingMergeTree(EventDate, (OrderID, EventDate, BannerID, ...), 8192, ver)
```
@@ -16,3 +16,5 @@ Thus, `ReplacingMergeTree` is suitable for clearing out duplicate data in the b
*This engine is not used in Yandex.Metrica, but it has been applied in other Yandex projects.*
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/replacingmergetree/) <!--hide-->
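For illustration (hypothetical names), using the signature above; since collapsing of duplicates happens at an unknown time during background merges, a query that needs the latest version regardless of merge state can select it explicitly with `argMax`:

``` sql
CREATE TABLE order_state (
    EventDate Date,
    OrderID UInt64,
    State String,
    ver UInt32
) ENGINE = ReplacingMergeTree(EventDate, (OrderID, EventDate), 8192, ver)

-- Latest state per order, whether or not merges have deduplicated yet.
SELECT OrderID, argMax(State, ver) AS State
FROM order_state
GROUP BY OrderID
```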

@@ -78,7 +78,7 @@ Two parameters are also added in the beginning of the parameters list the pa
Example:
-```text
+```
ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/hits', '{replica}', EventDate, intHash32(UserID), (CounterID, EventDate, intHash32(UserID), EventTime), 8192)
```
@@ -180,3 +180,5 @@ After this table is created, you can launch the server, create a `MergeTree` table, move the data
## Recovery When Metadata in The ZooKeeper Cluster is Lost or Damaged
If the data in ZooKeeper was lost or damaged, you can save data by moving it to an unreplicated table as described above.
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/replication/) <!--hide-->

@@ -9,3 +9,5 @@ Data is always located in RAM. For INSERT, the blocks of inserted data are also
For a rough server restart, the block of data on the disk might be lost or damaged. In the latter case, you may need to manually delete the file with damaged data.
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/set/) <!--hide-->

@@ -4,13 +4,13 @@
This engine differs from `MergeTree` in that it totals data while merging.
-```sql
+``` sql
SummingMergeTree(EventDate, (OrderID, EventDate, BannerID, ...), 8192)
```
The columns to total are implicit. When merging, all rows with the same primary key value (in the example, OrderID, EventDate, BannerID, ...) have their values totaled in numeric columns that are not part of the primary key.
-```sql
+``` sql
SummingMergeTree(EventDate, (OrderID, EventDate, BannerID, ...), 8192, (Shows, Clicks, Cost, ...))
```
@@ -32,7 +32,7 @@ Then this nested table is interpreted as a mapping of key `=>` (values...), and
Examples:
-```text
+```
[(1, 100)] + [(2, 150)] -> [(1, 100), (2, 150)]
[(1, 100)] + [(1, 150)] -> [(1, 250)]
[(1, 100)] + [(1, 150), (2, 150)] -> [(1, 250), (2, 150)]
@@ -45,3 +45,5 @@ For nested data structures, you don't need to specify the columns as a list of c
This table engine is not particularly useful. Remember that when saving just pre-aggregated data, you lose some of the system's advantages.
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/summingmergetree/) <!--hide-->
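For illustration, a small end-to-end sketch (hypothetical names). Summation only happens when parts are merged, so both rows may coexist for a while and queries should still aggregate explicitly:

``` sql
CREATE TABLE banner_stats (
    EventDate Date,
    BannerID UInt32,
    Shows UInt64,
    Clicks UInt64
) ENGINE = SummingMergeTree(EventDate, (EventDate, BannerID), 8192)

INSERT INTO banner_stats VALUES ('2018-10-16', 1, 10, 1)
INSERT INTO banner_stats VALUES ('2018-10-16', 1, 5, 0)

SELECT EventDate, BannerID, sum(Shows) AS Shows, sum(Clicks) AS Clicks
FROM banner_stats
GROUP BY EventDate, BannerID
```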

@@ -17,3 +17,5 @@ The situation when you have a large number of small tables guarantees poor produ
In Yandex.Metrica, TinyLog tables are used for intermediary data that is processed in small batches.
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/tinylog/) <!--hide-->

@@ -23,7 +23,7 @@ respectively. For processing `POST` requests, the remote server must support
**1.** Create a `url_engine_table` table on the server:
-```sql
+``` sql
CREATE TABLE url_engine_table (word String, value UInt64)
ENGINE=URL('http://127.0.0.1:12345/', CSV)
```
@@ -53,11 +53,11 @@ python3 server.py
**3.** Request data:
-```sql
+``` sql
SELECT * FROM url_engine_table
```
-```text
+```
┌─word──┬─value─┐
│ Hello │     1 │
│ World │     2 │
@@ -71,3 +71,5 @@ SELECT * FROM url_engine_table
- `ALTER` and `SELECT...SAMPLE` operations.
- Indexes.
- Replication.
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/url/) <!--hide-->

@@ -2,3 +2,5 @@
Used for implementing views (for more information, see the `CREATE VIEW query`). It does not store data, but only stores the specified `SELECT` query. When reading from a table, it runs this query (and deletes all unnecessary columns from the query).
+[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/view/) <!--hide-->

@@ -178,7 +178,7 @@ dynamicConfigFile=/etc/zookeeper-{{ cluster['name'] }}/conf/zoo.cfg.dynamic
Java version:
-```text
+```
Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
```
@@ -226,7 +226,7 @@ JAVA_OPTS="-Xms{{ cluster.get('xms','128M') }} \
Salt init:
-```text
+```
description "zookeeper-{{ cluster['name'] }} centralized coordination service"
start on runlevel [2345]
@@ -255,3 +255,5 @@ script
end script
```
+[Original article](https://clickhouse.yandex/docs/en/operations/tips/) <!--hide-->

@@ -159,3 +159,5 @@ Parameters:
`clickhouse-copier` tracks the changes in `/task/path/description` and applies them on the fly. For instance, if you change the value of `max_workers`, the number of processes running tasks will also change.
+[Original article](https://clickhouse.yandex/docs/en/operations/utils/clickhouse-copier/) <!--hide-->

@@ -71,3 +71,5 @@ Read 186 rows, 4.15 KiB in 0.035 sec., 5302 rows/sec., 118.34 KiB/sec.
├──────────┼──────────┤
...
```
+[Original article](https://clickhouse.yandex/docs/en/operations/utils/clickhouse-local/) <!--hide-->

@@ -3,3 +3,5 @@
* [clickhouse-local](clickhouse-local.md#utils-clickhouse-local) — Allows running SQL queries on data without stopping the ClickHouse server, similar to how `awk` does this.
* [clickhouse-copier](clickhouse-copier.md#utils-clickhouse-copier) — Copies (and reshards) data from one cluster to another cluster.
+[Original article](https://clickhouse.yandex/docs/en/operations/utils/) <!--hide-->

@@ -38,3 +38,5 @@ Merges the intermediate aggregation states in the same way as the -Merge combina
Converts an aggregate function for tables into an aggregate function for arrays that aggregates the corresponding array items and returns an array of results. For example, `sumForEach` for the arrays `[1, 2]`, `[3, 4, 5]` and `[6, 7]` returns the result `[10, 13, 5]` after adding together the corresponding array items.
+[Original article](https://clickhouse.yandex/docs/en/query_language/agg_functions/combinators/) <!--hide-->
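A runnable sketch of the `-ForEach` example above; the subquery just synthesizes the three array rows:

``` sql
SELECT sumForEach(arr)
FROM
(
    SELECT [1, 2] AS arr
    UNION ALL SELECT [3, 4, 5]
    UNION ALL SELECT [6, 7]
)
```

The result is `[10, 13, 5]`.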

@@ -61,3 +61,5 @@ FROM t_null_big
`groupArray` does not include `NULL` in the resulting array.
+[Original article](https://clickhouse.yandex/docs/en/query_language/agg_functions/) <!--hide-->

@@ -23,7 +23,7 @@ Example: `sequenceMatch ('(?1).*(?2)')(EventTime, URL LIKE '%company%', URL LIKE
This is a singular example. You could write it using other aggregate functions:
-```text
+```
minIf(EventTime, URL LIKE '%company%') < maxIf(EventTime, URL LIKE '%cart%').
```
@@ -151,7 +151,9 @@ It works as fast as possible, except for cases when a large N value is used and
Usage example:
-```text
+```
Problem: Generate a report that shows only keywords that produced at least 5 unique users.
Solution: Write in the GROUP BY query SearchPhrase HAVING uniqUpTo(4)(UserID) >= 5
```
+[Original article](https://clickhouse.yandex/docs/en/query_language/agg_functions/parametric_functions/) <!--hide-->
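Written out as a complete query (assuming a hypothetical `SearchLog` table with `SearchPhrase` and `UserID` columns):

``` sql
SELECT SearchPhrase
FROM SearchLog
GROUP BY SearchPhrase
HAVING uniqUpTo(4)(UserID) >= 5
```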

@@ -35,7 +35,7 @@ anyHeavy(column)
Take the [OnTime](../../getting_started/example_datasets/ontime.md#example_datasets-ontime) data set and select any frequently occurring value in the `AirlineID` column.
-```sql
+``` sql
SELECT anyHeavy(AirlineID) AS res
FROM ontime
```
@@ -101,7 +101,7 @@ Returns a tuple of two arrays: keys in sorted order, and values summed for
Example:
-```sql
+``` sql
CREATE TABLE sum_map(
    date Date,
    timeslot DateTime,
@@ -122,7 +122,7 @@ FROM sum_map
GROUP BY timeslot
```
-```text
+```
┌────────────timeslot─┬─sumMap(statusMap.status, statusMap.requests)─┐
│ 2000-01-01 00:00:00 │ ([1,2,3,4,5],[10,10,20,10,10])               │
│ 2000-01-01 00:01:00 │ ([4,5,6,7,8],[10,10,20,10,10])               │
@@ -325,7 +325,7 @@ We recommend using the `N < 10 ` value; performance is reduced with large `N` va
Take the [OnTime](../../getting_started/example_datasets/ontime.md#example_datasets-ontime) data set and select the three most frequently occurring values in the `AirlineID` column.
-```sql
+``` sql
SELECT topK(3)(AirlineID) AS res
FROM ontime
```
@@ -350,3 +350,5 @@ Calculates the value of `Σ((x - x̅)(y - y̅)) / n`.
Calculates the Pearson correlation coefficient: `Σ((x - x̅)(y - y̅)) / sqrt(Σ((x - x̅)^2) * Σ((y - y̅)^2))`.
+[Original article](https://clickhouse.yandex/docs/en/query_language/agg_functions/reference/) <!--hide-->
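A quick sketch of these two functions (`covarPop` and `corr` in ClickHouse) on synthetic data; a perfectly linear relationship gives a correlation of 1:

``` sql
SELECT corr(x, y), covarPop(x, y)
FROM
(
    SELECT number AS x, 2 * number + 1 AS y
    FROM system.numbers
    LIMIT 100
)
```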

@@ -8,7 +8,7 @@ The `ALTER` query is only supported for `*MergeTree` tables, as well as `Merge`a
Changing the table structure.
-```sql
+``` sql
ALTER TABLE [db].name [ON CLUSTER cluster] ADD|DROP|MODIFY COLUMN ...
```
@@ -17,7 +17,7 @@ Each action is an operation on a column.
The following actions are supported:
-```sql
+``` sql
ADD COLUMN name [type] [default_expr] [AFTER name_after]
```
@@ -27,14 +27,14 @@ Adding a column just changes the table structure, without performing any actions
This approach allows us to complete the ALTER query instantly, without increasing the volume of old data.
-```sql
+``` sql
DROP COLUMN name
```
Deletes the column with the name 'name'.
Deletes data from the file system. Since this deletes entire files, the query is completed almost instantly.
-```sql
+``` sql
MODIFY COLUMN name [type] [default_expr]
```
@@ -86,7 +86,7 @@ A "part" in the table is part of the data from a single partition, sorted by the
You can use the `system.parts` table to view the set of table parts and partitions:
-```sql
+``` sql
SELECT * FROM system.parts WHERE active
```
@@ -123,7 +123,7 @@ For replicated tables, the set of parts can't be changed in any case.
The `detached` directory contains parts that are not used by the server - detached from the table using the `ALTER ... DETACH` query. Parts that are damaged are also moved to this directory, instead of deleting them. You can add, delete, or modify the data in the 'detached' directory at any time; the server won't know about this until you make the `ALTER TABLE ... ATTACH` query.
-```sql
+``` sql
ALTER TABLE [db.]table DETACH PARTITION 'name'
```
@@ -134,13 +134,13 @@ After the query is executed, you can do whatever you want with the data in the '
The query is replicated: data will be moved to the 'detached' directory and forgotten on all replicas. The query can only be sent to a leader replica. To find out if a replica is a leader, perform SELECT to the 'system.replicas' system table. Alternatively, it is easier to make a query on all replicas, and all except one will throw an exception.
-```sql
+``` sql
ALTER TABLE [db.]table DROP PARTITION 'name'
```
The same as the `DETACH` operation. Deletes data from the table. Data parts will be tagged as inactive and will be completely deleted in approximately 10 minutes. The query is replicated: data will be deleted on all replicas.
-```sql
+``` sql
ALTER TABLE [db.]table ATTACH PARTITION|PART 'name'
```
@@ -152,7 +152,7 @@ The query is replicated. Each replica checks whether there is data in the 'detac
So you can put data in the 'detached' directory on one replica, and use the ALTER ... ATTACH query to add it to the table on all replicas.
-```sql
+``` sql
ALTER TABLE [db.]table FREEZE PARTITION 'name'
```
@@ -196,7 +196,7 @@ For protection from device failures, you must use replication. For more informat
Backups protect against human error (accidentally deleting data, deleting the wrong data or in the wrong cluster, or corrupting data).
For high-volume databases, it can be difficult to copy backups to remote servers. In such cases, to protect from human error, you can keep a backup on the same server (it will reside in `/var/lib/clickhouse/shadow/`).
-```sql
+``` sql
ALTER TABLE [db.]table FETCH PARTITION 'name' FROM 'path-in-zookeeper'
```
@@ -232,13 +232,13 @@ Existing tables are ready for mutations as-is (no conversion necessary), but aft
Currently available commands:
-```sql
+``` sql
ALTER TABLE [db.]table DELETE WHERE filter_expr
```
The `filter_expr` must be of type UInt8. The query deletes rows in the table for which this expression takes a non-zero value.
-```sql
+``` sql
ALTER TABLE [db.]table UPDATE column1 = expr1 [, ...] WHERE filter_expr
```
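For illustration, concrete forms of both mutations against a hypothetical `hits` table, plus a progress check in the `system.mutations` table described below:

``` sql
ALTER TABLE hits DELETE WHERE EventDate < toDate('2018-01-01')

ALTER TABLE hits UPDATE Title = 'n/a' WHERE UserID = 0

SELECT mutation_id, command, parts_to_do, is_done
FROM system.mutations
WHERE table = 'hits'
```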
@@ -272,3 +272,5 @@ The table contains information about mutations of MergeTree tables and their pro
**is_done** - Is the mutation done? Note that even if `parts_to_do = 0` it is possible that a mutation of a replicated table is not done yet because of a long-running INSERT that will create a new data part that will need to be mutated.
+[Original article](https://clickhouse.yandex/docs/en/query_language/alter/) <!--hide-->

@@ -2,7 +2,7 @@
Creating db_name databases
-```sql
+``` sql
CREATE DATABASE [IF NOT EXISTS] db_name
```
@@ -15,7 +15,7 @@ If `IF NOT EXISTS` is included, the query won't return an error if the database
The `CREATE TABLE` query can have several forms.
-```sql
+``` sql
CREATE [TEMPORARY] TABLE [IF NOT EXISTS] [db.]name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
@@ -30,13 +30,13 @@ The structure of the table is a list of column descriptions. If indexes are supp
A column description is `name type` in the simplest case. Example: `RegionID UInt32`.
Expressions can also be defined for default values (see below).
-```sql
+``` sql
CREATE [TEMPORARY] TABLE [IF NOT EXISTS] [db.]name AS [db2.]name2 [ENGINE = engine]
```
Creates a table with the same structure as another table. You can specify a different engine for the table. If the engine is not specified, the same engine will be used as for the `db2.name2` table.
-```sql
+``` sql
CREATE [TEMPORARY] TABLE [IF NOT EXISTS] [db.]name ENGINE = engine AS SELECT ...
```
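For illustration, a concrete instance of the last form, assuming an existing `hits` table; the structure of `daily_hits` is inferred from the SELECT, and the table is filled with its result:

``` sql
CREATE TABLE daily_hits ENGINE = Log
AS SELECT EventDate, count() AS c
FROM hits
GROUP BY EventDate
```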
@@ -97,7 +97,7 @@ Distributed DDL queries (ON CLUSTER clause)
The `CREATE`, `DROP`, `ALTER`, and `RENAME` queries support distributed execution on a cluster.
For example, the following query creates the `all_hits` `Distributed` table on each host in `cluster`:
-```sql
+``` sql
CREATE TABLE IF NOT EXISTS all_hits ON CLUSTER cluster (p Date, i Int32) ENGINE = Distributed(cluster, default, hits)
```
@@ -107,7 +107,7 @@ The local version of the query will eventually be implemented on each host in th
## CREATE VIEW
-```sql
+``` sql
CREATE [MATERIALIZED] VIEW [IF NOT EXISTS] [db.]name [TO[db.]name] [ENGINE = engine] [POPULATE] AS SELECT ...
```
@@ -121,19 +121,19 @@ Normal views don't store any data, but just perform a read from another table. I
As an example, assume you've created a view:
-```sql
+``` sql
CREATE VIEW view AS SELECT ...
```
and written a query:
-```sql
+``` sql
SELECT a, b, c FROM view
```
This query is fully equivalent to using the subquery:
-```sql
+``` sql
SELECT a, b, c FROM (SELECT ...)
```
@@ -152,3 +152,5 @@ The execution of `ALTER` queries on materialized views has not been fully develo
Views look the same as normal tables. For example, they are listed in the result of the `SHOW TABLES` query.
There isn't a separate query for deleting views. To delete a view, use `DROP TABLE`.
+[Original article](https://clickhouse.yandex/docs/en/query_language/create/) <!--hide-->

@@ -41,3 +41,5 @@ See also "[Functions for working with external dictionaries](../functions/ext_di
!!! attention
    You can convert values for a small dictionary by describing it in a `SELECT` query (see the [transform](../functions/other_functions.md#other_functions-transform) function). This functionality is not related to external dictionaries.
+[Original article](https://clickhouse.yandex/docs/en/query_language/dicts/external_dicts/) <!--hide-->

@@ -31,3 +31,5 @@ The dictionary configuration has the following structure:
- [layout](external_dicts_dict_layout.md#dicts-external_dicts_dict_layout) — Dictionary layout in memory.
- [structure](external_dicts_dict_structure.md#dicts-external_dicts_dict_structure) — Structure of the dictionary. A key and attributes that can be retrieved by this key.
- [lifetime](external_dicts_dict_lifetime.md#dicts-external_dicts_dict_lifetime) — Frequency of dictionary updates.
+[Original article](https://clickhouse.yandex/docs/en/query_language/dicts/external_dicts_dict/) <!--hide-->

@@ -292,3 +292,5 @@ dictGetString('prefix', 'asn', tuple(IPv6StringToNum('2001:db8::1')))
Other types are not supported yet. The function returns the attribute for the prefix that corresponds to this IP address. If there are overlapping prefixes, the most specific one is returned.
Data is stored in a `trie`. It must completely fit into RAM.
+[Original article](https://clickhouse.yandex/docs/en/query_language/dicts/external_dicts_dict_layout/) <!--hide-->

@@ -57,3 +57,5 @@ Example of settings:
</dictionary>
```
+[Original article](https://clickhouse.yandex/docs/en/query_language/dicts/external_dicts_dict_lifetime/) <!--hide-->

@@ -427,3 +427,5 @@ Setting fields:
- `password` – Password of the MongoDB user.
- `db` – Name of the database.
- `collection` – Name of the collection.
+[Original article](https://clickhouse.yandex/docs/en/query_language/dicts/external_dicts_dict_sources/) <!--hide-->

Some files were not shown because too many files have changed in this diff.