WIP on docs/website (#3383)
* CLICKHOUSE-4063: less manual html @ index.md
* CLICKHOUSE-4063: recommend markdown="1" in README.md
* CLICKHOUSE-4003: manually purge custom.css for now
* CLICKHOUSE-4064: expand <details> before any print (including to pdf)
* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit
* CLICKHOUSE-3306: add few http headers
* Remove copy-paste introduced in #3392
* Hopefully better chinese fonts #3392
* get rid of tabs @ custom.css
* Apply comments and patch from #3384
* Add jdbc.md to ToC and some translation, though it still looks badly incomplete
* minor punctuation
* Add some backlinks to official website from mirrors that just blindly take markdown sources
* Do not make fonts extra light
* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}
* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}
* Remove outdated stuff from roadmap.md
* Not so light font on front page too
* Refactor Chinese formats.md to match recent changes in other languages
This commit is contained in:
parent 3359ba06c3
commit 8623cb232c
@@ -48,7 +48,7 @@ Some additional configuration has to be done to actually make new language live

* Inline piece of code is <code>`in backticks`</code>.
* Multiline code blocks are <code>```in triple backtick quotes ```</code>.
* Brightly highlighted block of text starts with `!!! info "Header"`, on next line 4 spaces and content. Instead of `info` can be `warning`.
* Hide block to be opened by click: `<details> <summary>Header</summary> hidden content</details>`.
* Hide block to be opened by click: `<details markdown="1"> <summary>Header</summary> hidden content</details>`.
* Colored text: `<span style="color: red;">text</span>`.
* Additional anchor to be linked to: `<a name="my_anchor"></a>`; for headers fully in English, anchors are created automatically, like `"FoO Bar" -> "foo-bar"`.
* Table:

@@ -83,3 +83,5 @@ Code: 386. DB::Exception: Received from localhost:9000, 127.0.0.1. DB::Exception
0 rows in set. Elapsed: 0.246 sec.
```

[Original article](https://clickhouse.yandex/docs/en/data_types/array/) <!--hide-->

@@ -2,3 +2,5 @@

There isn't a separate type for boolean values. They use the UInt8 type, restricted to the values 0 or 1.
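
For illustration, comparison operators already return `UInt8` values of 0 or 1 that can be stored directly; a minimal sketch:

``` sql
-- Comparisons yield UInt8 (0 or 1), which doubles as the boolean type
SELECT 1 = 1 AS is_true, toTypeName(1 = 1) AS type
```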

[Original article](https://clickhouse.yandex/docs/en/data_types/boolean/) <!--hide-->

@@ -5,3 +5,5 @@ The minimum value is output as 0000-00-00.

The date is stored without the time zone.

[Original article](https://clickhouse.yandex/docs/en/data_types/date/) <!--hide-->

@@ -13,3 +13,5 @@ By default, the client switches to the timezone of the server when it connects.

So when working with a textual date (for example, when saving text dumps), keep in mind that there may be ambiguity during changes for daylight saving time, and there may be problems matching data if the time zone changed.
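
A quick way to observe this dependence on the session time zone (a sketch; the resulting Unix timestamp varies with the server's time zone setting):

``` sql
-- The same text parses to different moments depending on the active time zone
SELECT toDateTime('2014-09-30 23:00:00') AS t, toUInt32(t) AS unix_ts
```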

[Original article](https://clickhouse.yandex/docs/en/data_types/datetime/) <!--hide-->

@@ -95,3 +95,5 @@ SELECT toDecimal32(1, 8) < 100
```
DB::Exception: Can't compare.
```

[Original article](https://clickhouse.yandex/docs/en/data_types/decimal/) <!--hide-->

@@ -113,3 +113,5 @@ The Enum type can be changed without cost using ALTER, if only the set of values

Using ALTER, it is possible to change an Enum8 to an Enum16 or vice versa, just like changing an Int8 to Int16.
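
As a sketch (the table, column, and value set here are hypothetical):

``` sql
-- Widen a hypothetical Enum8 column to Enum16, extending its value set
ALTER TABLE my_table MODIFY COLUMN status Enum16('new' = 1, 'done' = 2, 'archived' = 3)
```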

[Original article](https://clickhouse.yandex/docs/en/data_types/enum/) <!--hide-->

@@ -8,3 +8,5 @@ Note that this behavior differs from MySQL behavior for the CHAR type (where str

Fewer functions can work with the FixedString(N) type than with String, so it is less convenient to use.
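
For example (a minimal sketch), the null-byte padding is visible through `length`, which always reports N bytes:

``` sql
-- toFixedString pads the value with null bytes up to N; length() counts all N bytes
SELECT toFixedString('abc', 8) AS fs, length(fs) AS bytes
```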

[Original article](https://clickhouse.yandex/docs/en/data_types/fixedstring/) <!--hide-->

@@ -13,7 +13,7 @@ We recommend that you store data in integer form whenever possible. For example,

- Computations with floating-point numbers might produce a rounding error.

```sql
``` sql
SELECT 1 - 0.9
```

@@ -33,7 +33,7 @@ In contrast to standard SQL, ClickHouse supports the following categories of flo

- `Inf` – Infinity.

```sql
``` sql
SELECT 0.5 / 0
```

@@ -45,7 +45,7 @@ SELECT 0.5 / 0

- `-Inf` – Negative infinity.

```sql
``` sql
SELECT -0.5 / 0
```

@@ -69,3 +69,5 @@ SELECT 0 / 0

See the rules for `NaN` sorting in the section [ORDER BY clause](../query_language/select.md#query_language-queries-order_by).
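
A small sketch of the behavior (`0 / 0` evaluates to `NaN`; after sorting, the `NaN` values end up grouped together at one end of the result):

``` sql
-- NaN values are kept together at one end of the ordering
SELECT arrayJoin([1., 0 / 0, 2.]) AS x
ORDER BY x
```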

[Original article](https://clickhouse.yandex/docs/en/data_types/float/) <!--hide-->

@@ -6,3 +6,5 @@ ClickHouse can store various types of data in table cells.

This section describes the supported data types and special considerations when using and/or implementing them, if any.

[Original article](https://clickhouse.yandex/docs/en/data_types/) <!--hide-->

@@ -18,3 +18,5 @@ Fixed-length integers, with or without a sign.

- UInt32 - [0 : 4294967295]
- UInt64 - [0 : 18446744073709551615]

[Original article](https://clickhouse.yandex/docs/en/data_types/int_uint/) <!--hide-->

@@ -2,3 +2,5 @@

The intermediate state of an aggregate function. To get it, use aggregate functions with the '-State' suffix. For more information, see "AggregatingMergeTree".
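
A minimal sketch of the round trip (mirroring the `uniqState`/`uniqMerge` pair used in the AggregatingMergeTree examples further down in this diff; `table` and the column names stand in for real ones):

``` sql
-- uniqState produces intermediate states; uniqMerge combines them into the final count
SELECT uniqMerge(state) FROM (SELECT uniqState(UserID) AS state FROM table GROUP BY RegionID)
```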

[Original article](https://clickhouse.yandex/docs/en/data_types/nested_data_structures/aggregatefunction/) <!--hide-->

@@ -1,2 +1,4 @@
# Nested Data Structures

[Original article](https://clickhouse.yandex/docs/en/data_types/nested_data_structures/) <!--hide-->

@@ -4,7 +4,7 @@ A nested data structure is like a nested table. The parameters of a nested data

Example:

```sql
``` sql
CREATE TABLE test.visits
(
CounterID UInt32,

@@ -35,7 +35,7 @@ In most cases, when working with a nested data structure, its individual columns

Example:

```sql
``` sql
SELECT
Goals.ID,
Goals.EventTime

@@ -44,7 +44,7 @@ WHERE CounterID = 101500 AND length(Goals.ID) < 5
LIMIT 10
```

```text
```
┌─Goals.ID───────────────────────┬─Goals.EventTime───────────────────────────────────────────────────────────────────────────┐
│ [1073752,591325,591325] │ ['2014-03-17 16:38:10','2014-03-17 16:38:48','2014-03-17 16:42:27'] │
│ [1073752] │ ['2014-03-17 00:28:25'] │

@@ -63,7 +63,7 @@ It is easiest to think of a nested data structure as a set of multiple column ar

The only place where a SELECT query can specify the name of an entire nested data structure instead of individual columns is the ARRAY JOIN clause. For more information, see "ARRAY JOIN clause". Example:

```sql
``` sql
SELECT
Goal.ID,
Goal.EventTime

@@ -73,7 +73,7 @@ WHERE CounterID = 101500 AND length(Goals.ID) < 5
LIMIT 10
```

```text
```
┌─Goal.ID─┬──────Goal.EventTime─┐
│ 1073752 │ 2014-03-17 16:38:10 │
│ 591325 │ 2014-03-17 16:38:48 │

@@ -96,3 +96,5 @@ For a DESCRIBE query, the columns in a nested data structure are listed separate

The ALTER query is very limited for elements in a nested data structure.

[Original article](https://clickhouse.yandex/docs/en/data_types/nested_data_structures/nested/) <!--hide-->

@@ -53,3 +53,5 @@ FROM t_null

2 rows in set. Elapsed: 0.144 sec.
```

[Original article](https://clickhouse.yandex/docs/en/data_types/nullable/) <!--hide-->

@@ -2,3 +2,5 @@

Used for representing lambda expressions in higher-order functions.
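
A lambda like the one below exists only within the query as a value of this type; a minimal sketch:

``` sql
-- The lambda x -> x * 2 is an Expression value consumed by the higher-order function arrayMap
SELECT arrayMap(x -> x * 2, [1, 2, 3]) AS doubled
```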

[Original article](https://clickhouse.yandex/docs/en/data_types/special_data_types/expression/) <!--hide-->

@@ -2,3 +2,5 @@

Special data type values can't be saved to a table or output in results, but are used as the intermediate result of running a query.

[Original article](https://clickhouse.yandex/docs/en/data_types/special_data_types/) <!--hide-->

@@ -20,3 +20,5 @@ SELECT toTypeName([])

1 rows in set. Elapsed: 0.062 sec.
```

[Original article](https://clickhouse.yandex/docs/en/data_types/special_data_types/nothing/) <!--hide-->

@@ -2,3 +2,5 @@

Used for the right half of an IN expression.
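
For instance, in the sketch below the literal list on the right of `IN` is materialized once as a Set value and then probed for every row:

``` sql
-- The right-hand side of IN becomes a value of the special Set type
SELECT number FROM system.numbers WHERE number IN (2, 3, 5) LIMIT 3
```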

[Original article](https://clickhouse.yandex/docs/en/data_types/special_data_types/set/) <!--hide-->

@@ -12,3 +12,5 @@ If you need to store texts, we recommend using UTF-8 encoding. At the very least
Similarly, certain functions for working with strings have separate variations that work under the assumption that the string contains a set of bytes representing a UTF-8 encoded text.
For example, the 'length' function calculates the string length in bytes, while the 'lengthUTF8' function calculates the string length in Unicode code points, assuming that the value is UTF-8 encoded.
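
A one-line sketch of the difference ('é' occupies two bytes in UTF-8 but is a single code point):

``` sql
-- length counts bytes; lengthUTF8 counts Unicode code points
SELECT length('héllo') AS bytes, lengthUTF8('héllo') AS code_points
```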

[Original article](https://clickhouse.yandex/docs/en/data_types/string/) <!--hide-->

@@ -52,3 +52,5 @@ SELECT

1 rows in set. Elapsed: 0.002 sec.
```

[Original article](https://clickhouse.yandex/docs/en/data_types/tuple/) <!--hide-->

@@ -193,3 +193,5 @@ In addition, each replica stores its state in ZooKeeper as the set of parts and

> The ClickHouse cluster consists of independent shards, and each shard consists of replicas. The cluster is not elastic, so after adding a new shard, data is not rebalanced between shards automatically. Instead, the cluster load will be uneven. This implementation gives you more control, and it is fine for relatively small clusters such as tens of nodes. But for clusters with hundreds of nodes that we are using in production, this approach becomes a significant drawback. We should implement a table engine that will span its data across the cluster with dynamically replicated regions that could be split and balanced between clusters automatically.

[Original article](https://clickhouse.yandex/docs/en/development/architecture/) <!--hide-->

@@ -95,3 +95,5 @@ cd ..
To create an executable, run `ninja clickhouse`.
This will create the `dbms/programs/clickhouse` executable, which can be used with `client` or `server` arguments.

[Original article](https://clickhouse.yandex/docs/en/development/build/) <!--hide-->

@@ -79,3 +79,5 @@ Reboot.

To check if it's working, you can use the `ulimit -n` command.

[Original article](https://clickhouse.yandex/docs/en/development/build_osx/) <!--hide-->

@@ -1,2 +1,4 @@
# ClickHouse Development

[Original article](https://clickhouse.yandex/docs/en/development/) <!--hide-->

@@ -834,3 +834,5 @@ function(
const & RangesInDataParts ranges,
size_t limit)
```

[Original article](https://clickhouse.yandex/docs/en/development/style/) <!--hide-->

@@ -249,3 +249,5 @@ In Travis CI due to limit on time and computational power we can afford only sub
In Jenkins we run functional tests for each commit and for each pull request from trusted users; the same under ASan; we also run quorum tests, dictionary tests, and Metrica B2B tests. We use Jenkins to prepare and publish releases. It is worth noting that we are not happy with Jenkins at all.

One of our goals is to provide a reliable testing infrastructure that will be available to the community.

[Original article](https://clickhouse.yandex/docs/en/development/tests/) <!--hide-->

@@ -11,3 +11,5 @@ Distributed sorting is one of the main causes of reduced performance when runnin

Most MapReduce implementations allow you to execute arbitrary code on a cluster. But a declarative query language is better suited to OLAP in order to run experiments quickly. For example, Hadoop has Hive and Pig. Also consider Cloudera Impala or Shark (outdated) for Spark, as well as Spark SQL, Presto, and Apache Drill. Performance when running such tasks is highly sub-optimal compared to specialized systems, but their relatively high latency makes it unrealistic to use these systems as the backend for a web interface.

[Original article](https://clickhouse.yandex/docs/en/faq/general/) <!--hide-->

@@ -21,7 +21,7 @@ cd ..

Run the following ClickHouse queries:

```sql
``` sql
CREATE TABLE rankings_tiny
(
pageURL String,

@@ -96,7 +96,7 @@ for i in 5nodes/uservisits/*.deflate; do echo $i; zlib-flate -uncompress < $i |

Queries for obtaining data samples:

```sql
``` sql
SELECT pageURL, pageRank FROM rankings_1node WHERE pageRank > 1000

SELECT substring(sourceIP, 1, 8), sum(adRevenue) FROM uservisits_1node GROUP BY substring(sourceIP, 1, 8)

@@ -119,3 +119,5 @@ ORDER BY totalRevenue DESC
LIMIT 1
```

[Original article](https://clickhouse.yandex/docs/en/getting_started/example_datasets/amplab_benchmark/) <!--hide-->

@@ -4,7 +4,7 @@ Download the data from <http://labs.criteo.com/downloads/download-terabyte-click

Create a table to import the log to:

```sql
``` sql
CREATE TABLE criteo_log (date Date, clicked UInt8, int1 Int32, int2 Int32, int3 Int32, int4 Int32, int5 Int32, int6 Int32, int7 Int32, int8 Int32, int9 Int32, int10 Int32, int11 Int32, int12 Int32, int13 Int32, cat1 String, cat2 String, cat3 String, cat4 String, cat5 String, cat6 String, cat7 String, cat8 String, cat9 String, cat10 String, cat11 String, cat12 String, cat13 String, cat14 String, cat15 String, cat16 String, cat17 String, cat18 String, cat19 String, cat20 String, cat21 String, cat22 String, cat23 String, cat24 String, cat25 String, cat26 String) ENGINE = Log
```

@@ -16,7 +16,7 @@ for i in {00..23}; do echo $i; zcat datasets/criteo/day_${i#0}.gz | sed -r 's/^/

Create a table for the converted data:

```sql
``` sql
CREATE TABLE criteo
(
date Date,

@@ -65,9 +65,11 @@ CREATE TABLE criteo

Transform data from the raw log and put it in the second table:

```sql
``` sql
INSERT INTO criteo SELECT date, clicked, int1, int2, int3, int4, int5, int6, int7, int8, int9, int10, int11, int12, int13, reinterpretAsUInt32(unhex(cat1)) AS icat1, reinterpretAsUInt32(unhex(cat2)) AS icat2, reinterpretAsUInt32(unhex(cat3)) AS icat3, reinterpretAsUInt32(unhex(cat4)) AS icat4, reinterpretAsUInt32(unhex(cat5)) AS icat5, reinterpretAsUInt32(unhex(cat6)) AS icat6, reinterpretAsUInt32(unhex(cat7)) AS icat7, reinterpretAsUInt32(unhex(cat8)) AS icat8, reinterpretAsUInt32(unhex(cat9)) AS icat9, reinterpretAsUInt32(unhex(cat10)) AS icat10, reinterpretAsUInt32(unhex(cat11)) AS icat11, reinterpretAsUInt32(unhex(cat12)) AS icat12, reinterpretAsUInt32(unhex(cat13)) AS icat13, reinterpretAsUInt32(unhex(cat14)) AS icat14, reinterpretAsUInt32(unhex(cat15)) AS icat15, reinterpretAsUInt32(unhex(cat16)) AS icat16, reinterpretAsUInt32(unhex(cat17)) AS icat17, reinterpretAsUInt32(unhex(cat18)) AS icat18, reinterpretAsUInt32(unhex(cat19)) AS icat19, reinterpretAsUInt32(unhex(cat20)) AS icat20, reinterpretAsUInt32(unhex(cat21)) AS icat21, reinterpretAsUInt32(unhex(cat22)) AS icat22, reinterpretAsUInt32(unhex(cat23)) AS icat23, reinterpretAsUInt32(unhex(cat24)) AS icat24, reinterpretAsUInt32(unhex(cat25)) AS icat25, reinterpretAsUInt32(unhex(cat26)) AS icat26 FROM criteo_log;

DROP TABLE criteo_log;
```

[Original article](https://clickhouse.yandex/docs/en/getting_started/example_datasets/criteo/) <!--hide-->
File diff suppressed because one or more lines are too long
@@ -18,7 +18,7 @@ done

Creating a table:

```sql
``` sql
CREATE TABLE `ontime` (
`Year` UInt16,
`Quarter` UInt8,

@@ -142,37 +142,37 @@ Queries:

Q0.

```sql
``` sql
select avg(c1) from (select Year, Month, count(*) as c1 from ontime group by Year, Month);
```

Q1. The number of flights per day from the year 2000 to 2008

```sql
``` sql
SELECT DayOfWeek, count(*) AS c FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC;
```

Q2. The number of flights delayed by more than 10 minutes, grouped by the day of the week, for 2000-2008

```sql
``` sql
SELECT DayOfWeek, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC
```

Q3. The number of delays by airport for 2000-2008

```sql
``` sql
SELECT Origin, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY Origin ORDER BY c DESC LIMIT 10
```

Q4. The number of delays by carrier for 2007

```sql
``` sql
SELECT Carrier, count(*) FROM ontime WHERE DepDelay>10 AND Year = 2007 GROUP BY Carrier ORDER BY count(*) DESC
```

Q5. The percentage of delays by carrier for 2007

```sql
``` sql
SELECT Carrier, c, c2, c*1000/c2 as c3
FROM
(

@@ -198,13 +198,13 @@ ORDER BY c3 DESC;

Better version of the same query:

```sql
``` sql
SELECT Carrier, avg(DepDelay > 10) * 1000 AS c3 FROM ontime WHERE Year = 2007 GROUP BY Carrier ORDER BY Carrier
```

Q6. The previous request for a broader range of years, 2000-2008

```sql
``` sql
SELECT Carrier, c, c2, c*1000/c2 as c3
FROM
(

@@ -230,13 +230,13 @@ ORDER BY c3 DESC;

Better version of the same query:

```sql
``` sql
SELECT Carrier, avg(DepDelay > 10) * 1000 AS c3 FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY Carrier ORDER BY Carrier
```

Q7. Percentage of flights delayed for more than 10 minutes, by year

```sql
``` sql
SELECT Year, c1/c2
FROM
(

@@ -260,25 +260,25 @@ ORDER BY Year

Better version of the same query:

```sql
``` sql
SELECT Year, avg(DepDelay > 10) FROM ontime GROUP BY Year ORDER BY Year
```

Q8. The most popular destinations by the number of directly connected cities for various year ranges

```sql
``` sql
SELECT DestCityName, uniqExact(OriginCityName) AS u FROM ontime WHERE Year >= 2000 and Year <= 2010 GROUP BY DestCityName ORDER BY u DESC LIMIT 10;
```

Q9.

```sql
``` sql
select Year, count(*) as c1 from ontime group by Year;
```

Q10.

```sql
``` sql
select
min(Year), max(Year), Carrier, count(*) as cnt,
sum(ArrDelayMinutes>30) as flights_delayed,

@@ -296,7 +296,7 @@ LIMIT 1000;

Bonus:

```sql
``` sql
SELECT avg(cnt) FROM (SELECT Year,Month,count(*) AS cnt FROM ontime WHERE DepDel15=1 GROUP BY Year,Month)

select avg(c1) from (select Year,Month,count(*) as c1 from ontime group by Year,Month)

@@ -317,3 +317,5 @@ This performance test was created by Vadim Tkachenko. See:
- <https://www.percona.com/blog/2016/01/07/apache-spark-with-air-ontime-performance-data/>
- <http://nickmakos.blogspot.ru/2012/08/analyzing-air-traffic-performance-with.html>

[Original article](https://clickhouse.yandex/docs/en/getting_started/example_datasets/ontime/) <!--hide-->

@@ -21,7 +21,7 @@ Generating data:

Creating tables in ClickHouse:

```sql
``` sql
CREATE TABLE lineorder (
LO_ORDERKEY UInt32,
LO_LINENUMBER UInt8,

@@ -83,3 +83,5 @@ cat customer.tbl | sed 's/$/2000-01-01/' | clickhouse-client --query "INSERT INT
cat lineorder.tbl | clickhouse-client --query "INSERT INTO lineorder FORMAT CSV"
```

[Original article](https://clickhouse.yandex/docs/en/getting_started/example_datasets/star_schema/) <!--hide-->

@@ -4,7 +4,7 @@ See: <http://dumps.wikimedia.org/other/pagecounts-raw/>

Creating a table:

```sql
``` sql
CREATE TABLE wikistat
(
date Date,

@@ -25,3 +25,5 @@ cat links.txt | while read link; do wget http://dumps.wikimedia.org/other/pageco
ls -1 /opt/wikistat/ | grep gz | while read i; do echo $i; gzip -cd /opt/wikistat/$i | ./wikistat-loader --time="$(echo -n $i | sed -r 's/pagecounts-([0-9]{4})([0-9]{2})([0-9]{2})-([0-9]{2})([0-9]{2})([0-9]{2})\.gz/\1-\2-\3 \4-00-00/')" | clickhouse-client --query="INSERT INTO wikistat FORMAT TabSeparated"; done
```

[Original article](https://clickhouse.yandex/docs/en/getting_started/example_datasets/wikistat/) <!--hide-->

@@ -24,7 +24,7 @@ For testing and development, the system can be installed on a single server or o

In `/etc/apt/sources.list` (or in a separate `/etc/apt/sources.list.d/clickhouse.list` file), add the repository:

```text
```
deb http://repo.yandex.ru/clickhouse/deb/stable/ main/
```

@@ -51,14 +51,14 @@ To compile, follow the instructions: build.md
You can compile packages and install them.
You can also use programs without installing packages.

```text
```
Client: dbms/programs/clickhouse-client
Server: dbms/programs/clickhouse-server
```

For the server, create a catalog with data, such as:

```text
```
/opt/clickhouse/data/default/
/opt/clickhouse/metadata/default/
```

@@ -137,3 +137,5 @@ SELECT 1

To continue experimenting, you can try downloading one of the test data sets.

[Original article](https://clickhouse.yandex/docs/en/getting_started/) <!--hide-->

@@ -77,9 +77,8 @@ See the difference?

For example, the query "count the number of records for each advertising platform" requires reading one "advertising platform ID" column, which takes up 1 byte uncompressed. If most of the traffic was not from advertising platforms, you can expect at least 10-fold compression of this column. When using a quick compression algorithm, data decompression is possible at a speed of at least several gigabytes of uncompressed data per second. In other words, this query can be processed at a speed of approximately several billion rows per second on a single server. This speed is actually achieved in practice.

<details><summary>Example</summary>
<p>
<pre>
<details markdown="1"><summary>Example</summary>
```
$ clickhouse-client
ClickHouse client version 0.0.52053.
Connecting to localhost:9000.

@@ -120,9 +119,10 @@ LIMIT 20

20 rows in set. Elapsed: 0.153 sec. Processed 1.00 billion rows, 4.00 GB (6.53 billion rows/s., 26.10 GB/s.)

:)</pre>
:)
```

</p></details>
</details>

### CPU

@@ -138,3 +138,5 @@ There are two ways to do this:
This is not done in "normal" databases, because it doesn't make sense when running simple queries. However, there are exceptions. For example, MemSQL uses code generation to reduce latency when processing SQL queries. (For comparison, analytical DBMSs require optimization of throughput, not latency.)

Note that for CPU efficiency, the query language must be declarative (SQL or MDX), or at least vector-based (J, K). The query should only contain implicit loops, allowing for optimization.

[Original article](https://clickhouse.yandex/docs/en/) <!--hide-->

@@ -113,3 +113,5 @@ Example of a config file:
</config>
```

[Original article](https://clickhouse.yandex/docs/en/interfaces/cli/) <!--hide-->

@@ -32,31 +32,131 @@ The table below lists supported formats and how they can be used in `INSERT` and
| [XML](#xml) | ✗ | ✔ |
| [CapnProto](#capnproto) | ✔ | ✔ |

<a name="format_capnproto"></a>
<a name="tabseparated"></a>

## CapnProto
## TabSeparated

Cap'n Proto is a binary message format similar to Protocol Buffers and Thrift, but not like JSON or MessagePack.
In TabSeparated format, data is written by row. Each row contains values separated by tabs. Each value is followed by a tab, except the last value in the row, which is followed by a line feed. Strictly Unix line feeds are assumed everywhere. The last row also must contain a line feed at the end. Values are written in text format, without enclosing quotation marks, and with special characters escaped.

Cap'n Proto messages are strictly typed and not self-describing, meaning they need an external schema description. The schema is applied on the fly and cached for each query.
This format is also available under the name `TSV`.

```sql
SELECT SearchPhrase, count() AS c FROM test.hits
GROUP BY SearchPhrase FORMAT CapnProto SETTINGS schema = 'schema:Message'
The `TabSeparated` format is convenient for processing data using custom programs and scripts. It is used by default in the HTTP interface, and in the command-line client's batch mode. This format also allows transferring data between different DBMSs. For example, you can get a dump from MySQL and upload it to ClickHouse, or vice versa.

The `TabSeparated` format supports outputting total values (when using WITH TOTALS) and extreme values (when 'extremes' is set to 1). In these cases, the total values and extremes are output after the main data. The main result, total values, and extremes are separated from each other by an empty line. Example:

``` sql
SELECT EventDate, count() AS c FROM test.hits GROUP BY EventDate WITH TOTALS ORDER BY EventDate FORMAT TabSeparated
```

Where `schema.capnp` looks like this:

```
struct Message {
SearchPhrase @0 :Text;
c @1 :UInt64;
}
2014-03-17 1406958
2014-03-18 1383658
2014-03-19 1405797
2014-03-20 1353623
2014-03-21 1245779
2014-03-22 1031592
2014-03-23 1046491

0000-00-00 8873898

2014-03-17 1031592
2014-03-23 1406958
```

The schema file must be located in the directory specified in [format_schema_path](../operations/server_settings/settings.md#server_settings-format_schema_path) in the server configuration.
### Data formatting

Integer numbers are written in decimal form. Numbers can contain an extra "+" character at the beginning (ignored when parsing, and not recorded when formatting). Non-negative numbers can't contain the negative sign. When reading, it is allowed to parse an empty string as a zero, or (for signed types) a string consisting of just a minus sign as a zero. Numbers that do not fit into the corresponding data type may be parsed as a different number, without an error message.

Floating-point numbers are written in decimal form. The dot is used as the decimal separator. Exponential entries are supported, as are 'inf', '+inf', '-inf', and 'nan'. An entry of floating-point numbers may begin or end with a decimal point.
During formatting, accuracy may be lost on floating-point numbers.
During parsing, it is not strictly required to read the nearest machine-representable number.

Dates are written in YYYY-MM-DD format and parsed in the same format, but with any characters as separators.
Dates with times are written in the format YYYY-MM-DD hh:mm:ss and parsed in the same format, but with any characters as separators.
This all occurs in the system time zone at the time the client or server starts (depending on which one formats data). For dates with times, daylight saving time is not specified. So if a dump has times during daylight saving time, the dump does not unequivocally match the data, and parsing will select one of the two times.
During a read operation, incorrect dates and dates with times can be parsed with natural overflow or as null dates and times, without an error message.

As an exception, parsing dates with times is also supported in Unix timestamp format, if it consists of exactly 10 decimal digits. The result is not time zone-dependent. The formats YYYY-MM-DD hh:mm:ss and NNNNNNNNNN are differentiated automatically.

Strings are output with backslash-escaped special characters. The following escape sequences are used for output: `\b`, `\f`, `\r`, `\n`, `\t`, `\0`, `\'`, `\\`. Parsing also supports the sequences `\a`, `\v`, and `\xHH` (hex escape sequences) and any `\c` sequences, where `c` is any character (these sequences are converted to `c`). Thus, reading data supports formats where a line feed can be written as `\n` or `\`, or as a line feed. For example, the string `Hello world` with a line feed between the words instead of a space can be parsed in any of the following variations:

```
Hello\nworld

Hello\
world
```

The second variant is supported because MySQL uses it when writing tab-separated dumps.

The minimum set of characters that you need to escape when passing data in TabSeparated format: tab, line feed (LF) and backslash.

Only a small set of symbols are escaped. You can easily stumble onto a string value that your terminal will ruin in output.

Arrays are written as a list of comma-separated values in square brackets. Numeric items in the array are formatted as normal, but dates, dates with times, and strings are written in single quotes with the same escaping rules as above.

[NULL](../query_language/syntax.md#null-literal) is formatted as `\N`.

<a name="tabseparatedraw"></a>

## TabSeparatedRaw

Differs from `TabSeparated` format in that the rows are written without escaping.
This format is only appropriate for outputting a query result, but not for parsing (retrieving data to insert in a table).

This format is also available under the name `TSVRaw`.
<a name="tabseparatedwithnames"></a>

## TabSeparatedWithNames

Differs from the `TabSeparated` format in that the column names are written in the first row.
During parsing, the first row is completely ignored. You can't use column names to determine their position or to check their correctness.
(Support for parsing the header row may be added in the future.)

This format is also available under the name `TSVWithNames`.
<a name="tabseparatedwithnamesandtypes"></a>

## TabSeparatedWithNamesAndTypes

Differs from the `TabSeparated` format in that the column names are written to the first row, while the column types are in the second row.
During parsing, the first and second rows are completely ignored.

This format is also available under the name `TSVWithNamesAndTypes`.
<a name="tskv"></a>

## TSKV

Similar to TabSeparated, but outputs a value in name=value format. Names are escaped the same way as in TabSeparated format, and the = symbol is also escaped.

```
SearchPhrase= count()=8267016
SearchPhrase=bathroom interior design count()=2166
SearchPhrase=yandex count()=1655
SearchPhrase=2014 spring fashion count()=1549
SearchPhrase=freeform photos count()=1480
SearchPhrase=angelina jolie count()=1245
SearchPhrase=omsk count()=1112
SearchPhrase=photos of dog breeds count()=1091
SearchPhrase=curtain designs count()=1064
SearchPhrase=baku count()=1000
```

[NULL](../query_language/syntax.md#null-literal) is formatted as `\N`.

``` sql
SELECT * FROM t_null FORMAT TSKV
```

```
x=1 y=\N
```

When there is a large number of small columns, this format is inefficient, and there is generally no reason to use it. It is used in some departments of Yandex.

Both data output and parsing are supported in this format. For parsing, any order is supported for the values of different columns. It is acceptable for some values to be omitted – they are treated as equal to their default values. In this case, zeros and blank rows are used as default values. Complex values that could be specified in the table are not supported as defaults.

Parsing allows the presence of the additional field `tskv` without the equal sign or a value. This field is ignored.

Deserialization is efficient and usually doesn't increase the system load.
<a name="csv"></a>

## CSV

@@ -86,7 +186,7 @@ Also prints the header row, similar to `TabSeparatedWithNames`.

Outputs data in JSON format. Besides data tables, it also outputs column names and types, along with some additional information: the total number of output rows, and the number of rows that could have been output if there weren't a LIMIT. Example:

```sql
``` sql
SELECT SearchPhrase, count() AS c FROM test.hits GROUP BY SearchPhrase WITH TOTALS ORDER BY c DESC LIMIT 5 FORMAT JSON
```

@@ -263,7 +363,7 @@ Each result block is output as a separate table. This is necessary so that block

[NULL](../query_language/syntax.md#null-literal) is output as `ᴺᵁᴸᴸ`.

```sql
``` sql
SELECT * FROM t_null
```

@@ -278,11 +378,11 @@ This format is only appropriate for outputting a query result, but not for parsi

The Pretty format supports outputting total values (when using WITH TOTALS) and extremes (when 'extremes' is set to 1). In these cases, total values and extreme values are output after the main data, in separate tables. Example (shown for the PrettyCompact format):

```sql
``` sql
SELECT EventDate, count() AS c FROM test.hits GROUP BY EventDate WITH TOTALS ORDER BY EventDate FORMAT PrettyCompact
```

```text
```
┌──EventDate─┬───────c─┐
│ 2014-03-17 │ 1406958 │
│ 2014-03-18 │ 1383658 │

@@ -359,131 +459,6 @@ Array is represented as a varint length (unsigned [LEB128](https://en.wikipedia.

For [NULL](../query_language/syntax.md#null-literal) support, an additional byte containing 1 or 0 is added before each [Nullable](../data_types/nullable.md#data_type-nullable) value. If 1, then the value is `NULL` and this byte is interpreted as a separate value. If 0, the value after the byte is not `NULL`.

<a name="tabseparated"></a>

## TabSeparated

In TabSeparated format, data is written by row. Each row contains values separated by tabs. Each value is followed by a tab, except the last value in the row, which is followed by a line feed. Strictly Unix line feeds are assumed everywhere. The last row also must contain a line feed at the end. Values are written in text format, without enclosing quotation marks, and with special characters escaped.

This format is also available under the name `TSV`.

The `TabSeparated` format is convenient for processing data using custom programs and scripts. It is used by default in the HTTP interface, and in the command-line client's batch mode. This format also allows transferring data between different DBMSs. For example, you can get a dump from MySQL and upload it to ClickHouse, or vice versa.

The `TabSeparated` format supports outputting total values (when using WITH TOTALS) and extreme values (when 'extremes' is set to 1). In these cases, the total values and extremes are output after the main data. The main result, total values, and extremes are separated from each other by an empty line. Example:

```sql
SELECT EventDate, count() AS c FROM test.hits GROUP BY EventDate WITH TOTALS ORDER BY EventDate FORMAT TabSeparated
```

```text
2014-03-17 1406958
2014-03-18 1383658
2014-03-19 1405797
2014-03-20 1353623
2014-03-21 1245779
2014-03-22 1031592
2014-03-23 1046491

0000-00-00 8873898

2014-03-17 1031592
2014-03-23 1406958
```

## Data formatting

Integer numbers are written in decimal form. Numbers can contain an extra "+" character at the beginning (ignored when parsing, and not recorded when formatting). Non-negative numbers can't contain the negative sign. When reading, it is allowed to parse an empty string as a zero, or (for signed types) a string consisting of just a minus sign as a zero. Numbers that do not fit into the corresponding data type may be parsed as a different number, without an error message.

Floating-point numbers are written in decimal form. The dot is used as the decimal separator. Exponential entries are supported, as are 'inf', '+inf', '-inf', and 'nan'. An entry of floating-point numbers may begin or end with a decimal point.
During formatting, accuracy may be lost on floating-point numbers.
During parsing, it is not strictly required to read the nearest machine-representable number.

Dates are written in YYYY-MM-DD format and parsed in the same format, but with any characters as separators.
Dates with times are written in the format YYYY-MM-DD hh:mm:ss and parsed in the same format, but with any characters as separators.
This all occurs in the system time zone at the time the client or server starts (depending on which one formats data). For dates with times, daylight saving time is not specified. So if a dump has times during daylight saving time, the dump does not unequivocally match the data, and parsing will select one of the two times.
During a read operation, incorrect dates and dates with times can be parsed with natural overflow or as null dates and times, without an error message.

As an exception, parsing dates with times is also supported in Unix timestamp format, if it consists of exactly 10 decimal digits. The result is not time zone-dependent. The formats YYYY-MM-DD hh:mm:ss and NNNNNNNNNN are differentiated automatically.

Strings are output with backslash-escaped special characters. The following escape sequences are used for output: `\b`, `\f`, `\r`, `\n`, `\t`, `\0`, `\'`, `\\`. Parsing also supports the sequences `\a`, `\v`, and `\xHH` (hex escape sequences) and any `\c` sequences, where `c` is any character (these sequences are converted to `c`). Thus, reading data supports formats where a line feed can be written as `\n` or `\`, or as a line feed. For example, the string `Hello world` with a line feed between the words instead of a space can be parsed in any of the following variations:

```text
Hello\nworld

Hello\
world
```

The second variant is supported because MySQL uses it when writing tab-separated dumps.

The minimum set of characters that you need to escape when passing data in TabSeparated format: tab, line feed (LF) and backslash.

Only a small set of symbols are escaped. You can easily stumble onto a string value that your terminal will ruin in output.

Arrays are written as a list of comma-separated values in square brackets. Numeric items in the array are formatted as normal, but dates, dates with times, and strings are written in single quotes with the same escaping rules as above.

[NULL](../query_language/syntax.md#null-literal) is formatted as `\N`.

<a name="tabseparatedraw"></a>

## TabSeparatedRaw

Differs from `TabSeparated` format in that the rows are written without escaping.
This format is only appropriate for outputting a query result, but not for parsing (retrieving data to insert in a table).

This format is also available under the name `TSVRaw`.
<a name="tabseparatedwithnames"></a>

## TabSeparatedWithNames

Differs from the `TabSeparated` format in that the column names are written in the first row.
During parsing, the first row is completely ignored. You can't use column names to determine their position or to check their correctness.
(Support for parsing the header row may be added in the future.)

This format is also available under the name `TSVWithNames`.
<a name="tabseparatedwithnamesandtypes"></a>

## TabSeparatedWithNamesAndTypes

Differs from the `TabSeparated` format in that the column names are written to the first row, while the column types are in the second row.
During parsing, the first and second rows are completely ignored.

This format is also available under the name `TSVWithNamesAndTypes`.
<a name="tskv"></a>

## TSKV

Similar to TabSeparated, but outputs a value in name=value format. Names are escaped the same way as in TabSeparated format, and the = symbol is also escaped.

```text
SearchPhrase= count()=8267016
SearchPhrase=bathroom interior design count()=2166
SearchPhrase=yandex count()=1655
SearchPhrase=2014 spring fashion count()=1549
SearchPhrase=freeform photos count()=1480
SearchPhrase=angelina jolie count()=1245
SearchPhrase=omsk count()=1112
SearchPhrase=photos of dog breeds count()=1091
SearchPhrase=curtain designs count()=1064
SearchPhrase=baku count()=1000
```

[NULL](../query_language/syntax.md#null-literal) is formatted as `\N`.

```sql
SELECT * FROM t_null FORMAT TSKV
```

```
x=1 y=\N
```

When there is a large number of small columns, this format is inefficient, and there is generally no reason to use it. It is used in some departments of Yandex.

Both data output and parsing are supported in this format. For parsing, any order is supported for the values of different columns. It is acceptable for some values to be omitted – they are treated as equal to their default values. In this case, zeros and blank rows are used as default values. Complex values that could be specified in the table are not supported as defaults.

Parsing allows the presence of the additional field `tskv` without the equal sign or a value. This field is ignored.

## Values

Prints every row in brackets. Rows are separated by commas. There is no comma after the last row. The values inside the brackets are also comma-separated. Numbers are output in decimal format without quotes. Arrays are output in square brackets. Strings, dates, and dates with times are output in quotes. Escaping rules and parsing are similar to the [TabSeparated](#tabseparated) format. During formatting, extra spaces aren't inserted, but during parsing, they are allowed and skipped (except for spaces inside array values, which are not allowed). [NULL](../query_language/syntax.md#null-literal) is represented as `NULL`.
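
A one-line sketch of the output (the alias names are illustrative):

``` sql
-- Numbers unquoted, strings and dates quoted: (1,'hello','2014-03-17')
SELECT 1 AS n, 'hello' AS s, toDate('2014-03-17') AS d FORMAT Values
```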

@@ -502,7 +477,7 @@ Prints each value on a separate line with the column name specified. This format

Example:

```sql
``` sql
SELECT * FROM t_null FORMAT Vertical
```

@@ -620,3 +595,31 @@ Just as for JSON, invalid UTF-8 sequences are changed to the replacement charact
In string values, the characters `<` and `&` are escaped as `&lt;` and `&amp;`.

Arrays are output as `<array><elem>Hello</elem><elem>World</elem>...</array>`, and tuples as `<tuple><elem>Hello</elem><elem>World</elem>...</tuple>`.

<a name="format_capnproto"></a>

## CapnProto

Cap'n Proto is a binary message format similar to Protocol Buffers and Thrift, but not like JSON or MessagePack.

Cap'n Proto messages are strictly typed and not self-describing, meaning they need an external schema description. The schema is applied on the fly and cached for each query.

``` sql
SELECT SearchPhrase, count() AS c FROM test.hits
GROUP BY SearchPhrase FORMAT CapnProto SETTINGS schema = 'schema:Message'
```

Where `schema.capnp` looks like this:

```
struct Message {
SearchPhrase @0 :Text;
c @1 :UInt64;
}
```

The schema file must be located in the directory specified in [format_schema_path](../operations/server_settings/settings.md#server_settings-format_schema_path) in the server configuration.

Deserialization is efficient and usually doesn't increase the system load.

[Original article](https://clickhouse.yandex/docs/en/interfaces/formats/) <!--hide-->

@@ -218,3 +218,5 @@ curl -sS 'http://localhost:8123/?max_result_bytes=4000000&buffer_size=3000000&wa

Use buffering to avoid situations where a query processing error occurred after the response code and HTTP headers were sent to the client. In this situation, an error message is written at the end of the response body, and on the client side, the error can only be detected at the parsing stage.

[Original article](https://clickhouse.yandex/docs/en/interfaces/http_interface/) <!--hide-->

@@ -4,3 +4,5 @@

To explore the system's capabilities, download data to tables, or make manual queries, use the clickhouse-client program.

[Original article](https://clickhouse.yandex/docs/en/interfaces/) <!--hide-->

@@ -3,3 +3,5 @@
- [Official driver](https://github.com/yandex/clickhouse-jdbc).
- Third-party driver from [ClickHouse-Native-JDBC](https://github.com/housepower/ClickHouse-Native-JDBC).

[Original article](https://clickhouse.yandex/docs/en/interfaces/jdbc/) <!--hide-->

@@ -2,3 +2,5 @@

The native interface is used in the "clickhouse-client" command-line client, for interaction between servers during distributed query processing, and also in C++ programs. We will only cover the command-line client.

[Original article](https://clickhouse.yandex/docs/en/interfaces/tcp/) <!--hide-->

@@ -48,3 +48,5 @@ We have not tested the libraries listed below.
- Nim
    - [nim-clickhouse](https://github.com/leonardoce/nim-clickhouse)

[Original article](https://clickhouse.yandex/docs/en/interfaces/third-party_client_libraries/) <!--hide-->

@@ -45,3 +45,5 @@ Key features:
- Query development with syntax highlighting.
- Table preview.
- Autocompletion.

[Original article](https://clickhouse.yandex/docs/en/interfaces/third-party_gui/) <!--hide-->

@@ -60,3 +60,5 @@ ClickHouse provides various ways to trade accuracy for performance:
Uses asynchronous multimaster replication. After being written to any available replica, data is distributed to all the remaining replicas in the background. The system maintains identical data on different replicas. Recovery after most failures is performed automatically, and in complex cases — semi-automatically.

For more information, see the section [Data replication](../operations/table_engines/replication.md#table_engines-replication).

[Original article](https://clickhouse.yandex/docs/en/introduction/distinctive_features/) <!--hide-->

@@ -3,3 +3,5 @@
1. No full-fledged transactions.
2. Lack of ability to modify or delete already inserted data with a high rate and low latency. There are batch deletes and updates available to clean up or modify data, for example to comply with [GDPR](https://gdpr-info.eu) (see the sketch after this list).
3. The sparse index makes ClickHouse not really suitable for point queries retrieving single rows by their keys.
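
A minimal sketch of such a batch delete (the table name and predicate are hypothetical; the mutation runs asynchronously in the background):

``` sql
-- Batch delete via a mutation; suited to occasional clean-up, not frequent point deletes
ALTER TABLE visits DELETE WHERE UserID = 12345
```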

[Original article](https://clickhouse.yandex/docs/en/introduction/features_considered_disadvantages/) <!--hide-->

@@ -21,3 +21,5 @@ Under the same conditions, ClickHouse can handle several hundred queries per sec
## Performance When Inserting Data

We recommend inserting data in batches of at least 1000 rows, or no more than a single request per second. When inserting to a MergeTree table from a tab-separated dump, the insertion speed will be from 50 to 200 MB/s. If the inserted rows are around 1 KB in size, the speed will be from 50,000 to 200,000 rows per second. If the rows are small, the performance will be higher in rows per second (on Banner System data, > 500,000 rows per second; on Graphite data, > 1,000,000 rows per second). To improve performance, you can make multiple INSERT queries in parallel, and performance will increase linearly.
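
As an illustration (a sketch with a hypothetical table), one multi-row INSERT is far cheaper than the same rows sent as separate single-row requests:

``` sql
-- One batched INSERT instead of three single-row requests
INSERT INTO events (EventDate, UserID) VALUES
('2014-03-17', 1), ('2014-03-17', 2), ('2014-03-17', 3)
```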

[Original article](https://clickhouse.yandex/docs/en/introduction/performance/) <!--hide-->

@@ -46,3 +46,5 @@ OLAPServer worked well for non-aggregated data, but it had many restrictions tha

To remove the limitations of OLAPServer and solve the problem of working with non-aggregated data for all reports, we developed the ClickHouse DBMS.

[Original article](https://clickhouse.yandex/docs/en/introduction/ya_metrika_task/) <!--hide-->

@@ -99,3 +99,5 @@ The user can get a list of all databases and tables in them by using `SHOW` quer

Database access is not related to the [readonly](settings/query_complexity.md#query_complexity_readonly) setting. You can't grant full access to one database and `readonly` access to another one.

[Original article](https://clickhouse.yandex/docs/en/operations/access_rights/) <!--hide-->

@@ -40,3 +40,5 @@ $ cat /etc/clickhouse-server/users.d/alice.xml
For each config file, the server also generates `file-preprocessed.xml` files when starting. These files contain all the completed substitutions and overrides, and they are intended for informational use. If ZooKeeper substitutions were used in the config files but ZooKeeper is not available on the server start, the server loads the configuration from the preprocessed file.

The server tracks changes in config files, as well as files and ZooKeeper nodes that were used when performing substitutions and overrides, and reloads the settings for users and clusters on the fly. This means that you can modify the cluster, users, and their settings without restarting the server.

[Original article](https://clickhouse.yandex/docs/en/operations/configuration_files/) <!--hide-->

@@ -1,2 +1,4 @@
# Operations

[Original article](https://clickhouse.yandex/docs/en/operations/) <!--hide-->

@@ -104,3 +104,5 @@ For distributed query processing, the accumulated amounts are stored on the requ

When the server is restarted, quotas are reset.

[Original article](https://clickhouse.yandex/docs/en/operations/quotas/) <!--hide-->

@@ -10,3 +10,5 @@ Other settings are described in the "[Settings](../settings/index.md#settings)"

Before studying the settings, read the [Configuration files](../configuration_files.md#configuration_files) section and note the use of substitutions (the `incl` and `optional` attributes).

[Original article](https://clickhouse.yandex/docs/en/operations/server_settings/) <!--hide-->

@@ -717,3 +717,5 @@ For more information, see the section "[Replication](../../operations/table_engi
<zookeeper incl="zookeeper-servers" optional="true" />
```

[Original article](https://clickhouse.yandex/docs/en/operations/server_settings/settings/) <!--hide-->

@@ -22,3 +22,5 @@ Similarly, you can use ClickHouse sessions in the HTTP protocol. To do this, you

Settings that can only be set in the server config file are not covered in this section.

[Original article](https://clickhouse.yandex/docs/en/operations/settings/) <!--hide-->

@@ -193,3 +193,5 @@ Maximum number of bytes (uncompressed data) that can be passed to a remote serve
## transfer_overflow_mode

What to do when the amount of data exceeds one of the limits: 'throw' or 'break'. By default, throw.

[Original article](https://clickhouse.yandex/docs/en/operations/settings/query_complexity/) <!--hide-->

@@ -417,3 +417,5 @@ See also the following parameters:
- [insert_quorum](#setting-insert_quorum)
- [insert_quorum_timeout](#setting-insert_quorum_timeout)

[Original article](https://clickhouse.yandex/docs/en/operations/settings/settings/) <!--hide-->

@@ -9,7 +9,7 @@ Example:

Install the `web` profile.

```sql
``` sql
SET profile = 'web'
```

@@ -63,3 +63,5 @@ The example specifies two profiles: `default` and `web`. The `default` profile

Settings profiles can inherit from each other. To use inheritance, indicate the `profile` setting before the other settings that are listed in the profile.

[Original article](https://clickhouse.yandex/docs/en/operations/settings/settings_profiles/) <!--hide-->
@ -18,7 +18,7 @@ Example: The number of SELECT queries currently running; the amount of memory in
|
||||
Contains information about clusters available in the config file and the servers in them.
|
||||
Columns:
|
||||
|
||||
```text
|
||||
```
|
||||
cluster String — The cluster name.
|
||||
shard_num UInt32 — The shard number in the cluster, starting from 1.
|
||||
shard_weight UInt32 — The relative weight of the shard when writing data.
|
||||
@ -34,7 +34,7 @@ user String — The name of the user for connecting to the server.
|
||||
Contains information about the columns in all tables.
|
||||
You can use this table to get information similar to `DESCRIBE TABLE`, but for multiple tables at once.
|
||||
|
||||
```text
|
||||
```
|
||||
database String — The name of the database the table is in.
|
||||
table String – Table name.
|
||||
name String — Column name.
|
||||
@ -183,7 +183,7 @@ Formats:
|
||||
This system table is used for implementing the `SHOW PROCESSLIST` query.
|
||||
Columns:
|
||||
|
||||
```text
|
||||
```
|
||||
user String – Name of the user who made the request. For distributed query processing, this is the user who helped the requestor server send the query to this server, not the user who made the distributed request on the requestor server.
|
||||
|
||||
address String - The IP address the request was made from. The same for distributed processing.
|
||||
@ -210,14 +210,14 @@ This table can be used for monitoring. The table contains a row for every Replic
|
||||
|
||||
Example:
|
||||
|
||||
```sql
|
||||
``` sql
|
||||
SELECT *
|
||||
FROM system.replicas
|
||||
WHERE table = 'visits'
|
||||
FORMAT Vertical
|
||||
```
|
||||
|
||||
```text
|
||||
```
|
||||
Row 1:
|
||||
──────
|
||||
database: merge
|
||||
@ -243,7 +243,7 @@ active_replicas: 2
|
||||
|
||||
Columns:
|
||||
|
||||
```text
|
||||
```
|
||||
database: Database name
|
||||
table: Table name
|
||||
engine: Table engine name
|
||||
@ -296,7 +296,7 @@ If you don't request the last 4 columns (log_max_index, log_pointer, total_repli

For example, you can check that everything is working correctly like this:

```sql
``` sql
SELECT
    database,
    table,
@ -335,7 +335,7 @@ I.e. used for executing the query you are using to read from the system.settings

Columns:

```text
```
name String — Setting name.
value String — Setting value.
changed UInt8 — Whether the setting was explicitly defined in the config or explicitly changed.
@ -343,13 +343,13 @@ changed UInt8 — Whether the setting was explicitly defined in the config or ex

Example:

```sql
``` sql
SELECT *
FROM system.settings
WHERE changed
```

```text
```
┌─name───────────────────┬─value───────┬─changed─┐
│ max_threads            │ 8           │       1 │
│ use_uncompressed_cache │ 0           │       1 │
@ -393,14 +393,14 @@ Columns:

Example:

```sql
``` sql
SELECT *
FROM system.zookeeper
WHERE path = '/clickhouse/tables/01-08/visits/replicas'
FORMAT Vertical
```

```text
```
Row 1:
──────
name: example01-08-1.yandex.ru
@ -435,3 +435,5 @@ numChildren: 7
pzxid: 987021252247
path: /clickhouse/tables/01-08/visits/replicas
```

[Original article](https://clickhouse.yandex/docs/en/operations/system_tables/) <!--hide-->
|
@ -8,7 +8,7 @@ There is an `AggregateFunction` data type. It is a parametric data type. As para

Examples:

```sql
``` sql
CREATE TABLE t
(
    column1 AggregateFunction(uniq, UInt64),
@ -33,7 +33,7 @@ Example: `uniqMerge(UserIDState)`, where `UserIDState` has the `AggregateFunctio
In other words, an aggregate function with the 'Merge' suffix takes a set of states, combines them, and returns the result.
As an example, these two queries return the same result:

```sql
``` sql
SELECT uniq(UserID) FROM table

SELECT uniqMerge(state) FROM (SELECT uniqState(UserID) AS state FROM table GROUP BY RegionID)
@ -51,7 +51,7 @@ Example:

Create an `AggregatingMergeTree` materialized view that watches the `test.visits` table:

```sql
``` sql
CREATE MATERIALIZED VIEW test.basic
ENGINE = AggregatingMergeTree(StartDate, (CounterID, StartDate), 8192)
AS SELECT
@ -65,13 +65,13 @@ GROUP BY CounterID, StartDate;

Insert data in the `test.visits` table. Data will also be inserted in the view, where it will be aggregated:

```sql
``` sql
INSERT INTO test.visits ...
```

Perform `SELECT` from the view using `GROUP BY` in order to complete data aggregation:

```sql
``` sql
SELECT
    StartDate,
    sumMerge(Visits) AS Visits,
@ -85,3 +85,5 @@ You can create a materialized view like this and assign a normal view to it that

Note that in most cases, using `AggregatingMergeTree` is not justified, since queries can be run efficiently enough on non-aggregated data.


[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/aggregatingmergetree/) <!--hide-->
|
@ -2,7 +2,7 @@

Buffers the data to write in RAM, periodically flushing it to another table. During the read operation, data is read from the buffer and the other table simultaneously.

```text
```
Buffer(database, table, num_layers, min_time, max_time, min_rows, max_rows, min_bytes, max_bytes)
```

@ -16,7 +16,7 @@ The conditions for flushing the data are calculated separately for each of the '

Example:

```sql
``` sql
CREATE TABLE merge.hits_buffer AS merge.hits ENGINE = Buffer(merge, hits, 16, 10, 100, 10000, 1000000, 10000000, 100000000)
```
@ -52,3 +52,5 @@ A Buffer table is used when too many INSERTs are received from a large number of

Note that it doesn't make sense to insert data one row at a time, even for Buffer tables. This will only produce a speed of a few thousand rows per second, while inserting larger blocks of data can produce over a million rows per second (see the section "Performance").


[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/buffer/) <!--hide-->
|
@ -14,7 +14,7 @@ This is the main concept that allows Yandex.Metrica to work in real time.

CollapsingMergeTree accepts an additional parameter: the name of an Int8-type column that contains the row's "sign". Example:

```sql
``` sql
CollapsingMergeTree(EventDate, (CounterID, EventDate, intHash32(UniqID), VisitID), 8192, Sign)
```
@ -36,3 +36,5 @@ There are several ways to get completely "collapsed" data from a `CollapsingMerg
1. Write a query with GROUP BY and aggregate functions that account for the sign (see the sketch after this list). For example, to calculate quantity, write 'sum(Sign)' instead of 'count()'. To calculate the sum of something, write 'sum(Sign * x)' instead of 'sum(x)', and so on, and also add 'HAVING sum(Sign) `>` 0'. Not all amounts can be calculated this way. For example, the aggregate functions 'min' and 'max' can't be rewritten.
2. If you must extract data without aggregation (for example, to check whether rows are present whose newest values match certain conditions), you can use the FINAL modifier for the FROM clause. This approach is significantly less efficient.
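
A minimal sketch of the first approach; the table and column names here are illustrative, not from the original text:

``` sql
SELECT
    CounterID,
    sum(Sign) AS visits,
    sum(Sign * Duration) AS total_duration
FROM visits_collapsing
GROUP BY CounterID
HAVING sum(Sign) > 0
```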

[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/collapsingmergetree/) <!--hide-->
|
@ -12,7 +12,7 @@ ENGINE [=] Name(...) [PARTITION BY expr] [ORDER BY expr] [SAMPLE BY expr] [SETTI

For MergeTree tables, the partition expression is specified after `PARTITION BY`, the primary key after `ORDER BY`, the sampling key after `SAMPLE BY`, and `SETTINGS` can specify `index_granularity` (optional; the default value is 8192), as well as other settings from [MergeTreeSettings.h](https://github.com/yandex/ClickHouse/blob/master/dbms/src/Storages/MergeTree/MergeTreeSettings.h). The other engine parameters are specified in parentheses after the engine name, as previously. Example:

```sql
``` sql
ENGINE = ReplicatedCollapsingMergeTree('/clickhouse/tables/name', 'replica1', Sign)
PARTITION BY (toMonday(StartDate), EventType)
ORDER BY (CounterID, StartDate, intHash32(UserID))
@ -27,7 +27,7 @@ After this table is created, merge will only work for data parts that have the s

To specify a partition in ALTER PARTITION commands, specify the value of the partition expression (or a tuple). Constants and constant expressions are supported. Example:

```sql
``` sql
ALTER TABLE table DROP PARTITION (toMonday(today()), 1)
```
@ -45,3 +45,5 @@ The partition ID is its string identifier (human-readable, if possible) that is

For more examples, see the tests [`00502_custom_partitioning_local`](https://github.com/yandex/ClickHouse/blob/master/dbms/tests/queries/0_stateless/00502_custom_partitioning_local.sql) and [`00502_custom_partitioning_replicated_zookeeper`](https://github.com/yandex/ClickHouse/blob/master/dbms/tests/queries/0_stateless/00502_custom_partitioning_replicated_zookeeper.sql).


[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/custom_partitioning_key/) <!--hide-->
|
@ -39,7 +39,7 @@ As an example, consider a dictionary of `products` with the following configurat

Query the dictionary data:

```sql
``` sql
select name, type, key, attribute.names, attribute.types, bytes_allocated, element_count, source from system.dictionaries where name = 'products';

SELECT
@ -73,7 +73,7 @@ CREATE TABLE %table_name% (%fields%) engine = Dictionary(%dictionary_name%)`

Usage example:

```sql
``` sql
create table products (product_id UInt64, title String) Engine = Dictionary(products);

CREATE TABLE products
@ -92,7 +92,7 @@ Ok.

Take a look at what's in the table.

```sql
``` sql
select * from products limit 1;

SELECT *
@ -108,3 +108,5 @@ LIMIT 1
1 rows in set. Elapsed: 0.006 sec.
```


[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/dictionary/) <!--hide-->
|
@ -7,7 +7,7 @@ Reading is automatically parallelized. During a read, the table indexes on remot
The Distributed engine accepts parameters: the cluster name in the server's config file, the name of a remote database, the name of a remote table, and (optionally) a sharding key.
Example:

```text
```
Distributed(logs, default, hits[, sharding_key])
```

@ -122,3 +122,5 @@ If the server ceased to exist or had a rough restart (for example, after a devic

When the max_parallel_replicas option is enabled, query processing is parallelized across all replicas within a single shard. For more information, see the section "Settings, max_parallel_replicas".
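
A minimal sketch of enabling it for one session; the value is illustrative, and in versions of this era the option is typically only effective for tables that define a sampling key:

``` sql
SET max_parallel_replicas = 2
```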

[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/distributed/) <!--hide-->
|
@ -60,3 +60,5 @@ curl -F 'passwd=@passwd.tsv;' 'http://localhost:8123/?query=SELECT+shell,+count(

For distributed query processing, the temporary tables are sent to all the remote servers.


[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/external_data/) <!--hide-->
|
@ -31,7 +31,7 @@ You may manually create this subfolder and file in server filesystem and then [A

**1.** Set up the `file_engine_table` table:

```sql
``` sql
CREATE TABLE file_engine_table (name String, value UInt32) ENGINE=File(TabSeparated)
```
@ -47,11 +47,11 @@ two 2

**3.** Query the data:

```sql
``` sql
SELECT * FROM file_engine_table
```

```text
```
┌─name─┬─value─┐
│ one  │     1 │
│ two  │     2 │
@ -76,3 +76,5 @@ $ echo -e "1,2\n3,4" | clickhouse-local -q "CREATE TABLE table (a Int64, b Int64
- `SELECT ... SAMPLE`
- Indices
- Replication

[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/file/) <!--hide-->
|
@ -27,7 +27,7 @@ The Graphite data table must contain the following fields at minimum:

Rollup pattern:

```text
```
pattern
    regexp
    function
@ -84,3 +84,5 @@ Example of settings:
</graphite_rollup>
```


[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/graphitemergetree/) <!--hide-->
|
@ -14,3 +14,5 @@ The table engine (type of table) determines:
When reading, the engine is only required to output the requested columns, but in some cases the engine can partially process data when responding to the request.

For most serious tasks, you should use engines from the `MergeTree` family.

[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/) <!--hide-->
|
@ -2,7 +2,7 @@

A prepared data structure for JOIN that is always located in RAM.

```text
```
Join(ANY|ALL, LEFT|INNER, k1[, k2, ...])
```
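
A minimal sketch of creating such a table under the syntax above; the table and column names are illustrative, not from the original text:

``` sql
CREATE TABLE join_state (k UInt64, v String) ENGINE = Join(ANY, LEFT, k)
```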

@ -15,3 +15,5 @@ You can use INSERT to add data to the table, similar to the Set engine. For ANY,

Storing data on the disk is the same as for the Set engine.


[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/join/) <!--hide-->
|
@ -44,7 +44,7 @@ Optional parameters:

Examples:

```sql
``` sql
CREATE TABLE queue (
    timestamp UInt64,
    level String,
@ -86,7 +86,7 @@ When the `MATERIALIZED VIEW` joins the engine, it starts collecting data in the

Example:

```sql
``` sql
CREATE TABLE queue (
    timestamp UInt64,
    level String,
@ -136,3 +136,5 @@ Similar to GraphiteMergeTree, the Kafka engine supports extended configuration u
```

For a list of possible configuration options, see the [librdkafka configuration reference](https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md). Use the underscore (`_`) instead of a dot in the ClickHouse configuration. For example, `check.crcs=true` will be `<check_crcs>true</check_crcs>`.

[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/kafka/) <!--hide-->
|
@ -4,3 +4,5 @@ Log differs from TinyLog in that a small file of "marks" resides with the column
For concurrent data access, the read operations can be performed simultaneously, while write operations block reads and each other.
The Log engine does not support indexes. Similarly, if writing to a table fails, the table is broken, and reading from it returns an error. The Log engine is appropriate for temporary data, write-once tables, and for testing or demonstration purposes.


[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/log/) <!--hide-->
|
@ -2,3 +2,5 @@

Used for implementing materialized views (for more information, see [CREATE TABLE](../../query_language/create.md#query_language-queries-create_table)). For storing data, it uses a different engine that was specified when creating the view. When reading from a table, it just uses this engine.


[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/materializedview/) <!--hide-->
|
@ -9,3 +9,5 @@ Normally, using this table engine is not justified. However, it can be used for

The Memory engine is used by the system for temporary tables with external query data (see the section "External data for processing a query"), and for implementing GLOBAL IN (see the section "IN operators").


[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/memory/) <!--hide-->
|
@ -65,3 +65,5 @@ The `Merge` type table contains a virtual `_table` column of the `String` type.

If the `WHERE/PREWHERE` clause contains conditions for the `_table` column that do not depend on other table columns (as one of the conjunction elements, or as an entire expression), these conditions are used as an index. The conditions are performed on a data set of table names to read data from, and the read operation will be performed from only those tables that the condition was triggered on.
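
A minimal sketch of such a filter; `watch_log` stands in for a `Merge` table and is not from the original text:

``` sql
SELECT _table, count()
FROM watch_log
WHERE _table = 'wlog_2018_01'
GROUP BY _table
```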

[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/merge/) <!--hide-->
|
@ -156,7 +156,7 @@ ENGINE MergeTree() PARTITION BY toYYYYMM(EventDate) ORDER BY (CounterID, EventDa

In this case, in queries:

```sql
``` sql
SELECT count() FROM table WHERE EventDate = toDate(now()) AND CounterID = 34
SELECT count() FROM table WHERE EventDate = toDate(now()) AND (CounterID = 34 OR CounterID = 42)
SELECT count() FROM table WHERE ((EventDate >= toDate('2014-01-01') AND EventDate <= toDate('2014-01-31')) OR EventDate = toDate('2014-05-01')) AND CounterID IN (101500, 731962, 160656) AND (CounterID = 101500 OR EventDate != toDate('2014-05-01'))
@ -168,7 +168,7 @@ The queries above show that the index is used even for complex expressions. Read

In the example below, the index can't be used.

```sql
``` sql
SELECT count() FROM table WHERE CounterID = 34 OR URL LIKE '%upyachka%'
```
@ -182,3 +182,5 @@ For concurrent table access, we use multi-versioning. In other words, when a tab

Reading from a table is automatically parallelized.


[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/mergetree/) <!--hide-->
|
@ -26,3 +26,5 @@ The rest of the conditions and the `LIMIT` sampling constraint are executed in C

The `MySQL` engine does not support the [Nullable](../../data_types/nullable.md#data_type-nullable) data type, so when reading data from MySQL tables, `NULL` is converted to default values for the specified column type (usually 0 or an empty string).


[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/mysql/) <!--hide-->
|
@ -4,3 +4,5 @@ When writing to a Null table, data is ignored. When reading from a Null table, t

However, you can create a materialized view on a Null table, so the data written to the table will end up in the view.
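
A minimal sketch of this pattern; the names `null_sink` and `mv_from_null` are illustrative, not from the original text:

``` sql
CREATE TABLE null_sink (x UInt64) ENGINE = Null;

CREATE MATERIALIZED VIEW mv_from_null ENGINE = Memory AS
    SELECT x FROM null_sink;

INSERT INTO null_sink VALUES (1), (2);

SELECT count() FROM mv_from_null
```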

[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/null/) <!--hide-->
|
@ -6,7 +6,7 @@ The last optional parameter for the table engine is the version column. When mer

The version column must have a type from the `UInt` family, `Date`, or `DateTime`.

```sql
``` sql
ReplacingMergeTree(EventDate, (OrderID, EventDate, BannerID, ...), 8192, ver)
```
@ -16,3 +16,5 @@ Thus, `ReplacingMergeTree` is suitable for clearing out duplicate data in the b

*This engine is not used in Yandex.Metrica, but it has been applied in other Yandex projects.*


[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/replacingmergetree/) <!--hide-->
|
@ -78,7 +78,7 @@ Two parameters are also added in the beginning of the parameters list – the pa

Example:

```text
```
ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/hits', '{replica}', EventDate, intHash32(UserID), (CounterID, EventDate, intHash32(UserID), EventTime), 8192)
```
@ -180,3 +180,5 @@ After this, you can launch the server, create a `MergeTree` table, move the data
## Recovery When Metadata in the ZooKeeper Cluster Is Lost or Damaged

If the data in ZooKeeper was lost or damaged, you can save data by moving it to an unreplicated table as described above.

[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/replication/) <!--hide-->
|
@ -9,3 +9,5 @@ Data is always located in RAM. For INSERT, the blocks of inserted data are also

After an abrupt server restart, the block of data on the disk might be lost or damaged. In the latter case, you may need to manually delete the file with damaged data.


[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/set/) <!--hide-->
|
@ -4,13 +4,13 @@

This engine differs from `MergeTree` in that it totals data while merging.

```sql
``` sql
SummingMergeTree(EventDate, (OrderID, EventDate, BannerID, ...), 8192)
```

The columns to total are implicit. When merging, all rows with the same primary key value (in the example, OrderID, EventDate, BannerID, ...) have their values totaled in numeric columns that are not part of the primary key.

```sql
``` sql
SummingMergeTree(EventDate, (OrderID, EventDate, BannerID, ...), 8192, (Shows, Clicks, Cost, ...))
```
@ -32,7 +32,7 @@ Then this nested table is interpreted as a mapping of key `=>` (values...), and

Examples:

```text
```
[(1, 100)] + [(2, 150)] -> [(1, 100), (2, 150)]
[(1, 100)] + [(1, 150)] -> [(1, 250)]
[(1, 100)] + [(1, 150), (2, 150)] -> [(1, 250), (2, 150)]
@ -45,3 +45,5 @@ For nested data structures, you don't need to specify the columns as a list of c

This table engine is not particularly useful. Remember that when saving just pre-aggregated data, you lose some of the system's advantages.


[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/summingmergetree/) <!--hide-->
|
@ -17,3 +17,5 @@ The situation when you have a large number of small tables guarantees poor produ

In Yandex.Metrica, TinyLog tables are used for intermediate data that is processed in small batches.


[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/tinylog/) <!--hide-->
|
@ -23,7 +23,7 @@ respectively. For processing `POST` requests, the remote server must support

**1.** Create a `url_engine_table` table on the server:

```sql
``` sql
CREATE TABLE url_engine_table (word String, value UInt64)
ENGINE=URL('http://127.0.0.1:12345/', CSV)
```
@ -53,11 +53,11 @@ python3 server.py

**3.** Request data:

```sql
``` sql
SELECT * FROM url_engine_table
```

```text
```
┌─word──┬─value─┐
│ Hello │     1 │
│ World │     2 │
@ -71,3 +71,5 @@ SELECT * FROM url_engine_table
- `ALTER` and `SELECT...SAMPLE` operations.
- Indexes.
- Replication.

[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/url/) <!--hide-->
|
@ -2,3 +2,5 @@

Used for implementing views (for more information, see the `CREATE VIEW` query). It does not store data, but only stores the specified `SELECT` query. When reading from a table, it runs this query (and deletes all unnecessary columns from the query).


[Original article](https://clickhouse.yandex/docs/en/operations/table_engines/view/) <!--hide-->
|
@ -178,7 +178,7 @@ dynamicConfigFile=/etc/zookeeper-{{ cluster['name'] }}/conf/zoo.cfg.dynamic

Java version:

```text
```
Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
```
@ -226,7 +226,7 @@ JAVA_OPTS="-Xms{{ cluster.get('xms','128M') }} \

Salt init:

```text
```
description "zookeeper-{{ cluster['name'] }} centralized coordination service"

start on runlevel [2345]
@ -255,3 +255,5 @@ script
end script
```


[Original article](https://clickhouse.yandex/docs/en/operations/tips/) <!--hide-->
|
@ -159,3 +159,5 @@ Parameters:

`clickhouse-copier` tracks the changes in `/task/path/description` and applies them on the fly. For instance, if you change the value of `max_workers`, the number of processes running tasks will also change.


[Original article](https://clickhouse.yandex/docs/en/operations/utils/clickhouse-copier/) <!--hide-->
|
@ -71,3 +71,5 @@ Read 186 rows, 4.15 KiB in 0.035 sec., 5302 rows/sec., 118.34 KiB/sec.
├──────────┼──────────┤
...
```

[Original article](https://clickhouse.yandex/docs/en/operations/utils/clickhouse-local/) <!--hide-->
|
@ -3,3 +3,5 @@
* [clickhouse-local](clickhouse-local.md#utils-clickhouse-local) — Allows running SQL queries on data without stopping the ClickHouse server, similar to how `awk` does this.
* [clickhouse-copier](clickhouse-copier.md#utils-clickhouse-copier) — Copies (and reshards) data from one cluster to another cluster.


[Original article](https://clickhouse.yandex/docs/en/operations/utils/) <!--hide-->
|
@ -38,3 +38,5 @@ Merges the intermediate aggregation states in the same way as the -Merge combina

Converts an aggregate function for tables into an aggregate function for arrays that aggregates the corresponding array items and returns an array of results. For example, `sumForEach` for the arrays `[1, 2]`, `[3, 4, 5]` and `[6, 7]` returns the result `[10, 13, 5]` after adding together the corresponding array items.
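
A minimal sketch reproducing that example, using `arrayJoin` as an illustrative data source (the subquery is not from the original text):

``` sql
SELECT sumForEach(arr)
FROM
(
    SELECT arrayJoin([[1, 2], [3, 4, 5], [6, 7]]) AS arr
)
```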

[Original article](https://clickhouse.yandex/docs/en/query_language/agg_functions/combinators/) <!--hide-->
|
@ -61,3 +61,5 @@ FROM t_null_big

`groupArray` does not include `NULL` in the resulting array.
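
A minimal sketch of that behavior (the subquery is illustrative, not from the original text); given the documented `NULL` skipping, it should return `[1,2]`:

``` sql
SELECT groupArray(x)
FROM
(
    SELECT arrayJoin([1, NULL, 2]) AS x
)
```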

[Original article](https://clickhouse.yandex/docs/en/query_language/agg_functions/) <!--hide-->
|
@ -23,7 +23,7 @@ Example: `sequenceMatch ('(?1).*(?2)')(EventTime, URL LIKE '%company%', URL LIKE

This is a degenerate example. You could write it using other aggregate functions:

```text
```
minIf(EventTime, URL LIKE '%company%') < maxIf(EventTime, URL LIKE '%cart%').
```
@ -151,7 +151,9 @@ It works as fast as possible, except for cases when a large N value is used and

Usage example:

```text
```
Problem: Generate a report that shows only keywords that produced at least 5 unique users.
Solution: Write in the GROUP BY query SearchPhrase HAVING uniqUpTo(4)(UserID) >= 5
```
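
A minimal sketch of that solution as a full query; the `hits` table name is illustrative, not from the original text:

``` sql
SELECT SearchPhrase
FROM hits
GROUP BY SearchPhrase
HAVING uniqUpTo(4)(UserID) >= 5
```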

[Original article](https://clickhouse.yandex/docs/en/query_language/agg_functions/parametric_functions/) <!--hide-->
|
@ -35,7 +35,7 @@ anyHeavy(column)

Take the [OnTime](../../getting_started/example_datasets/ontime.md#example_datasets-ontime) data set and select any frequently occurring value in the `AirlineID` column.

```sql
``` sql
SELECT anyHeavy(AirlineID) AS res
FROM ontime
```
@ -101,7 +101,7 @@ Returns a tuple of two arrays: keys in sorted order, and values summed for

Example:

```sql
``` sql
CREATE TABLE sum_map(
    date Date,
    timeslot DateTime,
@ -122,7 +122,7 @@ FROM sum_map
GROUP BY timeslot
```

```text
```
┌────────────timeslot─┬─sumMap(statusMap.status, statusMap.requests)─┐
│ 2000-01-01 00:00:00 │ ([1,2,3,4,5],[10,10,20,10,10])               │
│ 2000-01-01 00:01:00 │ ([4,5,6,7,8],[10,10,20,10,10])               │
@ -325,7 +325,7 @@ We recommend using the `N < 10 ` value; performance is reduced with large `N` va

Take the [OnTime](../../getting_started/example_datasets/ontime.md#example_datasets-ontime) data set and select the three most frequently occurring values in the `AirlineID` column.

```sql
``` sql
SELECT topK(3)(AirlineID) AS res
FROM ontime
```
@ -350,3 +350,5 @@ Calculates the value of `Σ((x - x̅)(y - y̅)) / n`.

Calculates the Pearson correlation coefficient: `Σ((x - x̅)(y - y̅)) / sqrt(Σ((x - x̅)^2) * Σ((y - y̅)^2))`.


[Original article](https://clickhouse.yandex/docs/en/query_language/agg_functions/reference/) <!--hide-->
|
@ -8,7 +8,7 @@ The `ALTER` query is only supported for `*MergeTree` tables, as well as `Merge`a

Changing the table structure.

```sql
``` sql
ALTER TABLE [db].name [ON CLUSTER cluster] ADD|DROP|MODIFY COLUMN ...
```
@ -17,7 +17,7 @@ Each action is an operation on a column.

The following actions are supported:

```sql
``` sql
ADD COLUMN name [type] [default_expr] [AFTER name_after]
```
@ -27,14 +27,14 @@ Adding a column just changes the table structure, without performing any actions

This approach allows us to complete the ALTER query instantly, without increasing the volume of old data.

```sql
``` sql
DROP COLUMN name
```

Deletes the column with the name 'name'.
Deletes data from the file system. Since this deletes entire files, the query is completed almost instantly.

```sql
``` sql
MODIFY COLUMN name [type] [default_expr]
```
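
A minimal sketch of these actions on a concrete table; the table and column names are illustrative, not from the original text:

``` sql
ALTER TABLE visits ADD COLUMN duration UInt8 AFTER user_id;
ALTER TABLE visits MODIFY COLUMN duration UInt16;
ALTER TABLE visits DROP COLUMN duration;
```
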
@ -86,7 +86,7 @@ A "part" in the table is part of the data from a single partition, sorted by the

You can use the `system.parts` table to view the set of table parts and partitions:

```sql
``` sql
SELECT * FROM system.parts WHERE active
```
@ -123,7 +123,7 @@ For replicated tables, the set of parts can't be changed in any case.

The `detached` directory contains parts that are not used by the server - detached from the table using the `ALTER ... DETACH` query. Parts that are damaged are also moved to this directory, instead of deleting them. You can add, delete, or modify the data in the 'detached' directory at any time – the server won't know about this until you make the `ALTER TABLE ... ATTACH` query.

```sql
``` sql
ALTER TABLE [db.]table DETACH PARTITION 'name'
```
@ -134,13 +134,13 @@ After the query is executed, you can do whatever you want with the data in the '

The query is replicated – data will be moved to the 'detached' directory and forgotten on all replicas. The query can only be sent to a leader replica. To find out if a replica is a leader, perform a SELECT from the 'system.replicas' system table. Alternatively, it is easier to make a query on all replicas, and all except one will throw an exception.

```sql
``` sql
ALTER TABLE [db.]table DROP PARTITION 'name'
```

The same as the `DETACH` operation. Deletes data from the table. Data parts will be tagged as inactive and will be completely deleted in approximately 10 minutes. The query is replicated – data will be deleted on all replicas.

```sql
``` sql
ALTER TABLE [db.]table ATTACH PARTITION|PART 'name'
```
@ -152,7 +152,7 @@ The query is replicated. Each replica checks whether there is data in the 'detac

So you can put data in the 'detached' directory on one replica, and use the ALTER ... ATTACH query to add it to the table on all replicas.

```sql
``` sql
ALTER TABLE [db.]table FREEZE PARTITION 'name'
```
@ -196,7 +196,7 @@ For protection from device failures, you must use replication. For more informat
Backups protect against human error (accidentally deleting data, deleting the wrong data or in the wrong cluster, or corrupting data).
For high-volume databases, it can be difficult to copy backups to remote servers. In such cases, to protect from human error, you can keep a backup on the same server (it will reside in `/var/lib/clickhouse/shadow/`).

```sql
``` sql
ALTER TABLE [db.]table FETCH PARTITION 'name' FROM 'path-in-zookeeper'
```
@ -232,13 +232,13 @@ Existing tables are ready for mutations as-is (no conversion necessary), but aft

Currently available commands:

```sql
``` sql
ALTER TABLE [db.]table DELETE WHERE filter_expr
```

The `filter_expr` must be of type UInt8. The query deletes rows in the table for which this expression takes a non-zero value.

```sql
``` sql
ALTER TABLE [db.]table UPDATE column1 = expr1 [, ...] WHERE filter_expr
```
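
A minimal sketch of both commands on a concrete table; the table, column, and filter values are illustrative, not from the original text:

``` sql
ALTER TABLE hits DELETE WHERE EventDate < toDate('2017-01-01');

ALTER TABLE hits UPDATE Referer = '' WHERE Referer LIKE '%spam%'
```
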
@ -272,3 +272,5 @@ The table contains information about mutations of MergeTree tables and their pro

**is_done** - Is the mutation done? Note that even if `parts_to_do = 0` it is possible that a mutation of a replicated table is not done yet because of a long-running INSERT that will create a new data part that will need to be mutated.


[Original article](https://clickhouse.yandex/docs/en/query_language/alter/) <!--hide-->
|
@ -2,7 +2,7 @@

Creates the `db_name` database.

```sql
``` sql
CREATE DATABASE [IF NOT EXISTS] db_name
```
@ -15,7 +15,7 @@ If `IF NOT EXISTS` is included, the query won't return an error if the database

The `CREATE TABLE` query can have several forms.

```sql
``` sql
CREATE [TEMPORARY] TABLE [IF NOT EXISTS] [db.]name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
@ -30,13 +30,13 @@ The structure of the table is a list of column descriptions. If indexes are supp
A column description is `name type` in the simplest case. Example: `RegionID UInt32`.
Expressions can also be defined for default values (see below).

```sql
``` sql
CREATE [TEMPORARY] TABLE [IF NOT EXISTS] [db.]name AS [db2.]name2 [ENGINE = engine]
```

Creates a table with the same structure as another table. You can specify a different engine for the table. If the engine is not specified, the same engine will be used as for the `db2.name2` table.

```sql
``` sql
CREATE [TEMPORARY] TABLE [IF NOT EXISTS] [db.]name ENGINE = engine AS SELECT ...
```
@ -97,7 +97,7 @@ Distributed DDL queries (ON CLUSTER clause)
The `CREATE`, `DROP`, `ALTER`, and `RENAME` queries support distributed execution on a cluster.
For example, the following query creates the `all_hits` `Distributed` table on each host in `cluster`:

```sql
``` sql
CREATE TABLE IF NOT EXISTS all_hits ON CLUSTER cluster (p Date, i Int32) ENGINE = Distributed(cluster, default, hits)
```
@ -107,7 +107,7 @@ The local version of the query will eventually be implemented on each host in th

## CREATE VIEW

```sql
``` sql
CREATE [MATERIALIZED] VIEW [IF NOT EXISTS] [db.]name [TO[db.]name] [ENGINE = engine] [POPULATE] AS SELECT ...
```
@ -121,19 +121,19 @@ Normal views don't store any data, but just perform a read from another table. I

As an example, assume you've created a view:

```sql
``` sql
CREATE VIEW view AS SELECT ...
```

and written a query:

```sql
``` sql
SELECT a, b, c FROM view
```

This query is fully equivalent to using the subquery:

```sql
``` sql
SELECT a, b, c FROM (SELECT ...)
```
@ -152,3 +152,5 @@ The execution of `ALTER` queries on materialized views has not been fully develo
Views look the same as normal tables. For example, they are listed in the result of the `SHOW TABLES` query.

There isn't a separate query for deleting views. To delete a view, use `DROP TABLE`.

[Original article](https://clickhouse.yandex/docs/en/query_language/create/) <!--hide-->
|
@ -41,3 +41,5 @@ See also "[Functions for working with external dictionaries](../functions/ext_di

!!! attention
    You can convert values for a small dictionary by describing it in a `SELECT` query (see the [transform](../functions/other_functions.md#other_functions-transform) function). This functionality is not related to external dictionaries.

[Original article](https://clickhouse.yandex/docs/en/query_language/dicts/external_dicts/) <!--hide-->
|
@ -31,3 +31,5 @@ The dictionary configuration has the following structure:
- [layout](external_dicts_dict_layout.md#dicts-external_dicts_dict_layout) — Dictionary layout in memory.
- [structure](external_dicts_dict_structure.md#dicts-external_dicts_dict_structure) — Structure of the dictionary. A key and attributes that can be retrieved by this key.
- [lifetime](external_dicts_dict_lifetime.md#dicts-external_dicts_dict_lifetime) — Frequency of dictionary updates.

[Original article](https://clickhouse.yandex/docs/en/query_language/dicts/external_dicts_dict/) <!--hide-->
|
@ -292,3 +292,5 @@ dictGetString('prefix', 'asn', tuple(IPv6StringToNum('2001:db8::1')))
Other types are not supported yet. The function returns the attribute for the prefix that corresponds to this IP address. If there are overlapping prefixes, the most specific one is returned.

Data is stored in a `trie`. It must completely fit into RAM.

[Original article](https://clickhouse.yandex/docs/en/query_language/dicts/external_dicts_dict_layout/) <!--hide-->
|
@ -57,3 +57,5 @@ Example of settings:
</dictionary>
```


[Original article](https://clickhouse.yandex/docs/en/query_language/dicts/external_dicts_dict_lifetime/) <!--hide-->
|
@ -427,3 +427,5 @@ Setting fields:
- `password` – Password of the MongoDB user.
- `db` – Name of the database.
- `collection` – Name of the collection.

[Original article](https://clickhouse.yandex/docs/en/query_language/dicts/external_dicts_dict_sources/) <!--hide-->