Mirror of https://github.com/ClickHouse/ClickHouse.git (synced 2024-11-25 17:12:03 +00:00)
Merge pull request #59914 from fenik17/docs_fix_typos
[Docs] Fix some typos and missing commas
commit 39dbb33eaf
@ -166,11 +166,11 @@ For most external applications, we recommend using the HTTP interface because it
|
|||||||
|
|
||||||
## Configuration {#configuration}
|
## Configuration {#configuration}
|
||||||
|
|
||||||
ClickHouse Server is based on POCO C++ Libraries and uses `Poco::Util::AbstractConfiguration` to represent it's configuration. Configuration is held by `Poco::Util::ServerApplication` class inherited by `DaemonBase` class, which in turn is inherited by `DB::Server` class, implementing clickhouse-server itself. So config can be accessed by `ServerApplication::config()` method.
|
ClickHouse Server is based on POCO C++ Libraries and uses `Poco::Util::AbstractConfiguration` to represent its configuration. Configuration is held by `Poco::Util::ServerApplication` class inherited by `DaemonBase` class, which in turn is inherited by `DB::Server` class, implementing clickhouse-server itself. So config can be accessed by `ServerApplication::config()` method.
|
||||||
|
|
||||||
Config is read from multiple files (in XML or YAML format) and merged into single `AbstractConfiguration` by `ConfigProcessor` class. Configuration is loaded at server startup and can be reloaded later if one of config files is updated, removed or added. `ConfigReloader` class is responsible for periodic monitoring of these changes and reload procedure as well. `SYSTEM RELOAD CONFIG` query also triggers config to be reloaded.
|
Config is read from multiple files (in XML or YAML format) and merged into single `AbstractConfiguration` by `ConfigProcessor` class. Configuration is loaded at server startup and can be reloaded later if one of config files is updated, removed or added. `ConfigReloader` class is responsible for periodic monitoring of these changes and reload procedure as well. `SYSTEM RELOAD CONFIG` query also triggers config to be reloaded.
|
||||||
|
|
||||||
For queries and subsystems other than `Server` config is accessible using `Context::getConfigRef()` method. Every subsystem that is capable of reloading it's config without server restart should register itself in reload callback in `Server::main()` method. Note that if newer config has an error, most subsystems will ignore new config, log warning messages and keep working with previously loaded config. Due to the nature of `AbstractConfiguration` it is not possible to pass reference to specific section, so `String config_prefix` is usually used instead.
|
For queries and subsystems other than `Server` config is accessible using `Context::getConfigRef()` method. Every subsystem that is capable of reloading its config without server restart should register itself in reload callback in `Server::main()` method. Note that if newer config has an error, most subsystems will ignore new config, log warning messages and keep working with previously loaded config. Due to the nature of `AbstractConfiguration` it is not possible to pass reference to specific section, so `String config_prefix` is usually used instead.
|
||||||
|
|
||||||
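The reload behavior described above (keep the previous config when the new one is broken, and address a section through a `config_prefix` string) can be illustrated with a small Python model. This is only a sketch, not ClickHouse code: the `Subsystem` class, the flat dotted keys such as `http.port`, and the validation rule are assumptions made for the example.

```python
import logging

logger = logging.getLogger("config_sketch")


class Subsystem:
    """Hypothetical subsystem that reads its settings from a config section."""

    def __init__(self, config, config_prefix):
        # The section is identified by a string prefix, not by a reference.
        self.config_prefix = config_prefix
        self.settings = self._load(config)

    def _load(self, config):
        prefix = self.config_prefix + "."
        settings = {
            key[len(prefix):]: value
            for key, value in config.items()
            if key.startswith(prefix)
        }
        # Validation: raises KeyError if "port" is missing, ValueError if it is not a number.
        settings["port"] = int(settings["port"])
        return settings

    def reload(self, new_config):
        # On error, log a warning and keep working with the previously loaded settings.
        try:
            self.settings = self._load(new_config)
        except (KeyError, ValueError) as error:
            logger.warning("ignoring new config for %s: %r", self.config_prefix, error)


http = Subsystem({"http.port": "8123", "tcp.port": "9000"}, "http")
http.reload({"http.port": "not-a-number"})  # broken config: old settings stay
settings_after_bad_reload = dict(http.settings)
http.reload({"http.port": "8124"})          # valid config: settings are replaced
```

Assigning `self.settings` only after `_load` has returned is what makes the failed reload harmless: a half-parsed config never becomes visible to the rest of the subsystem.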
## Threads and jobs {#threads-and-jobs}
|
## Threads and jobs {#threads-and-jobs}
|
||||||
|
|
||||||
@ -255,7 +255,7 @@ When we are going to read something from a part in `MergeTree`, we look at `prim
|
|||||||
|
|
||||||
When you `INSERT` a bunch of data into `MergeTree`, that bunch is sorted by primary key order and forms a new part. There are background threads that periodically select some parts and merge them into a single sorted part to keep the number of parts relatively low. That’s why it is called `MergeTree`. Of course, merging leads to “write amplification”. All parts are immutable: they are only created and deleted, but not modified. When SELECT is executed, it holds a snapshot of the table (a set of parts). After merging, we also keep old parts for some time to make a recovery after failure easier, so if we see that some merged part is probably broken, we can replace it with its source parts.
|
When you `INSERT` a bunch of data into `MergeTree`, that bunch is sorted by primary key order and forms a new part. There are background threads that periodically select some parts and merge them into a single sorted part to keep the number of parts relatively low. That’s why it is called `MergeTree`. Of course, merging leads to “write amplification”. All parts are immutable: they are only created and deleted, but not modified. When SELECT is executed, it holds a snapshot of the table (a set of parts). After merging, we also keep old parts for some time to make a recovery after failure easier, so if we see that some merged part is probably broken, we can replace it with its source parts.
|
||||||
|
|
||||||
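As a rough illustration of the part lifecycle described above, the following Python sketch models parts as immutable sorted tuples. The `TinyMergeTree` class and its method names are hypothetical and leave out everything else a real `MergeTree` does (columns, marks, retention of old parts after a merge).

```python
import heapq


class TinyMergeTree:
    """Toy model: each part is an immutable, sorted tuple of (key, value) rows."""

    def __init__(self):
        self.parts = []

    def insert(self, rows):
        # Every INSERT sorts its batch by key and forms a new part.
        self.parts.append(tuple(sorted(rows)))

    def merge(self, count=2):
        # A background merge replaces several parts with a single sorted part.
        if len(self.parts) < count:
            return
        chosen, rest = self.parts[:count], self.parts[count:]
        merged = tuple(heapq.merge(*chosen))
        self.parts = [merged] + rest

    def snapshot(self):
        # A SELECT works on the set of parts that existed when it started.
        return list(self.parts)


table = TinyMergeTree()
table.insert([(3, "c"), (1, "a")])
table.insert([(2, "b"), (4, "d")])
before = table.snapshot()
table.merge()
after = table.snapshot()
```

Because parts are never modified in place, the `before` snapshot still refers to the two original parts after the merge has produced a new one.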
`MergeTree` is not an LSM tree because it does not contain MEMTABLE and LOG: inserted data is written directly to the filesystem. This behavior makes MergeTree much more suitable to insert data in batches. Therefore frequently inserting small amounts of rows is not ideal for MergeTree. For example, a couple of rows per second is OK, but doing it a thousand times a second is not optimal for MergeTree. However, there is an async insert mode for small inserts to overcome this limitation. We did it this way for simplicity’s sake, and because we are already inserting data in batches in our applications
|
`MergeTree` is not an LSM tree because it does not contain MEMTABLE and LOG: inserted data is written directly to the filesystem. This behavior makes MergeTree much more suitable to insert data in batches. Therefore, frequently inserting small amounts of rows is not ideal for MergeTree. For example, a couple of rows per second is OK, but doing it a thousand times a second is not optimal for MergeTree. However, there is an async insert mode for small inserts to overcome this limitation. We did it this way for simplicity’s sake, and because we are already inserting data in batches in our applications
|
||||||
|
|
||||||
There are MergeTree engines that are doing additional work during background merges. Examples are `CollapsingMergeTree` and `AggregatingMergeTree`. This could be treated as special support for updates. Keep in mind that these are not real updates because users usually have no control over the time when background merges are executed, and data in a `MergeTree` table is almost always stored in more than one part, not in completely merged form.
|
There are MergeTree engines that are doing additional work during background merges. Examples are `CollapsingMergeTree` and `AggregatingMergeTree`. This could be treated as special support for updates. Keep in mind that these are not real updates because users usually have no control over the time when background merges are executed, and data in a `MergeTree` table is almost always stored in more than one part, not in completely merged form.
|
||||||
|
|
||||||
|
@ -38,7 +38,7 @@ ninja
|
|||||||
|
|
||||||
## Running
|
## Running
|
||||||
|
|
||||||
Once built, the binary can be run with, eg.:
|
Once built, the binary can be run with, e.g.:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
qemu-s390x-static -L /usr/s390x-linux-gnu ./clickhouse
|
qemu-s390x-static -L /usr/s390x-linux-gnu ./clickhouse
|
||||||
|
@ -95,7 +95,7 @@ Complete below three steps mentioned in [Star Schema Benchmark](https://clickhou
|
|||||||
- Inserting data. Here should use `./benchmark_sample/rawdata_dir/ssb-dbgen/*.tbl` as input data.
|
- Inserting data. Here should use `./benchmark_sample/rawdata_dir/ssb-dbgen/*.tbl` as input data.
|
||||||
- Converting “star schema” to de-normalized “flat schema”
|
- Converting “star schema” to de-normalized “flat schema”
|
||||||
|
|
||||||
Set up database with with IAA Deflate codec
|
Set up database with IAA Deflate codec
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ cd ./database_dir/deflate
|
$ cd ./database_dir/deflate
|
||||||
@ -104,7 +104,7 @@ $ [CLICKHOUSE_EXE] client
|
|||||||
```
|
```
|
||||||
Complete three steps same as lz4 above
|
Complete three steps same as lz4 above
|
||||||
|
|
||||||
Set up database with with ZSTD codec
|
Set up database with ZSTD codec
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ cd ./database_dir/zstd
|
$ cd ./database_dir/zstd
|
||||||
|
@ -13,7 +13,7 @@ ClickHouse utilizes third-party libraries for different purposes, e.g., to conne
|
|||||||
SELECT library_name, license_type, license_path FROM system.licenses ORDER BY library_name COLLATE 'en';
|
SELECT library_name, license_type, license_path FROM system.licenses ORDER BY library_name COLLATE 'en';
|
||||||
```
|
```
|
||||||
|
|
||||||
(Note that the listed libraries are the ones located in the `contrib/` directory of the ClickHouse repository. Depending on the build options, some of of the libraries may have not been compiled, and as a result, their functionality may not be available at runtime.
|
Note that the listed libraries are the ones located in the `contrib/` directory of the ClickHouse repository. Depending on the build options, some of the libraries may have not been compiled, and as a result, their functionality may not be available at runtime.
|
||||||
|
|
||||||
[Example](https://play.clickhouse.com/play?user=play#U0VMRUNUIGxpYnJhcnlfbmFtZSwgbGljZW5zZV90eXBlLCBsaWNlbnNlX3BhdGggRlJPTSBzeXN0ZW0ubGljZW5zZXMgT1JERVIgQlkgbGlicmFyeV9uYW1lIENPTExBVEUgJ2VuJw==)
|
[Example](https://play.clickhouse.com/play?user=play#U0VMRUNUIGxpYnJhcnlfbmFtZSwgbGljZW5zZV90eXBlLCBsaWNlbnNlX3BhdGggRlJPTSBzeXN0ZW0ubGljZW5zZXMgT1JERVIgQlkgbGlicmFyeV9uYW1lIENPTExBVEUgJ2VuJw==)
|
||||||
|
|
||||||
|
@ -7,13 +7,13 @@ description: Prerequisites and an overview of how to build ClickHouse
|
|||||||
|
|
||||||
# Getting Started Guide for Building ClickHouse
|
# Getting Started Guide for Building ClickHouse
|
||||||
|
|
||||||
ClickHouse can be build on Linux, FreeBSD and macOS. If you use Windows, you can still build ClickHouse in a virtual machine running Linux, e.g. [VirtualBox](https://www.virtualbox.org/) with Ubuntu.
|
ClickHouse can be built on Linux, FreeBSD and macOS. If you use Windows, you can still build ClickHouse in a virtual machine running Linux, e.g. [VirtualBox](https://www.virtualbox.org/) with Ubuntu.
|
||||||
|
|
||||||
ClickHouse requires a 64-bit system to compile and run, 32-bit systems do not work.
|
ClickHouse requires a 64-bit system to compile and run, 32-bit systems do not work.
|
||||||
|
|
||||||
## Creating a Repository on GitHub {#creating-a-repository-on-github}
|
## Creating a Repository on GitHub {#creating-a-repository-on-github}
|
||||||
|
|
||||||
To start developing for ClickHouse you will need a [GitHub](https://www.virtualbox.org/) account. Please also generate a SSH key locally (if you don't have one already) and upload the public key to GitHub as this is a prerequisite for contributing patches.
|
To start developing for ClickHouse you will need a [GitHub](https://www.virtualbox.org/) account. Please also generate an SSH key locally (if you don't have one already) and upload the public key to GitHub as this is a prerequisite for contributing patches.
|
||||||
|
|
||||||
Next, create a fork of the [ClickHouse repository](https://github.com/ClickHouse/ClickHouse/) in your personal account by clicking the "fork" button in the upper right corner.
|
Next, create a fork of the [ClickHouse repository](https://github.com/ClickHouse/ClickHouse/) in your personal account by clicking the "fork" button in the upper right corner.
|
||||||
|
|
||||||
@ -37,7 +37,7 @@ git clone git@github.com:your_github_username/ClickHouse.git # replace placehol
|
|||||||
cd ClickHouse
|
cd ClickHouse
|
||||||
```
|
```
|
||||||
|
|
||||||
This command creates a directory `ClickHouse/` containing the source code of ClickHouse. If you specify a custom checkout directory after the URL but it is important that this path does not contain whitespaces as it may lead to problems with the build later on.
|
This command creates a directory `ClickHouse/` containing the source code of ClickHouse. If you specify a custom checkout directory after the URL, but it is important that this path does not contain whitespaces as it may lead to problems with the build later on.
|
||||||
|
|
||||||
The ClickHouse repository uses Git submodules, i.e. references to external repositories (usually 3rd party libraries used by ClickHouse). These are not checked out by default. To do so, you can either
|
The ClickHouse repository uses Git submodules, i.e. references to external repositories (usually 3rd party libraries used by ClickHouse). These are not checked out by default. To do so, you can either
|
||||||
|
|
||||||
@ -45,7 +45,7 @@ The ClickHouse repository uses Git submodules, i.e. references to external repos
|
|||||||
|
|
||||||
- if `git clone` did not check out submodules, run `git submodule update --init --jobs <N>` (e.g. `<N> = 12` to parallelize the checkout) to achieve the same as the previous alternative, or
|
- if `git clone` did not check out submodules, run `git submodule update --init --jobs <N>` (e.g. `<N> = 12` to parallelize the checkout) to achieve the same as the previous alternative, or
|
||||||
|
|
||||||
- if `git clone` did not check out submodules and you like to use [sparse](https://github.blog/2020-01-17-bring-your-monorepo-down-to-size-with-sparse-checkout/) and [shallow](https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/) submodule checkout to omit unneeded files and history in submodules to save space (ca. 5 GB instead of ca. 15 GB), run `./contrib/update-submodules.sh`. Not really recommended as it generally makes working with submodules less convenient and slower.
|
- if `git clone` did not check out submodules, and you like to use [sparse](https://github.blog/2020-01-17-bring-your-monorepo-down-to-size-with-sparse-checkout/) and [shallow](https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/) submodule checkout to omit unneeded files and history in submodules to save space (ca. 5 GB instead of ca. 15 GB), run `./contrib/update-submodules.sh`. Not really recommended as it generally makes working with submodules less convenient and slower.
|
||||||
|
|
||||||
You can check the Git status with the command: `git submodule status`.
|
You can check the Git status with the command: `git submodule status`.
|
||||||
|
|
||||||
@ -143,7 +143,7 @@ When a large amount of RAM is available on build machine you should limit the nu
|
|||||||
|
|
||||||
On machines with 4GB of RAM, it is recommended to specify 1, for 8GB of RAM `-j 2` is recommended.
|
On machines with 4GB of RAM, it is recommended to specify 1, for 8GB of RAM `-j 2` is recommended.
|
||||||
|
|
||||||
If you get the message: `ninja: error: loading 'build.ninja': No such file or directory`, it means that generating a build configuration has failed and you need to inspect the message above.
|
If you get the message: `ninja: error: loading 'build.ninja': No such file or directory`, it means that generating a build configuration has failed, and you need to inspect the message above.
|
||||||
|
|
||||||
Upon the successful start of the building process, you’ll see the build progress - the number of processed tasks and the total number of tasks.
|
Upon the successful start of the building process, you’ll see the build progress - the number of processed tasks and the total number of tasks.
|
||||||
|
|
||||||
@ -184,7 +184,7 @@ You can also run your custom-built ClickHouse binary with the config file from t
|
|||||||
|
|
||||||
**CLion (recommended)**
|
**CLion (recommended)**
|
||||||
|
|
||||||
If you do not know which IDE to use, we recommend that you use [CLion](https://www.jetbrains.com/clion/). CLion is commercial software but it offers a 30 day free trial. It is also free of charge for students. CLion can be used on both Linux and macOS.
|
If you do not know which IDE to use, we recommend that you use [CLion](https://www.jetbrains.com/clion/). CLion is commercial software, but it offers a 30 day free trial. It is also free of charge for students. CLion can be used on both Linux and macOS.
|
||||||
|
|
||||||
A few things to know when using CLion to develop ClickHouse:
|
A few things to know when using CLion to develop ClickHouse:
|
||||||
|
|
||||||
|
@ -2,7 +2,7 @@
|
|||||||
|
|
||||||
Nearest neighborhood search is the problem of finding the M closest points for a given point in an N-dimensional vector space. The most
|
Nearest neighborhood search is the problem of finding the M closest points for a given point in an N-dimensional vector space. The most
|
||||||
straightforward approach to solve this problem is a brute force search where the distance between all points in the vector space and the
|
straightforward approach to solve this problem is a brute force search where the distance between all points in the vector space and the
|
||||||
reference point is computed. This method guarantees perfect accuracy but it is usually too slow for practical applications. Thus, nearest
|
reference point is computed. This method guarantees perfect accuracy, but it is usually too slow for practical applications. Thus, nearest
|
||||||
neighborhood search problems are often solved with [approximative algorithms](https://github.com/erikbern/ann-benchmarks). Approximative
|
neighborhood search problems are often solved with [approximative algorithms](https://github.com/erikbern/ann-benchmarks). Approximative
|
||||||
nearest neighborhood search techniques, in conjunction with [embedding
|
nearest neighborhood search techniques, in conjunction with [embedding
|
||||||
methods](https://cloud.google.com/architecture/overview-extracting-and-serving-feature-embeddings-for-machine-learning) allow to search huge
|
methods](https://cloud.google.com/architecture/overview-extracting-and-serving-feature-embeddings-for-machine-learning) allow to search huge
|
||||||
@ -24,7 +24,7 @@ LIMIT N
|
|||||||
|
|
||||||
`vectors` contains N-dimensional values of type [Array](../../../sql-reference/data-types/array.md) or
|
`vectors` contains N-dimensional values of type [Array](../../../sql-reference/data-types/array.md) or
|
||||||
[Tuple](../../../sql-reference/data-types/tuple.md), for example embeddings. Function `Distance` computes the distance between two vectors.
|
[Tuple](../../../sql-reference/data-types/tuple.md), for example embeddings. Function `Distance` computes the distance between two vectors.
|
||||||
Often, the the Euclidean (L2) distance is chosen as distance function but [other
|
Often, the Euclidean (L2) distance is chosen as distance function but [other
|
||||||
distance functions](/docs/en/sql-reference/functions/distance-functions.md) are also possible. `Point` is the reference point, e.g. `(0.17,
|
distance functions](/docs/en/sql-reference/functions/distance-functions.md) are also possible. `Point` is the reference point, e.g. `(0.17,
|
||||||
0.33, ...)`, and `N` limits the number of search results.
|
0.33, ...)`, and `N` limits the number of search results.
|
||||||
|
|
||||||
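For intuition, the exact (brute force) form of this query can be sketched in Python. This is an illustrative sketch only: the function names and the sample vectors are made up, and the Euclidean (L2) distance is used as the distance function.

```python
import heapq
import math


def l2_distance(a, b):
    # Euclidean (L2) distance between two vectors of equal length.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(vectors, point, n, distance=l2_distance):
    # Counterpart of: ORDER BY Distance(vectors, Point) LIMIT N
    return heapq.nsmallest(n, vectors, key=lambda v: distance(v, point))


vectors = [(0.0, 0.0), (1.0, 1.0), (0.2, 0.3), (5.0, 5.0)]
closest = nearest(vectors, (0.17, 0.33), 2)
```

Every vector is compared with the reference point, which is why this approach is exact but scales linearly with the number of rows.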
@ -109,7 +109,7 @@ clickhouse-client --param_vec='hello' --query="SELECT * FROM table_with_ann_inde
|
|||||||
|
|
||||||
**Restrictions**: Queries that contain both a `WHERE Distance(vectors, Point) < MaxDistance` and an `ORDER BY Distance(vectors, Point)`
|
**Restrictions**: Queries that contain both a `WHERE Distance(vectors, Point) < MaxDistance` and an `ORDER BY Distance(vectors, Point)`
|
||||||
clause cannot use ANN indexes. Also, the approximate algorithms used to determine the nearest neighbors require a limit, hence queries
|
clause cannot use ANN indexes. Also, the approximate algorithms used to determine the nearest neighbors require a limit, hence queries
|
||||||
without `LIMIT` clause cannot utilize ANN indexes. Also ANN indexes are only used if the query has a `LIMIT` value smaller than setting
|
without `LIMIT` clause cannot utilize ANN indexes. Also, ANN indexes are only used if the query has a `LIMIT` value smaller than setting
|
||||||
`max_limit_for_ann_queries` (default: 1 million rows). This is a safeguard to prevent large memory allocations by external libraries for
|
`max_limit_for_ann_queries` (default: 1 million rows). This is a safeguard to prevent large memory allocations by external libraries for
|
||||||
approximate neighbor search.
|
approximate neighbor search.
|
||||||
|
|
||||||
@ -120,9 +120,9 @@ then each indexed block will contain 16384 rows. However, data structures and al
|
|||||||
provided by external libraries) are inherently row-oriented. They store a compact representation of a set of rows and also return rows for
|
provided by external libraries) are inherently row-oriented. They store a compact representation of a set of rows and also return rows for
|
||||||
ANN queries. This causes some rather unintuitive differences in the way ANN indexes behave compared to normal skip indexes.
|
ANN queries. This causes some rather unintuitive differences in the way ANN indexes behave compared to normal skip indexes.
|
||||||
|
|
||||||
When a user defines a ANN index on a column, ClickHouse internally creates a ANN "sub-index" for each index block. The sub-index is "local"
|
When a user defines an ANN index on a column, ClickHouse internally creates an ANN "sub-index" for each index block. The sub-index is "local"
|
||||||
in the sense that it only knows about the rows of its containing index block. In the previous example and assuming that a column has 65536
|
in the sense that it only knows about the rows of its containing index block. In the previous example and assuming that a column has 65536
|
||||||
rows, we obtain four index blocks (spanning eight granules) and a ANN sub-index for each index block. A sub-index is theoretically able to
|
rows, we obtain four index blocks (spanning eight granules) and an ANN sub-index for each index block. A sub-index is theoretically able to
|
||||||
return the rows with the N closest points within its index block directly. However, since ClickHouse loads data from disk to memory at the
|
return the rows with the N closest points within its index block directly. However, since ClickHouse loads data from disk to memory at the
|
||||||
granularity of granules, sub-indexes extrapolate matching rows to granule granularity. This is different from regular skip indexes which
|
granularity of granules, sub-indexes extrapolate matching rows to granule granularity. This is different from regular skip indexes which
|
||||||
skip data at the granularity of index blocks.
|
skip data at the granularity of index blocks.
|
||||||
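The arithmetic of this example can be checked with a short Python sketch. It assumes a granule size of 8192 rows and an index `GRANULARITY` of 2, which is what the 16384 rows per index block imply. The helper `granules_to_read` is hypothetical and only models the extrapolation from matching rows to granules.

```python
INDEX_GRANULARITY = 8192   # rows per granule (assumed)
INDEX_BLOCK_GRANULES = 2   # granules per index block (assumed GRANULARITY 2)
TOTAL_ROWS = 65536

rows_per_block = INDEX_GRANULARITY * INDEX_BLOCK_GRANULES
num_granules = TOTAL_ROWS // INDEX_GRANULARITY
num_blocks = TOTAL_ROWS // rows_per_block


def granules_to_read(block_index, matching_rows_in_block):
    """Map row offsets returned by one sub-index to absolute granule numbers."""
    first_granule = block_index * INDEX_BLOCK_GRANULES
    return sorted(
        {first_granule + row // INDEX_GRANULARITY for row in matching_rows_in_block}
    )


# Three matching rows in the second index block that all fall into one granule.
granules = granules_to_read(1, [5, 100, 7000])
```

In this sketch only one of the two granules in the index block has to be loaded, whereas a regular skip index could only keep or skip the whole index block.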
@ -231,7 +231,7 @@ The Annoy index currently does not work with per-table, non-default `index_granu
|
|||||||
|
|
||||||
## USearch {#usearch}
|
## USearch {#usearch}
|
||||||
|
|
||||||
This type of ANN index is based on the [the USearch library](https://github.com/unum-cloud/usearch), which implements the [HNSW
|
This type of ANN index is based on the [USearch library](https://github.com/unum-cloud/usearch), which implements the [HNSW
|
||||||
algorithm](https://arxiv.org/abs/1603.09320), i.e., builds a hierarchical graph where each point represents a vector and the edges represent
|
algorithm](https://arxiv.org/abs/1603.09320), i.e., builds a hierarchical graph where each point represents a vector and the edges represent
|
||||||
similarity. Such hierarchical structures can be very efficient on large collections. They may often fetch 0.05% or less data from the
|
similarity. Such hierarchical structures can be very efficient on large collections. They may often fetch 0.05% or less data from the
|
||||||
overall dataset, while still providing 99% recall. This is especially useful when working with high-dimensional vectors,
|
overall dataset, while still providing 99% recall. This is especially useful when working with high-dimensional vectors,
|
||||||
|
@ -125,7 +125,7 @@ For each resulting data part ClickHouse saves:
|
|||||||
3. The first “cancel” row, if there are more “cancel” rows than “state” rows.
|
3. The first “cancel” row, if there are more “cancel” rows than “state” rows.
|
||||||
4. None of the rows, in all other cases.
|
4. None of the rows, in all other cases.
|
||||||
|
|
||||||
Also when there are at least 2 more “state” rows than “cancel” rows, or at least 2 more “cancel” rows then “state” rows, the merge continues, but ClickHouse treats this situation as a logical error and records it in the server log. This error can occur if the same data were inserted more than once.
|
Also, when there are at least 2 more “state” rows than “cancel” rows, or at least 2 more “cancel” rows then “state” rows, the merge continues, but ClickHouse treats this situation as a logical error and records it in the server log. This error can occur if the same data were inserted more than once.
|
||||||
|
|
||||||
Thus, collapsing should not change the results of calculating statistics.
|
Thus, collapsing should not change the results of calculating statistics.
|
||||||
Changes gradually collapsed so that in the end only the last state of almost every object left.
|
Changes gradually collapsed so that in the end only the last state of almost every object left.
|
||||||
@ -196,7 +196,7 @@ What do we see and where is collapsing?
|
|||||||
|
|
||||||
With two `INSERT` queries, we created 2 data parts. The `SELECT` query was performed in 2 threads, and we got a random order of rows. Collapsing not occurred because there was no merge of the data parts yet. ClickHouse merges data part in an unknown moment which we can not predict.
|
With two `INSERT` queries, we created 2 data parts. The `SELECT` query was performed in 2 threads, and we got a random order of rows. Collapsing not occurred because there was no merge of the data parts yet. ClickHouse merges data part in an unknown moment which we can not predict.
|
||||||
|
|
||||||
Thus we need aggregation:
|
Thus, we need aggregation:
|
||||||
|
|
||||||
``` sql
|
``` sql
|
||||||
SELECT
|
SELECT
|
||||||
|
@ -72,7 +72,7 @@ Specifying the `sharding_key` is necessary for the following:
|
|||||||
|
|
||||||
#### fsync_directories
|
#### fsync_directories
|
||||||
|
|
||||||
`fsync_directories` - do the `fsync` for directories. Guarantees that the OS refreshed directory metadata after operations related to background inserts on Distributed table (after insert, after sending the data to shard, etc).
|
`fsync_directories` - do the `fsync` for directories. Guarantees that the OS refreshed directory metadata after operations related to background inserts on Distributed table (after insert, after sending the data to shard, etc.).
|
||||||
|
|
||||||
#### bytes_to_throw_insert
|
#### bytes_to_throw_insert
|
||||||
|
|
||||||
@ -220,7 +220,7 @@ Second, you can perform `INSERT` statements on a `Distributed` table. In this ca
|
|||||||
|
|
||||||
Each shard can have a `<weight>` defined in the config file. By default, the weight is `1`. Data is distributed across shards in the amount proportional to the shard weight. All shard weights are summed up, then each shard's weight is divided by the total to determine each shard's proportion. For example, if there are two shards and the first has a weight of 1 while the second has a weight of 2, the first will be sent one third (1 / 3) of inserted rows and the second will be sent two thirds (2 / 3).
|
Each shard can have a `<weight>` defined in the config file. By default, the weight is `1`. Data is distributed across shards in the amount proportional to the shard weight. All shard weights are summed up, then each shard's weight is divided by the total to determine each shard's proportion. For example, if there are two shards and the first has a weight of 1 while the second has a weight of 2, the first will be sent one third (1 / 3) of inserted rows and the second will be sent two thirds (2 / 3).
|
||||||
|
|
||||||
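The weight calculation above can be written out as a few lines of Python. This is a minimal sketch of the arithmetic only; the function name is made up and it says nothing about how ClickHouse assigns individual rows to shards.

```python
from fractions import Fraction


def shard_proportions(weights):
    # Sum all shard weights, then divide each weight by the total.
    total = sum(weights)
    return [Fraction(weight, total) for weight in weights]


# Two shards with weights 1 and 2, as in the example above.
proportions = shard_proportions([1, 2])
```

With weights 1 and 2 the result is one third and two thirds, matching the example in the text.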
Each shard can have the `internal_replication` parameter defined in the config file. If this parameter is set to `true`, the write operation selects the first healthy replica and writes data to it. Use this if the tables underlying the `Distributed` table are replicated tables (e.g. any of the `Replicated*MergeTree` table engines). One of the table replicas will receive the write and it will be replicated to the other replicas automatically.
|
Each shard can have the `internal_replication` parameter defined in the config file. If this parameter is set to `true`, the write operation selects the first healthy replica and writes data to it. Use this if the tables underlying the `Distributed` table are replicated tables (e.g. any of the `Replicated*MergeTree` table engines). One of the table replicas will receive the write, and it will be replicated to the other replicas automatically.
|
||||||
|
|
||||||
If `internal_replication` is set to `false` (the default), data is written to all replicas. In this case, the `Distributed` table replicates data itself. This is worse than using replicated tables because the consistency of replicas is not checked and, over time, they will contain slightly different data.
|
If `internal_replication` is set to `false` (the default), data is written to all replicas. In this case, the `Distributed` table replicates data itself. This is worse than using replicated tables because the consistency of replicas is not checked and, over time, they will contain slightly different data.
|
||||||
|
|
||||||
|
@ -12,7 +12,7 @@ The queries below were executed on a **Production** instance of [ClickHouse Clou
|
|||||||
:::
|
:::
|
||||||
|
|
||||||
|
|
||||||
1. Without inserting the data into ClickHouse, we can query it in place. Let's grab some rows so we can see what they look like:
|
1. Without inserting the data into ClickHouse, we can query it in place. Let's grab some rows, so we can see what they look like:
|
||||||
|
|
||||||
```sql
|
```sql
|
||||||
SELECT *
|
SELECT *
|
||||||
|
@ -29,7 +29,7 @@ Here is a preview of the dashboard created in this guide:
|
|||||||
|
|
||||||
This dataset is from [OpenCelliD](https://www.opencellid.org/) - The world's largest Open Database of Cell Towers.
|
This dataset is from [OpenCelliD](https://www.opencellid.org/) - The world's largest Open Database of Cell Towers.
|
||||||
|
|
||||||
As of 2021, it contains more than 40 million records about cell towers (GSM, LTE, UMTS, etc.) around the world with their geographical coordinates and metadata (country code, network, etc).
|
As of 2021, it contains more than 40 million records about cell towers (GSM, LTE, UMTS, etc.) around the world with their geographical coordinates and metadata (country code, network, etc.).
|
||||||
|
|
||||||
OpenCelliD Project is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, and we redistribute a snapshot of this dataset under the terms of the same license. The up-to-date version of the dataset is available to download after sign in.
|
OpenCelliD Project is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, and we redistribute a snapshot of this dataset under the terms of the same license. The up-to-date version of the dataset is available to download after sign in.
|
||||||
|
|
||||||
@ -355,7 +355,7 @@ Click on **UPDATE CHART** to render the visualization.
|
|||||||
|
|
||||||
### Add the charts to a **dashboard**
|
### Add the charts to a **dashboard**
|
||||||
|
|
||||||
This screenshot shows cell tower locations with LTE, UMTS, and GSM radios. The charts are all created in the same way and they are added to a dashboard.
|
This screenshot shows cell tower locations with LTE, UMTS, and GSM radios. The charts are all created in the same way, and they are added to a dashboard.
|
||||||
|
|
||||||
![Dashboard of cell towers by radio type in mcc 204](@site/docs/en/getting-started/example-datasets/images/superset-cell-tower-dashboard.png)
|
![Dashboard of cell towers by radio type in mcc 204](@site/docs/en/getting-started/example-datasets/images/superset-cell-tower-dashboard.png)
|
||||||
|
|
||||||
|
@ -132,7 +132,7 @@ FROM covid19;
|
|||||||
└────────────────────────────────────────────┘
|
└────────────────────────────────────────────┘
|
||||||
```
|
```
|
||||||
|
|
||||||
7. You will notice the data has a lot of 0's for dates - either weekends or days where numbers were not reported each day. We can use a window function to smooth out the daily averages of new cases:
|
7. You will notice the data has a lot of 0's for dates - either weekends or days when numbers were not reported each day. We can use a window function to smooth out the daily averages of new cases:
|
||||||
|
|
||||||
```sql
|
```sql
|
||||||
SELECT
|
SELECT
|
||||||
@ -262,4 +262,4 @@ The results look like
|
|||||||
|
|
||||||
:::note
|
:::note
|
||||||
As mentioned in the [GitHub repo](https://github.com/GoogleCloudPlatform/covid-19-open-data), the dataset is no longer updated as of September 15, 2022.
|
As mentioned in the [GitHub repo](https://github.com/GoogleCloudPlatform/covid-19-open-data), the dataset is no longer updated as of September 15, 2022.
|
||||||
:::
|
:::
|
||||||
|
@ -243,7 +243,7 @@ If no database is specified, the `default` database will be used.
|
|||||||
|
|
||||||
If the user name, password or database was specified in the connection string, it cannot be specified using `--user`, `--password` or `--database` (and vice versa).
|
If the user name, password or database was specified in the connection string, it cannot be specified using `--user`, `--password` or `--database` (and vice versa).
|
||||||
|
|
||||||
The host component can either be an a host name and IP address. Put an IPv6 address in square brackets to specify it:
|
The host component can either be a host name and IP address. Put an IPv6 address in square brackets to specify it:
|
||||||
|
|
||||||
```text
|
```text
|
||||||
clickhouse://[2001:db8::1234]
|
clickhouse://[2001:db8::1234]
|
||||||
|
@ -33,7 +33,7 @@ The supported formats are:
|
|||||||
| [JSONAsString](#jsonasstring) | ✔ | ✗ |
|
| [JSONAsString](#jsonasstring) | ✔ | ✗ |
|
||||||
| [JSONStrings](#jsonstrings) | ✔ | ✔ |
|
| [JSONStrings](#jsonstrings) | ✔ | ✔ |
|
||||||
| [JSONColumns](#jsoncolumns) | ✔ | ✔ |
|
| [JSONColumns](#jsoncolumns) | ✔ | ✔ |
|
||||||
| [JSONColumnsWithMetadata](#jsoncolumnsmonoblock)) | ✔ | ✔ |
|
| [JSONColumnsWithMetadata](#jsoncolumnsmonoblock) | ✔ | ✔ |
|
||||||
| [JSONCompact](#jsoncompact) | ✔ | ✔ |
|
| [JSONCompact](#jsoncompact) | ✔ | ✔ |
|
||||||
| [JSONCompactStrings](#jsoncompactstrings) | ✗ | ✔ |
|
| [JSONCompactStrings](#jsoncompactstrings) | ✗ | ✔ |
|
||||||
| [JSONCompactColumns](#jsoncompactcolumns) | ✔ | ✔ |
|
| [JSONCompactColumns](#jsoncompactcolumns) | ✔ | ✔ |
|
||||||
|
@ -6,7 +6,7 @@ sidebar_label: Configuration Files
|
|||||||
|
|
||||||
# Configuration Files
|
# Configuration Files
|
||||||
|
|
||||||
The ClickHouse server can be configured with configuration files in XML or YAML syntax. In most installation types, the ClickHouse server runs with `/etc/clickhouse-server/config.xml` as default configuration file but it is also possible to specify the location of the configuration file manually at server startup using command line option `--config-file=` or `-C`. Additional configuration files may be placed into directory `config.d/` relative to the main configuration file, for example into directory `/etc/clickhouse-server/config.d/`. Files in this directory and the main configuration are merged in a preprocessing step before the configuration is applied in ClickHouse server. Configuration files are merged in alphabetical order. To simplify updates and improve modularization, it is best practice to keep the default `config.xml` file unmodified and place additional customization into `config.d/`.
|
The ClickHouse server can be configured with configuration files in XML or YAML syntax. In most installation types, the ClickHouse server runs with `/etc/clickhouse-server/config.xml` as default configuration file, but it is also possible to specify the location of the configuration file manually at server startup using command line option `--config-file=` or `-C`. Additional configuration files may be placed into directory `config.d/` relative to the main configuration file, for example into directory `/etc/clickhouse-server/config.d/`. Files in this directory and the main configuration are merged in a preprocessing step before the configuration is applied in ClickHouse server. Configuration files are merged in alphabetical order. To simplify updates and improve modularization, it is best practice to keep the default `config.xml` file unmodified and place additional customization into `config.d/`.
|
||||||
|
|
||||||
It is possible to mix XML and YAML configuration files, for example you could have a main configuration file `config.xml` and additional configuration files `config.d/network.xml`, `config.d/timezone.yaml` and `config.d/keeper.yaml`. Mixing XML and YAML within a single configuration file is not supported. XML configuration files should use `<clickhouse>...</clickhouse>` as top-level tag. In YAML configuration files, `clickhouse:` is optional, the parser inserts it implicitly if absent.
|
It is possible to mix XML and YAML configuration files, for example you could have a main configuration file `config.xml` and additional configuration files `config.d/network.xml`, `config.d/timezone.yaml` and `config.d/keeper.yaml`. Mixing XML and YAML within a single configuration file is not supported. XML configuration files should use `<clickhouse>...</clickhouse>` as top-level tag. In YAML configuration files, `clickhouse:` is optional, the parser inserts it implicitly if absent.
|
||||||
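The merge order described in this hunk can be sketched roughly as follows. This is only an illustration, not ClickHouse's `ConfigProcessor`: the helper name `merge_order` is hypothetical, and it assumes that the main file comes first and that "alphabetical order" means a plain sort of the file names in `config.d/`.

```python
from pathlib import PurePosixPath


def merge_order(main_config: str, extra_files: list[str]) -> list[str]:
    """Return the assumed processing order: main config, then config.d/ files sorted by name."""
    config_dir = PurePosixPath(main_config).parent / "config.d"
    # Assumption: "alphabetical order" is a plain lexicographic sort of file names.
    return [main_config] + [str(config_dir / name) for name in sorted(extra_files)]


order = merge_order(
    "/etc/clickhouse-server/config.xml",
    ["timezone.yaml", "network.xml", "keeper.yaml"],
)
for path in order:
    print(path)
```

The example reuses the file names from the paragraph above, mixing XML and YAML files in one directory.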
|
|
||||||
@ -63,7 +63,7 @@ XML substitution example:
|
|||||||
</clickhouse>
|
</clickhouse>
|
||||||
```
|
```
|
||||||
|
|
||||||
Substitutions can also be performed from ZooKeeper. To do this, specify the attribute `from_zk = "/path/to/node"`. The element value is replaced with the contents of the node at `/path/to/node` in ZooKeeper. You can also put an entire XML subtree on the ZooKeeper node and it will be fully inserted into the source element.
|
Substitutions can also be performed from ZooKeeper. To do this, specify the attribute `from_zk = "/path/to/node"`. The element value is replaced with the contents of the node at `/path/to/node` in ZooKeeper. You can also put an entire XML subtree on the ZooKeeper node, and it will be fully inserted into the source element.
|
||||||
|
|
||||||
## Encrypting and Hiding Configuration {#encryption}
|
## Encrypting and Hiding Configuration {#encryption}
|
||||||
|
|
||||||
|
@ -49,6 +49,6 @@ Every job has a pool associated with it and is started in this pool. Each pool h
|
|||||||
|
|
||||||
Time instants during job lifetime:
|
Time instants during job lifetime:
|
||||||
- `schedule_time` (`DateTime64`) - Time when job was created and scheduled to be executed (usually with all its dependencies).
|
- `schedule_time` (`DateTime64`) - Time when job was created and scheduled to be executed (usually with all its dependencies).
|
||||||
- `enqueue_time` (`Nullable(DateTime64)`) - Time when job became ready and was enqueued into a ready queue of it's pool. Null if the job is not ready yet.
|
- `enqueue_time` (`Nullable(DateTime64)`) - Time when job became ready and was enqueued into a ready queue of its pool. Null if the job is not ready yet.
|
||||||
- `start_time` (`Nullable(DateTime64)`) - Time when worker dequeues the job from ready queue and start its execution. Null if the job is not started yet.
|
- `start_time` (`Nullable(DateTime64)`) - Time when worker dequeues the job from ready queue and start its execution. Null if the job is not started yet.
|
||||||
- `finish_time` (`Nullable(DateTime64)`) - Time when job execution is finished. Null if the job is not finished yet.
|
- `finish_time` (`Nullable(DateTime64)`) - Time when job execution is finished. Null if the job is not finished yet.
|
||||||
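As a small illustration of how the nullable time instants listed above relate to each other, the sketch below computes how long a job waited in the ready queue. The function name `queue_wait` is hypothetical; only the column names come from the text, and `None` stands in for SQL `NULL`.

```python
from datetime import datetime, timedelta
from typing import Optional


def queue_wait(enqueue_time: Optional[datetime],
               start_time: Optional[datetime]) -> Optional[timedelta]:
    """Time spent in the ready queue, or None if the job is not ready or not started yet."""
    if enqueue_time is None or start_time is None:
        return None
    return start_time - enqueue_time


waited = queue_wait(datetime(2024, 1, 1, 0, 0, 0), datetime(2024, 1, 1, 0, 0, 5))
print(waited)
```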
|
@ -297,11 +297,11 @@ Total number of databases on the server.
|
|||||||
|
|
||||||
### NumberOfDetachedByUserParts
|
### NumberOfDetachedByUserParts
|
||||||
|
|
||||||
The total number of parts detached from MergeTree tables by users with the `ALTER TABLE DETACH` query (as opposed to unexpected, broken or ignored parts). The server does not care about detached parts and they can be removed.
|
The total number of parts detached from MergeTree tables by users with the `ALTER TABLE DETACH` query (as opposed to unexpected, broken or ignored parts). The server does not care about detached parts, and they can be removed.
|
||||||
|
|
||||||
### NumberOfDetachedParts
|
### NumberOfDetachedParts
|
||||||
|
|
||||||
The total number of parts detached from MergeTree tables. A part can be detached by a user with the `ALTER TABLE DETACH` query or by the server itself it the part is broken, unexpected or unneeded. The server does not care about detached parts and they can be removed.
|
The total number of parts detached from MergeTree tables. A part can be detached by a user with the `ALTER TABLE DETACH` query or by the server itself it the part is broken, unexpected or unneeded. The server does not care about detached parts, and they can be removed.
|
||||||
|
|
||||||
### NumberOfTables
|
### NumberOfTables
|
||||||
|
|
||||||
@ -393,7 +393,7 @@ The amount of free memory plus OS page cache memory on the host system, in bytes
|
|||||||
|
|
||||||
### OSMemoryFreeWithoutCached
|
### OSMemoryFreeWithoutCached
|
||||||
|
|
||||||
The amount of free memory on the host system, in bytes. This does not include the memory used by the OS page cache memory, in bytes. The page cache memory is also available for usage by programs, so the value of this metric can be confusing. See the `OSMemoryAvailable` metric instead. For convenience we also provide the `OSMemoryFreePlusCached` metric, that should be somewhat similar to OSMemoryAvailable. See also https://www.linuxatemyram.com/. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
|
The amount of free memory on the host system, in bytes. This does not include the memory used by the OS page cache memory, in bytes. The page cache memory is also available for usage by programs, so the value of this metric can be confusing. See the `OSMemoryAvailable` metric instead. For convenience, we also provide the `OSMemoryFreePlusCached` metric, that should be somewhat similar to OSMemoryAvailable. See also https://www.linuxatemyram.com/. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server.
|
||||||
|
|
||||||
### OSMemoryTotal
|
### OSMemoryTotal
|
||||||
|
|
||||||
@ -493,7 +493,7 @@ Number of threads in the server of the PostgreSQL compatibility protocol.
|
|||||||
|
|
||||||
### QueryCacheBytes
|
### QueryCacheBytes
|
||||||
|
|
||||||
Total size of the query cache cache in bytes.
|
Total size of the query cache in bytes.
|
||||||
|
|
||||||
### QueryCacheEntries
|
### QueryCacheEntries
|
||||||
|
|
||||||
@ -549,7 +549,7 @@ Total amount of bytes (compressed, including data and indices) stored in all tab
|
|||||||
|
|
||||||
### TotalPartsOfMergeTreeTables
|
### TotalPartsOfMergeTreeTables
|
||||||
|
|
||||||
Total amount of data parts in all tables of MergeTree family. Numbers larger than 10 000 will negatively affect the server startup time and it may indicate unreasonable choice of the partition key.
|
Total amount of data parts in all tables of MergeTree family. Numbers larger than 10 000 will negatively affect the server startup time, and it may indicate unreasonable choice of the partition key.
|
||||||
|
|
||||||
### TotalPrimaryKeyBytesInMemory
|
### TotalPrimaryKeyBytesInMemory
|
||||||
|
|
||||||
|
@ -19,7 +19,7 @@ Columns:
|
|||||||
- `default_database` ([String](../../sql-reference/data-types/string.md)) — The default database name.
|
- `default_database` ([String](../../sql-reference/data-types/string.md)) — The default database name.
|
||||||
- `errors_count` ([UInt32](../../sql-reference/data-types/int-uint.md)) — The number of times this host failed to reach replica.
|
- `errors_count` ([UInt32](../../sql-reference/data-types/int-uint.md)) — The number of times this host failed to reach replica.
|
||||||
- `slowdowns_count` ([UInt32](../../sql-reference/data-types/int-uint.md)) — The number of slowdowns that led to changing replica when establishing a connection with hedged requests.
|
- `slowdowns_count` ([UInt32](../../sql-reference/data-types/int-uint.md)) — The number of slowdowns that led to changing replica when establishing a connection with hedged requests.
|
||||||
- `estimated_recovery_time` ([UInt32](../../sql-reference/data-types/int-uint.md)) — Seconds remaining until the replica error count is zeroed and it is considered to be back to normal.
|
- `estimated_recovery_time` ([UInt32](../../sql-reference/data-types/int-uint.md)) — Seconds remaining until the replica error count is zeroed, and it is considered to be back to normal.
|
||||||
- `database_shard_name` ([String](../../sql-reference/data-types/string.md)) — The name of the `Replicated` database shard (for clusters that belong to a `Replicated` database).
|
- `database_shard_name` ([String](../../sql-reference/data-types/string.md)) — The name of the `Replicated` database shard (for clusters that belong to a `Replicated` database).
|
||||||
- `database_replica_name` ([String](../../sql-reference/data-types/string.md)) — The name of the `Replicated` database replica (for clusters that belong to a `Replicated` database).
|
- `database_replica_name` ([String](../../sql-reference/data-types/string.md)) — The name of the `Replicated` database replica (for clusters that belong to a `Replicated` database).
|
||||||
- `is_active` ([Nullable(UInt8)](../../sql-reference/data-types/int-uint.md)) — The status of the `Replicated` database replica (for clusters that belong to a `Replicated` database): 1 means "replica is online", 0 means "replica is offline", `NULL` means "unknown".
|
- `is_active` ([Nullable(UInt8)](../../sql-reference/data-types/int-uint.md)) — The status of the `Replicated` database replica (for clusters that belong to a `Replicated` database): 1 means "replica is online", 0 means "replica is offline", `NULL` means "unknown".
|
||||||
|
@ -18,7 +18,7 @@ Columns:
|
|||||||
- `LOADED_AND_RELOADING` — Dictionary is loaded successfully, and is being reloaded right now (frequent reasons: [SYSTEM RELOAD DICTIONARY](../../sql-reference/statements/system.md#query_language-system-reload-dictionary) query, timeout, dictionary config has changed).
|
- `LOADED_AND_RELOADING` — Dictionary is loaded successfully, and is being reloaded right now (frequent reasons: [SYSTEM RELOAD DICTIONARY](../../sql-reference/statements/system.md#query_language-system-reload-dictionary) query, timeout, dictionary config has changed).
|
||||||
- `FAILED_AND_RELOADING` — Could not load the dictionary as a result of an error and is loading now.
|
- `FAILED_AND_RELOADING` — Could not load the dictionary as a result of an error and is loading now.
|
||||||
- `origin` ([String](../../sql-reference/data-types/string.md)) — Path to the configuration file that describes the dictionary.
|
- `origin` ([String](../../sql-reference/data-types/string.md)) — Path to the configuration file that describes the dictionary.
|
||||||
- `type` ([String](../../sql-reference/data-types/string.md)) — Type of a dictionary allocation. [Storing Dictionaries in Memory](../../sql-reference/dictionaries/index.md#storig-dictionaries-in-memory).
|
- `type` ([String](../../sql-reference/data-types/string.md)) — Type of dictionary allocation. [Storing Dictionaries in Memory](../../sql-reference/dictionaries/index.md#storig-dictionaries-in-memory).
|
||||||
- `key.names` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Array of [key names](../../sql-reference/dictionaries/index.md#dictionary-key-and-fields#ext_dict_structure-key) provided by the dictionary.
|
- `key.names` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Array of [key names](../../sql-reference/dictionaries/index.md#dictionary-key-and-fields#ext_dict_structure-key) provided by the dictionary.
|
||||||
- `key.types` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Corresponding array of [key types](../../sql-reference/dictionaries/index.md#dictionary-key-and-fields#ext_dict_structure-key) provided by the dictionary.
|
- `key.types` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Corresponding array of [key types](../../sql-reference/dictionaries/index.md#dictionary-key-and-fields#ext_dict_structure-key) provided by the dictionary.
|
||||||
- `attribute.names` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Array of [attribute names](../../sql-reference/dictionaries/index.md#dictionary-key-and-fields#ext_dict_structure-attributes) provided by the dictionary.
|
- `attribute.names` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Array of [attribute names](../../sql-reference/dictionaries/index.md#dictionary-key-and-fields#ext_dict_structure-attributes) provided by the dictionary.
|
||||||
|
@ -34,7 +34,7 @@ The binary you just downloaded can run all sorts of ClickHouse tools and utiliti
|
|||||||
|
|
||||||
A common use of `clickhouse-local` is to run ad-hoc queries on files: where you don't have to insert the data into a table. `clickhouse-local` can stream the data from a file into a temporary table and execute your SQL.
|
A common use of `clickhouse-local` is to run ad-hoc queries on files: where you don't have to insert the data into a table. `clickhouse-local` can stream the data from a file into a temporary table and execute your SQL.
|
||||||
|
|
||||||
If the file is sitting on the same machine as `clickhouse-local`, you can simple specify the file to load. The following `reviews.tsv` file contains a sampling of Amazon product reviews:
|
If the file is sitting on the same machine as `clickhouse-local`, you can simply specify the file to load. The following `reviews.tsv` file contains a sampling of Amazon product reviews:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
./clickhouse local -q "SELECT * FROM 'reviews.tsv'"
|
./clickhouse local -q "SELECT * FROM 'reviews.tsv'"
|
||||||
@ -220,7 +220,7 @@ Arguments:
|
|||||||
- `--help` — arguments references for `clickhouse-local`.
|
- `--help` — arguments references for `clickhouse-local`.
|
||||||
- `-V`, `--version` — print version information and exit.
|
- `-V`, `--version` — print version information and exit.
|
||||||
|
|
||||||
Also there are arguments for each ClickHouse configuration variable which are more commonly used instead of `--config-file`.
|
Also, there are arguments for each ClickHouse configuration variable which are more commonly used instead of `--config-file`.
|
||||||
|
|
||||||
|
|
||||||
## Examples {#examples}
|
## Examples {#examples}
|
||||||
|
@ -38,7 +38,7 @@ For example, you have a column `IsMobile` in your table with values 0 and 1. In
|
|||||||
|
|
||||||
So, the user will be able to count the exact ratio of mobile traffic.
|
So, the user will be able to count the exact ratio of mobile traffic.
|
||||||
|
|
||||||
Let's give another example. When you have some private data in your table, like user email and you don't want to publish any single email address.
|
Let's give another example. When you have some private data in your table, like user email, and you don't want to publish any single email address.
|
||||||
If your table is large enough and contains multiple different emails and no email has a very high frequency than all others, it will anonymize all data. But if you have a small number of different values in a column, it can reproduce some of them.
|
If your table is large enough and contains multiple different emails and no email has a very high frequency than all others, it will anonymize all data. But if you have a small number of different values in a column, it can reproduce some of them.
|
||||||
You should look at the working algorithm of this tool works, and fine-tune its command line parameters.
|
You should look at the working algorithm of this tool works, and fine-tune its command line parameters.
|
||||||
|
|
||||||
|
@ -9,7 +9,7 @@ Selects the first encountered value of a column.
|
|||||||
|
|
||||||
By default, it ignores NULL values and returns the first NOT NULL value found in the column. As [`first_value`](../../../sql-reference/aggregate-functions/reference/first_value.md) if supports `RESPECT NULLS`, in which case it will select the first value passed, independently on whether it's NULL or not.
|
By default, it ignores NULL values and returns the first NOT NULL value found in the column. As [`first_value`](../../../sql-reference/aggregate-functions/reference/first_value.md) if supports `RESPECT NULLS`, in which case it will select the first value passed, independently on whether it's NULL or not.
|
||||||
|
|
||||||
The return type of the function is the same as the input, except for LowCardinality which is discarded). This means that given no rows as input it will return the default value of that type (0 for integers, or Null for a Nullable() column). You might use the `-OrNull` [combinator](../../../sql-reference/aggregate-functions/combinators.md) ) to modify this behaviour.
|
The return type of the function is the same as the input, except for LowCardinality which is discarded. This means that given no rows as input it will return the default value of that type (0 for integers, or Null for a Nullable() column). You might use the `-OrNull` [combinator](../../../sql-reference/aggregate-functions/combinators.md) ) to modify this behaviour.
|
||||||
|
|
||||||
The query can be executed in any order and even in a different order each time, so the result of this function is indeterminate.
|
The query can be executed in any order and even in a different order each time, so the result of this function is indeterminate.
|
||||||
To get a determinate result, you can use the ‘min’ or ‘max’ function instead of ‘any’.
|
To get a determinate result, you can use the ‘min’ or ‘max’ function instead of ‘any’.
|
||||||
|
@ -20,7 +20,7 @@ contingency(column1, column2)
|
|||||||
|
|
||||||
**Returned value**
|
**Returned value**
|
||||||
|
|
||||||
- a value between 0 to 1. The larger the result, the closer the association of the two columns.
|
- a value between 0 and 1. The larger the result, the closer the association of the two columns.
|
||||||
|
|
||||||
**Return type** is always [Float64](../../../sql-reference/data-types/float.md).
|
**Return type** is always [Float64](../../../sql-reference/data-types/float.md).
|
||||||
|
|
||||||
@ -48,4 +48,4 @@ Result:
|
|||||||
┌──────cramersV(a, b)─┬───contingency(a, b)─┐
|
┌──────cramersV(a, b)─┬───contingency(a, b)─┐
|
||||||
│ 0.41171788506213564 │ 0.05812725261759165 │
|
│ 0.41171788506213564 │ 0.05812725261759165 │
|
||||||
└─────────────────────┴─────────────────────┘
|
└─────────────────────┴─────────────────────┘
|
||||||
```
|
```
|
||||||
|
@ -9,7 +9,7 @@ sidebar_label: DateTime64
|
|||||||
Allows to store an instant in time, that can be expressed as a calendar date and a time of a day, with defined sub-second precision
|
Allows to store an instant in time, that can be expressed as a calendar date and a time of a day, with defined sub-second precision
|
||||||
|
|
||||||
Tick size (precision): 10<sup>-precision</sup> seconds. Valid range: [ 0 : 9 ].
|
Tick size (precision): 10<sup>-precision</sup> seconds. Valid range: [ 0 : 9 ].
|
||||||
Typically are used - 3 (milliseconds), 6 (microseconds), 9 (nanoseconds).
|
Typically, are used - 3 (milliseconds), 6 (microseconds), 9 (nanoseconds).
|
||||||
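The tick size rule above is simple arithmetic, sketched here for illustration. The helper `tick_size` is a hypothetical name; it uses exact fractions to avoid floating-point rounding.

```python
from fractions import Fraction


def tick_size(precision: int) -> Fraction:
    """Tick size in seconds for a given sub-second precision (10^-precision)."""
    if not 0 <= precision <= 9:
        raise ValueError("precision must be in the range [0 : 9]")
    return Fraction(1, 10 ** precision)


for p in (3, 6, 9):
    print(p, tick_size(p))
```

Precision 3 gives a tick of one millisecond, 6 one microsecond, and 9 one nanosecond.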
|
|
||||||
**Syntax:**
|
**Syntax:**
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@ Signed fixed-point numbers that keep precision during add, subtract and multiply
|
|||||||
|
|
||||||
## Parameters
|
## Parameters
|
||||||
|
|
||||||
- P - precision. Valid range: \[ 1 : 76 \]. Determines how many decimal digits number can have (including fraction). By default the precision is 10.
|
- P - precision. Valid range: \[ 1 : 76 \]. Determines how many decimal digits number can have (including fraction). By default, the precision is 10.
|
||||||
- S - scale. Valid range: \[ 0 : P \]. Determines how many decimal digits fraction can have.
|
- S - scale. Valid range: \[ 0 : P \]. Determines how many decimal digits fraction can have.
|
||||||
|
|
||||||
Decimal(P) is equivalent to Decimal(P, 0). Similarly, the syntax Decimal is equivalent to Decimal(10, 0).
|
Decimal(P) is equivalent to Decimal(P, 0). Similarly, the syntax Decimal is equivalent to Decimal(10, 0).
|
||||||
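To make the relationship between `P` and `S` concrete, here is a minimal sketch that splits the total digit budget into integer and fractional digits. The function `decimal_layout` is hypothetical; the ranges and the defaults (`Decimal` meaning `Decimal(10, 0)`) are taken from the text above.

```python
def decimal_layout(p: int = 10, s: int = 0) -> tuple[int, int]:
    """Return (integer_digits, fraction_digits) for Decimal(P, S)."""
    if not 1 <= p <= 76:
        raise ValueError("P must be in the range [1 : 76]")
    if not 0 <= s <= p:
        raise ValueError("S must be in the range [0 : P]")
    # P counts all decimal digits including the fraction, so P - S remain for the integer part.
    return p - s, s


print(decimal_layout())       # Decimal
print(decimal_layout(9, 2))   # Decimal(9, 2)
```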
|
@ -6,7 +6,7 @@ sidebar_label: Distributed DDL
|
|||||||
|
|
||||||
# Distributed DDL Queries (ON CLUSTER Clause)
|
# Distributed DDL Queries (ON CLUSTER Clause)
|
||||||
|
|
||||||
By default the `CREATE`, `DROP`, `ALTER`, and `RENAME` queries affect only the current server where they are executed. In a cluster setup, it is possible to run such queries in a distributed manner with the `ON CLUSTER` clause.
|
By default, the `CREATE`, `DROP`, `ALTER`, and `RENAME` queries affect only the current server where they are executed. In a cluster setup, it is possible to run such queries in a distributed manner with the `ON CLUSTER` clause.
|
||||||
|
|
||||||
For example, the following query creates the `all_hits` `Distributed` table on each host in `cluster`:
|
For example, the following query creates the `all_hits` `Distributed` table on each host in `cluster`:
|
||||||
|
|
||||||
|
@ -372,7 +372,7 @@ Result:
|
|||||||
|
|
||||||
## bitmapAnd
|
## bitmapAnd
|
||||||
|
|
||||||
Computes the logical conjunction of two two bitmaps.
|
Computes the logical conjunction of two bitmaps.
|
||||||
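Conceptually, the conjunction of two bitmaps keeps only the values present in both. The sketch below models a bitmap as a Python set of integers, which is an assumption made purely for illustration and says nothing about how ClickHouse stores bitmaps; `bitmap_and` is a hypothetical helper, not the SQL function itself.

```python
def bitmap_and(a: set[int], b: set[int]) -> set[int]:
    """Model of a bitmap AND: values set in both inputs."""
    return a & b


print(sorted(bitmap_and({1, 2, 3}, {3, 4, 5})))
```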
|
|
||||||
**Syntax**
|
**Syntax**
|
||||||
|
|
||||||
|
@ -1564,7 +1564,7 @@ Alias: `TO_DAYS`
|
|||||||
**Arguments**
|
**Arguments**
|
||||||
|
|
||||||
- `date` — The date to calculate the number of days passed since year zero from. [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
|
- `date` — The date to calculate the number of days passed since year zero from. [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
|
||||||
- `time_zone` — A String type const value or a expression represent the time zone. [String types](../../sql-reference/data-types/string.md)
|
- `time_zone` — A String type const value or an expression represent the time zone. [String types](../../sql-reference/data-types/string.md)
|
||||||
|
|
||||||
**Returned value**
|
**Returned value**
|
||||||
|
|
||||||
@ -2218,7 +2218,7 @@ now64([scale], [timezone])
|
|||||||
|
|
||||||
**Arguments**
|
**Arguments**
|
||||||
|
|
||||||
- `scale` - Tick size (precision): 10<sup>-precision</sup> seconds. Valid range: [ 0 : 9 ]. Typically are used - 3 (default) (milliseconds), 6 (microseconds), 9 (nanoseconds).
|
- `scale` - Tick size (precision): 10<sup>-precision</sup> seconds. Valid range: [ 0 : 9 ]. Typically, are used - 3 (default) (milliseconds), 6 (microseconds), 9 (nanoseconds).
|
||||||
- `timezone` — [Timezone name](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-timezone) for the returned value (optional). [String](../../sql-reference/data-types/string.md).
|
- `timezone` — [Timezone name](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-timezone) for the returned value (optional). [String](../../sql-reference/data-types/string.md).
|
||||||
|
|
||||||
**Returned value**
|
**Returned value**
|
||||||
@ -2305,7 +2305,7 @@ Rounds the time to the half hour.
|
|||||||
|
|
||||||
Converts a date or date with time to a UInt32 number containing the year and month number (YYYY \* 100 + MM). Accepts a second optional timezone argument. If provided, the timezone must be a string constant.
|
Converts a date or date with time to a UInt32 number containing the year and month number (YYYY \* 100 + MM). Accepts a second optional timezone argument. If provided, the timezone must be a string constant.
|
||||||
|
|
||||||
This functions is the opposite of function `YYYYMMDDToDate()`.
|
This function is the opposite of function `YYYYMMDDToDate()`.
|
||||||
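The encoding `YYYY * 100 + MM` mentioned above can be reproduced with a few lines of arithmetic. This is a sketch of the formula only; the name `to_yyyymm` is hypothetical and the optional timezone argument is not modelled.

```python
from datetime import date


def to_yyyymm(d: date) -> int:
    """Encode year and month as a single number: YYYY * 100 + MM."""
    return d.year * 100 + d.month


print(to_yyyymm(date(2024, 2, 9)))
```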
|
|
||||||
**Example**
|
**Example**
|
||||||
|
|
||||||
@ -2362,7 +2362,7 @@ Result:
|
|||||||
|
|
||||||
Converts a number containing the year, month and day number to a [Date](../../sql-reference/data-types/date.md).
|
Converts a number containing the year, month and day number to a [Date](../../sql-reference/data-types/date.md).
|
||||||
|
|
||||||
This functions is the opposite of function `toYYYYMMDD()`.
|
This function is the opposite of function `toYYYYMMDD()`.
|
||||||
|
|
||||||
The output is undefined if the input does not encode a valid Date value.
|
The output is undefined if the input does not encode a valid Date value.
|
||||||
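The reverse direction, decoding a `YYYYMMDD` number, can be sketched as below. The helper `yyyymmdd_to_date` is hypothetical. Note one deliberate difference: the text says the ClickHouse output is undefined for invalid input, whereas this sketch returns `None` in that case, which is an assumption chosen to keep the example well defined.

```python
from datetime import date
from typing import Optional


def yyyymmdd_to_date(n: int) -> Optional[date]:
    """Decode a YYYYMMDD number into a date, or None if it does not encode a valid date."""
    year, rest = divmod(n, 10000)
    month, day = divmod(rest, 100)
    try:
        return date(year, month, day)
    except ValueError:
        # Assumption of this sketch only: invalid inputs map to None.
        return None


print(yyyymmdd_to_date(20230911))
```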
|
|
||||||
@ -2406,7 +2406,7 @@ Converts a number containing the year, month, day, hours, minute and second numb
|
|||||||
|
|
||||||
The output is undefined if the input does not encode a valid DateTime value.
|
The output is undefined if the input does not encode a valid DateTime value.
|
||||||
|
|
||||||
This functions is the opposite of function `toYYYYMMDDhhmmss()`.
|
This function is the opposite of function `toYYYYMMDDhhmmss()`.
|
||||||
|
|
||||||
**Syntax**
|
**Syntax**
|
||||||
|
|
||||||
@ -2981,8 +2981,8 @@ toUTCTimestamp(time_val, time_zone)
|
|||||||
|
|
||||||
**Arguments**
|
**Arguments**
|
||||||
|
|
||||||
- `time_val` — A DateTime/DateTime64 type const value or a expression . [DateTime/DateTime64 types](../../sql-reference/data-types/datetime.md)
|
- `time_val` — A DateTime/DateTime64 type const value or an expression . [DateTime/DateTime64 types](../../sql-reference/data-types/datetime.md)
|
||||||
- `time_zone` — A String type const value or a expression represent the time zone. [String types](../../sql-reference/data-types/string.md)
|
- `time_zone` — A String type const value or an expression represent the time zone. [String types](../../sql-reference/data-types/string.md)
|
||||||
|
|
||||||
**Returned value**
|
**Returned value**
|
||||||
|
|
||||||
@ -3014,8 +3014,8 @@ fromUTCTimestamp(time_val, time_zone)
|
|||||||
|
|
||||||
**Arguments**
|
**Arguments**
|
||||||
|
|
||||||
- `time_val` — A DateTime/DateTime64 type const value or a expression . [DateTime/DateTime64 types](../../sql-reference/data-types/datetime.md)
|
- `time_val` — A DateTime/DateTime64 type const value or an expression . [DateTime/DateTime64 types](../../sql-reference/data-types/datetime.md)
|
||||||
- `time_zone` — A String type const value or a expression represent the time zone. [String types](../../sql-reference/data-types/string.md)
|
- `time_zone` — A String type const value or an expression represent the time zone. [String types](../../sql-reference/data-types/string.md)
|
||||||
|
|
||||||
**Returned value**
|
**Returned value**
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@ sidebar_label: APPLY DELETED MASK
|
|||||||
ALTER TABLE [db].name [ON CLUSTER cluster] APPLY DELETED MASK [IN PARTITION partition_id]
|
ALTER TABLE [db].name [ON CLUSTER cluster] APPLY DELETED MASK [IN PARTITION partition_id]
|
||||||
```
|
```
|
||||||
|
|
||||||
The command applies mask created by [lightweight delete](/docs/en/sql-reference/statements/delete) and forcefully removes rows marked as deleted from disk. This command is a heavyweight mutation and it semantically equals to query ```ALTER TABLE [db].name DELETE WHERE _row_exists = 0```.
|
The command applies mask created by [lightweight delete](/docs/en/sql-reference/statements/delete) and forcefully removes rows marked as deleted from disk. This command is a heavyweight mutation, and it semantically equals to query ```ALTER TABLE [db].name DELETE WHERE _row_exists = 0```.
|
||||||
|
|
||||||
:::note
|
:::note
|
||||||
It only works for tables in the [`MergeTree`](../../../engines/table-engines/mergetree-family/mergetree.md) family (including [replicated](../../../engines/table-engines/mergetree-family/replication.md) tables).
|
It only works for tables in the [`MergeTree`](../../../engines/table-engines/mergetree-family/mergetree.md) family (including [replicated](../../../engines/table-engines/mergetree-family/replication.md) tables).
|
||||||
|
@ -15,7 +15,7 @@ ALTER TABLE [db].name [ON CLUSTER cluster] DROP CONSTRAINT constraint_name;
|
|||||||
|
|
||||||
See more on [constraints](../../../sql-reference/statements/create/table.md#constraints).
|
See more on [constraints](../../../sql-reference/statements/create/table.md#constraints).
|
||||||
|
|
||||||
Queries will add or remove metadata about constraints from table so they are processed immediately.
|
Queries will add or remove metadata about constraints from table, so they are processed immediately.
|
||||||
|
|
||||||
:::tip
|
:::tip
|
||||||
Constraint check **will not be executed** on existing data if it was added.
|
Constraint check **will not be executed** on existing data if it was added.
|
||||||
|
@ -16,13 +16,13 @@ DETACH TABLE|VIEW|DICTIONARY|DATABASE [IF EXISTS] [db.]name [ON CLUSTER cluster]
|
|||||||
Detaching does not delete the data or metadata of a table, a materialized view, a dictionary or a database. If an entity was not detached `PERMANENTLY`, on the next server launch the server will read the metadata and recall the table/view/dictionary/database again. If an entity was detached `PERMANENTLY`, there will be no automatic recall.
|
Detaching does not delete the data or metadata of a table, a materialized view, a dictionary or a database. If an entity was not detached `PERMANENTLY`, on the next server launch the server will read the metadata and recall the table/view/dictionary/database again. If an entity was detached `PERMANENTLY`, there will be no automatic recall.
|
||||||
|
|
||||||
Whether a table, a dictionary or a database was detached permanently or not, in both cases you can reattach them using the [ATTACH](../../sql-reference/statements/attach.md) query.
|
Whether a table, a dictionary or a database was detached permanently or not, in both cases you can reattach them using the [ATTACH](../../sql-reference/statements/attach.md) query.
|
||||||
System log tables can be also attached back (e.g. `query_log`, `text_log`, etc). Other system tables can't be reattached. On the next server launch the server will recall those tables again.
|
System log tables can be also attached back (e.g. `query_log`, `text_log`, etc.). Other system tables can't be reattached. On the next server launch the server will recall those tables again.
|
||||||
|
|
||||||
`ATTACH MATERIALIZED VIEW` does not work with short syntax (without `SELECT`), but you can attach it using the `ATTACH TABLE` query.
|
`ATTACH MATERIALIZED VIEW` does not work with short syntax (without `SELECT`), but you can attach it using the `ATTACH TABLE` query.
|
||||||
|
|
||||||
Note that you can not detach permanently the table which is already detached (temporary). But you can attach it back and then detach permanently again.
|
Note that you can not detach permanently the table which is already detached (temporary). But you can attach it back and then detach permanently again.
|
||||||
|
|
||||||
Also you can not [DROP](../../sql-reference/statements/drop.md#drop-table) the detached table, or [CREATE TABLE](../../sql-reference/statements/create/table.md) with the same name as detached permanently, or replace it with the other table with [RENAME TABLE](../../sql-reference/statements/rename.md) query.
|
Also, you can not [DROP](../../sql-reference/statements/drop.md#drop-table) the detached table, or [CREATE TABLE](../../sql-reference/statements/create/table.md) with the same name as detached permanently, or replace it with the other table with [RENAME TABLE](../../sql-reference/statements/rename.md) query.
|
||||||
|
|
||||||
The `SYNC` modifier executes the action without delay.
|
The `SYNC` modifier executes the action without delay.
|
||||||
|
|
||||||
|
@ -5,7 +5,7 @@ sidebar_label: DISTINCT
|
|||||||
|
|
||||||
# DISTINCT Clause
|
# DISTINCT Clause
|
||||||
|
|
||||||
If `SELECT DISTINCT` is specified, only unique rows will remain in a query result. Thus only a single row will remain out of all the sets of fully matching rows in the result.
|
If `SELECT DISTINCT` is specified, only unique rows will remain in a query result. Thus, only a single row will remain out of all the sets of fully matching rows in the result.
|
||||||
|
|
||||||
You can specify the list of columns that must have unique values: `SELECT DISTINCT ON (column1, column2,...)`. If the columns are not specified, all of them are taken into consideration.
|
You can specify the list of columns that must have unique values: `SELECT DISTINCT ON (column1, column2,...)`. If the columns are not specified, all of them are taken into consideration.
|
||||||
|
|
||||||
|
@ -63,7 +63,7 @@ ClickHouse — полноценная столбцовая СУБД. Данны
|
|||||||
|
|
||||||
Для байт-ориентированного ввода-вывода существуют абстрактные классы `ReadBuffer` и `WriteBuffer`. Они используются вместо `iostream`. Не волнуйтесь: каждый зрелый проект C++ использует что-то другое вместо `iostream` по уважительным причинам.
|
Для байт-ориентированного ввода-вывода существуют абстрактные классы `ReadBuffer` и `WriteBuffer`. Они используются вместо `iostream`. Не волнуйтесь: каждый зрелый проект C++ использует что-то другое вместо `iostream` по уважительным причинам.
|
||||||
|
|
||||||
`ReadBuffer` и `WriteBuffer` — это просто непрерывный буфер и курсор, указывающий на позицию в этом буфере. Реализации могут как владеть так и не владеть памятью буфера. Существует виртуальный метод заполнения буфера следующими данными (для `ReadBuffer`) или сброса буфера куда-нибудь (например `WriteBuffer`). Виртуальные методы редко вызываются.
|
`ReadBuffer` и `WriteBuffer` — это просто непрерывный буфер и курсор, указывающий на позицию в этом буфере. Реализации могут как владеть, так и не владеть памятью буфера. Существует виртуальный метод заполнения буфера следующими данными (для `ReadBuffer`) или сброса буфера куда-нибудь (например `WriteBuffer`). Виртуальные методы редко вызываются.
|
||||||
|
|
||||||
Реализации `ReadBuffer`/`WriteBuffer` используются для работы с файлами и файловыми дескрипторами, а также сетевыми сокетами, для реализации сжатия (`CompressedWriteBuffer` инициализируется вместе с другим `WriteBuffer` и осуществляет сжатие данных перед записью в него), и для других целей – названия `ConcatReadBuffer`, `LimitReadBuffer`, и `HashingWriteBuffer` говорят сами за себя.
|
Реализации `ReadBuffer`/`WriteBuffer` используются для работы с файлами и файловыми дескрипторами, а также сетевыми сокетами, для реализации сжатия (`CompressedWriteBuffer` инициализируется вместе с другим `WriteBuffer` и осуществляет сжатие данных перед записью в него), и для других целей – названия `ConcatReadBuffer`, `LimitReadBuffer`, и `HashingWriteBuffer` говорят сами за себя.
|
||||||
|
|
||||||
|
@ -71,7 +71,7 @@ ClickHouse не работает и не собирается на 32-битны
|
|||||||
Please make sure you have the correct access rights
|
Please make sure you have the correct access rights
|
||||||
and the repository exists.
|
and the repository exists.
|
||||||
|
|
||||||
Как правило это означает, что отсутствуют ssh ключи для соединения с GitHub. Ключи расположены в директории `~/.ssh`. В интерфейсе GitHub, в настройках, необходимо загрузить публичные ключи, чтобы он их понимал.
|
Как правило, это означает, что отсутствуют ssh ключи для соединения с GitHub. Ключи расположены в директории `~/.ssh`. В интерфейсе GitHub, в настройках, необходимо загрузить публичные ключи, чтобы он их понимал.
|
||||||
|
|
||||||
Вы также можете клонировать репозиторий по протоколу https:
|
Вы также можете клонировать репозиторий по протоколу https:
|
||||||
|
|
||||||
@ -199,7 +199,7 @@ sudo bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)"
|
|||||||
|
|
||||||
В случае успешного запуска, вы увидите прогресс сборки - количество обработанных задач и общее количество задач.
|
В случае успешного запуска, вы увидите прогресс сборки - количество обработанных задач и общее количество задач.
|
||||||
|
|
||||||
В процессе сборки могут появится сообщения `libprotobuf WARNING` про protobuf файлы в библиотеке libhdfs2. Это не имеет значения.
|
В процессе сборки могут появиться сообщения `libprotobuf WARNING` про protobuf файлы в библиотеке libhdfs2. Это не имеет значения.
|
||||||
|
|
||||||
При успешной сборке, вы получите готовый исполняемый файл `ClickHouse/build/programs/clickhouse`:
|
При успешной сборке, вы получите готовый исполняемый файл `ClickHouse/build/programs/clickhouse`:
|
||||||
|
|
||||||
@ -207,7 +207,7 @@ sudo bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)"
|
|||||||
|
|
||||||
## Запуск собранной версии ClickHouse {#zapusk-sobrannoi-versii-clickhouse}
|
## Запуск собранной версии ClickHouse {#zapusk-sobrannoi-versii-clickhouse}
|
||||||
|
|
||||||
Для запуска сервера из под текущего пользователя, с выводом логов в терминал и с использованием примеров конфигурационных файлов, расположенных в исходниках, перейдите в директорию `ClickHouse/programs/server/` (эта директория находится не в директории build) и выполните:
|
Для запуска сервера из-под текущего пользователя, с выводом логов в терминал и с использованием примеров конфигурационных файлов, расположенных в исходниках, перейдите в директорию `ClickHouse/programs/server/` (эта директория находится не в директории build) и выполните:
|
||||||
|
|
||||||
../../build/programs/clickhouse server
|
../../build/programs/clickhouse server
|
||||||
|
|
||||||
|
@ -37,7 +37,7 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
|
|||||||
|
|
||||||
**Секции запроса**
|
**Секции запроса**
|
||||||
|
|
||||||
При создании таблицы с движком `CollapsingMergeTree` используются те же [секции запроса](mergetree.md#table_engine-mergetree-creating-a-table) что и при создании таблицы с движком `MergeTree`.
|
При создании таблицы с движком `CollapsingMergeTree` используются те же [секции запроса](mergetree.md#table_engine-mergetree-creating-a-table), что и при создании таблицы с движком `MergeTree`.
|
||||||
|
|
||||||
<details markdown="1">
|
<details markdown="1">
|
||||||
|
|
||||||
|
@ -42,7 +42,7 @@ CREATE TABLE merge.hits_buffer AS merge.hits ENGINE = Buffer(merge, hits, 16, 10
|
|||||||
В качестве имени базы данных и имени таблицы можно указать пустые строки в одинарных кавычках. Это обозначает отсутствие таблицы назначения. В таком случае, при достижении условий на сброс данных, буфер будет просто очищаться. Это может быть полезным, чтобы хранить в оперативке некоторое окно данных.
|
В качестве имени базы данных и имени таблицы можно указать пустые строки в одинарных кавычках. Это обозначает отсутствие таблицы назначения. В таком случае, при достижении условий на сброс данных, буфер будет просто очищаться. Это может быть полезным, чтобы хранить в оперативке некоторое окно данных.
|
||||||
|
|
||||||
При чтении из таблицы типа Buffer, будут обработаны данные, как находящиеся в буфере, так и данные из таблицы назначения (если такая есть).
|
При чтении из таблицы типа Buffer, будут обработаны данные, как находящиеся в буфере, так и данные из таблицы назначения (если такая есть).
|
||||||
Но следует иметь ввиду, что таблица Buffer не поддерживает индекс. То есть, данные в буфере будут просканированы полностью, что может быть медленно для буферов большого размера. (Для данных в подчинённой таблице, будет использоваться тот индекс, который она поддерживает.)
|
Но следует иметь в виду, что таблица Buffer не поддерживает индекс. То есть, данные в буфере будут просканированы полностью, что может быть медленно для буферов большого размера. (Для данных в подчинённой таблице, будет использоваться тот индекс, который она поддерживает.)
|
||||||
|
|
||||||
Если множество столбцов таблицы Buffer не совпадает с множеством столбцов подчинённой таблицы, то будут вставлено подмножество столбцов, которое присутствует в обеих таблицах.
|
Если множество столбцов таблицы Buffer не совпадает с множеством столбцов подчинённой таблицы, то будут вставлено подмножество столбцов, которое присутствует в обеих таблицах.
|
||||||
|
|
||||||
@ -66,4 +66,4 @@ CREATE TABLE merge.hits_buffer AS merge.hits ENGINE = Buffer(merge, hits, 16, 10
|
|||||||
|
|
||||||
Таблицы типа Buffer используются в тех случаях, когда от большого количества серверов поступает слишком много INSERT-ов в единицу времени, и нет возможности заранее самостоятельно буферизовать данные перед вставкой, в результате чего, INSERT-ы не успевают выполняться.
|
Таблицы типа Buffer используются в тех случаях, когда от большого количества серверов поступает слишком много INSERT-ов в единицу времени, и нет возможности заранее самостоятельно буферизовать данные перед вставкой, в результате чего, INSERT-ы не успевают выполняться.
|
||||||
|
|
||||||
Заметим, что даже для таблиц типа Buffer не имеет смысла вставлять данные по одной строке, так как таким образом будет достигнута скорость всего лишь в несколько тысяч строк в секунду, тогда как при вставке более крупными блоками, достижимо более миллиона строк в секунду (смотрите раздел [«Производительность»](../../../introduction/performance.md).
|
Заметим, что даже для таблиц типа Buffer не имеет смысла вставлять данные по одной строке, так как таким образом будет достигнута скорость всего лишь в несколько тысяч строк в секунду, тогда как при вставке более крупными блоками, достижимо более миллиона строк в секунду (смотрите раздел [«Производительность»](../../../introduction/performance.md)).
|
||||||
|
@ -177,11 +177,11 @@ URI позволяет подключаться к нескольким хост
|
|||||||
|
|
||||||
|
|
||||||
|
|
||||||
Строка подключения должна быть указана в первом аргументе clickhouse-client. Строка подключения может комбинироваться с другими [параметрами командной строки] (#command-line-options) кроме `--host/-h` и `--port`.
|
Строка подключения должна быть указана в первом аргументе clickhouse-client. Строка подключения может комбинироваться с другими [параметрами командной строки](#command-line-options) кроме `--host/-h` и `--port`.
|
||||||
|
|
||||||
Для компонента `query_parameter` разрешены следующие ключи:
|
Для компонента `query_parameter` разрешены следующие ключи:
|
||||||
|
|
||||||
- `secure` или сокращенно `s` - без значение. Если параметр указан, то соединение с сервером будет осуществляться по защищенному каналу (TLS). См. `secure` в [command-line-options](#command-line-options).
|
- `secure` или сокращенно `s` - без значения. Если параметр указан, то соединение с сервером будет осуществляться по защищенному каналу (TLS). См. `secure` в [command-line-options](#command-line-options).
|
||||||
|
|
||||||
### Кодирование URI {#connection_string_uri_percent_encoding}
|
### Кодирование URI {#connection_string_uri_percent_encoding}
|
||||||
|
|
||||||
@ -206,7 +206,7 @@ clickhouse-client clickhouse://john:secret@127.0.0.1:9000
|
|||||||
clickhouse-client clickhouse://[::1]:9000
|
clickhouse-client clickhouse://[::1]:9000
|
||||||
```
|
```
|
||||||
|
|
||||||
Подключиться к localhost через порт 9000 многострочном режиме.
|
Подключиться к localhost через порт 9000 в многострочном режиме.
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
clickhouse-client clickhouse://localhost:9000 '-m'
|
clickhouse-client clickhouse://localhost:9000 '-m'
|
||||||
|
@ -69,7 +69,7 @@ ClickHouse Keeper может использоваться как равноце
|
|||||||
|
|
||||||
|
|
||||||
:::note
|
:::note
|
||||||
В случае изменения топологии кластера ClickHouse Keeper(например, замены сервера), удостоверьтесь, что вы сохраняеете отношение `server_id` - `hostname`, не переиспользуете существующие `server_id` для для новых серверов и не перемешиваете идентификаторы. Подобные ошибки могут случаться, если вы используете автоматизацию при разворачивании кластера без логики сохранения идентификаторов.
|
В случае изменения топологии кластера ClickHouse Keeper(например, замены сервера), удостоверьтесь, что вы сохраняеете отношение `server_id` - `hostname`, не переиспользуете существующие `server_id` для новых серверов и не перемешиваете идентификаторы. Подобные ошибки могут случаться, если вы используете автоматизацию при разворачивании кластера без логики сохранения идентификаторов.
|
||||||
:::
|
:::
|
||||||
|
|
||||||
Примеры конфигурации кворума с тремя узлами можно найти в [интеграционных тестах](https://github.com/ClickHouse/ClickHouse/tree/master/tests/integration) с префиксом `test_keeper_`. Пример конфигурации для сервера №1:
|
Примеры конфигурации кворума с тремя узлами можно найти в [интеграционных тестах](https://github.com/ClickHouse/ClickHouse/tree/master/tests/integration) с префиксом `test_keeper_`. Пример конфигурации для сервера №1:
|
||||||
@ -337,7 +337,7 @@ clickhouse-keeper-converter --zookeeper-logs-dir /var/lib/zookeeper/version-2 --
|
|||||||
|
|
||||||
После того, как выполнили действия выше выполните следующие шаги.
|
После того, как выполнили действия выше выполните следующие шаги.
|
||||||
1. Выберете одну ноду Keeper, которая станет новым лидером. Учтите, что данные с этой ноды будут использованы всем кластером, поэтому рекомендуется выбрать ноду с наиболее актуальным состоянием.
|
1. Выберете одну ноду Keeper, которая станет новым лидером. Учтите, что данные с этой ноды будут использованы всем кластером, поэтому рекомендуется выбрать ноду с наиболее актуальным состоянием.
|
||||||
2. Перед дальнейшими действиям сделайте резервную копию данных из директорий `log_storage_path` и `snapshot_storage_path`.
|
2. Перед дальнейшими действиями сделайте резервную копию данных из директорий `log_storage_path` и `snapshot_storage_path`.
|
||||||
3. Измените настройки на всех нодах кластера, которые вы собираетесь использовать.
|
3. Измените настройки на всех нодах кластера, которые вы собираетесь использовать.
|
||||||
4. Отправьте команду `rcvr` на ноду, которую вы выбрали, или остановите ее и запустите заново с аргументом `--force-recovery`. Это переведет ноду в режим восстановления.
|
4. Отправьте команду `rcvr` на ноду, которую вы выбрали, или остановите ее и запустите заново с аргументом `--force-recovery`. Это переведет ноду в режим восстановления.
|
||||||
5. Запускайте остальные ноды кластера по одной и проверяйте, что команда `mntr` возвращает `follower` в выводе состояния `zk_server_state` перед тем, как запустить следующую ноду.
|
5. Запускайте остальные ноды кластера по одной и проверяйте, что команда `mntr` возвращает `follower` в выводе состояния `zk_server_state` перед тем, как запустить следующую ноду.
|
||||||
|
@ -89,7 +89,7 @@ $ cat /etc/clickhouse-server/users.d/alice.xml
|
|||||||
|
|
||||||
Вы можете использовать симметричное шифрование для зашифровки элемента конфигурации, например, поля password. Чтобы это сделать, сначала настройте [кодек шифрования](../sql-reference/statements/create/table.md#encryption-codecs), затем добавьте аттибут`encrypted_by` с именем кодека шифрования как значение к элементу, который надо зашифровать.
|
Вы можете использовать симметричное шифрование для зашифровки элемента конфигурации, например, поля password. Чтобы это сделать, сначала настройте [кодек шифрования](../sql-reference/statements/create/table.md#encryption-codecs), затем добавьте аттибут`encrypted_by` с именем кодека шифрования как значение к элементу, который надо зашифровать.
|
||||||
|
|
||||||
В отличии от аттрибутов `from_zk`, `from_env` и `incl` (или элемента `include`), подстановка, т.е. расшифровка зашифрованного значения, не выподняется в файле предобработки. Расшифровка происходит только во время исполнения в серверном процессе.
|
В отличие от аттрибутов `from_zk`, `from_env` и `incl` (или элемента `include`), подстановка, т.е. расшифровка зашифрованного значения, не выподняется в файле предобработки. Расшифровка происходит только во время исполнения в серверном процессе.
|
||||||
|
|
||||||
Пример:
|
Пример:
|
||||||
|
|
||||||
@ -110,7 +110,7 @@ $ cat /etc/clickhouse-server/users.d/alice.xml
|
|||||||
</clickhouse>
|
</clickhouse>
|
||||||
```
|
```
|
||||||
|
|
||||||
Чтобы получить зашифрованное значение может быть использовано приложение-пример `encrypt_decrypt` .
|
Чтобы получить зашифрованное значение, может быть использовано приложение-пример `encrypt_decrypt` .
|
||||||
|
|
||||||
Пример:
|
Пример:
|
||||||
|
|
||||||
|
@ -50,7 +50,7 @@ clickhouse-benchmark [keys] < queries_file;
|
|||||||
- `-r`, `--randomize` — использовать случайный порядок выполнения запросов при наличии более одного входного запроса.
|
- `-r`, `--randomize` — использовать случайный порядок выполнения запросов при наличии более одного входного запроса.
|
||||||
- `-s`, `--secure` — используется `TLS` соединение.
|
- `-s`, `--secure` — используется `TLS` соединение.
|
||||||
- `-t N`, `--timelimit=N` — лимит по времени в секундах. `clickhouse-benchmark` перестает отправлять запросы при достижении лимита по времени. Значение по умолчанию: 0 (лимит отключен).
|
- `-t N`, `--timelimit=N` — лимит по времени в секундах. `clickhouse-benchmark` перестает отправлять запросы при достижении лимита по времени. Значение по умолчанию: 0 (лимит отключен).
|
||||||
- `--confidence=N` — уровень доверия для T-критерия. Возможные значения: 0 (80%), 1 (90%), 2 (95%), 3 (98%), 4 (99%), 5 (99.5%). Значение по умолчанию: 5. В [режиме сравнения](#clickhouse-benchmark-comparison-mode) `clickhouse-benchmark` проверяет [двухвыборочный t-критерий Стьюдента для независимых выборок](https://en.wikipedia.org/wiki/Student%27s_t-test#Independent_two-sample_t-test) чтобы определить, различны ли две выборки при выбранном уровне доверия.
|
- `--confidence=N` — уровень доверия для T-критерия. Возможные значения: 0 (80%), 1 (90%), 2 (95%), 3 (98%), 4 (99%), 5 (99.5%). Значение по умолчанию: 5. В [режиме сравнения](#clickhouse-benchmark-comparison-mode) `clickhouse-benchmark` проверяет [двухвыборочный t-критерий Стьюдента для независимых выборок](https://en.wikipedia.org/wiki/Student%27s_t-test#Independent_two-sample_t-test), чтобы определить, различны ли две выборки при выбранном уровне доверия.
|
||||||
- `--cumulative` — выводить статистику за все время работы, а не за последний временной интервал.
|
- `--cumulative` — выводить статистику за все время работы, а не за последний временной интервал.
|
||||||
- `--database=DATABASE_NAME` — имя базы данных ClickHouse. Значение по умолчанию: `default`.
|
- `--database=DATABASE_NAME` — имя базы данных ClickHouse. Значение по умолчанию: `default`.
|
||||||
- `--json=FILEPATH` — дополнительный вывод в формате `JSON`. Когда этот ключ указан, `clickhouse-benchmark` выводит отчет в указанный JSON-файл.
|
- `--json=FILEPATH` — дополнительный вывод в формате `JSON`. Когда этот ключ указан, `clickhouse-benchmark` выводит отчет в указанный JSON-файл.
|
||||||
|
@ -33,7 +33,7 @@ ClickHouse отображает значения в зависимости от
|
|||||||
|
|
||||||
## Примеры {#primery}
|
## Примеры {#primery}
|
||||||
|
|
||||||
**1.** Создание таблицы с столбцом типа `DateTime` и вставка данных в неё:
|
**1.** Создание таблицы со столбцом типа `DateTime` и вставка данных в неё:
|
||||||
|
|
||||||
``` sql
|
``` sql
|
||||||
CREATE TABLE dt
|
CREATE TABLE dt
|
||||||
|
@ -172,7 +172,7 @@ multiplyDecimal(a, b[, result_scale])
|
|||||||
```
|
```
|
||||||
|
|
||||||
:::note
|
:::note
|
||||||
Эта функция работают гораздо медленнее обычной `multiply`.
|
Эта функция работает гораздо медленнее обычной `multiply`.
|
||||||
В случае, если нет необходимости иметь фиксированную точность и/или нужны быстрые вычисления, следует использовать [multiply](#multiply).
|
В случае, если нет необходимости иметь фиксированную точность и/или нужны быстрые вычисления, следует использовать [multiply](#multiply).
|
||||||
:::
|
:::
|
||||||
|
|
||||||
|
@ -488,7 +488,7 @@ arrayPushBack(array, single_value)
|
|||||||
**Аргументы**
|
**Аргументы**
|
||||||
|
|
||||||
- `array` – массив.
|
- `array` – массив.
|
||||||
- `single_value` – значение добавляемого элемента. В массив с числам можно добавить только числа, в массив со строками только строки. При добавлении чисел ClickHouse автоматически приводит тип `single_value` к типу данных массива. Подробнее о типах данных в ClickHouse читайте в разделе «[Типы данных](../../sql-reference/functions/array-functions.md#data_types)». Может быть равно `NULL`, в этом случае функция добавит элемент `NULL` в массив, а тип элементов массива преобразует в `Nullable`.
|
- `single_value` – значение добавляемого элемента. В массив с числами можно добавить только числа, в массив со строками только строки. При добавлении чисел ClickHouse автоматически приводит тип `single_value` к типу данных массива. Подробнее о типах данных в ClickHouse читайте в разделе «[Типы данных](../../sql-reference/functions/array-functions.md#data_types)». Может быть равно `NULL`, в этом случае функция добавит элемент `NULL` в массив, а тип элементов массива преобразует в `Nullable`.
|
||||||
|
|
||||||
**Пример**
|
**Пример**
|
||||||
|
|
||||||
@ -513,7 +513,7 @@ arrayPushFront(array, single_value)
|
|||||||
**Аргументы**
|
**Аргументы**
|
||||||
|
|
||||||
- `array` – массив.
|
- `array` – массив.
|
||||||
- `single_value` – значение добавляемого элемента. В массив с числам можно добавить только числа, в массив со строками только строки. При добавлении чисел ClickHouse автоматически приводит тип `single_value` к типу данных массива. Подробнее о типах данных в ClickHouse читайте в разделе «[Типы данных](../../sql-reference/functions/array-functions.md#data_types)». Может быть равно `NULL`, в этом случае функция добавит элемент `NULL` в массив, а тип элементов массива преобразует в `Nullable`.
|
- `single_value` – значение добавляемого элемента. В массив с числами можно добавить только числа, в массив со строками только строки. При добавлении чисел ClickHouse автоматически приводит тип `single_value` к типу данных массива. Подробнее о типах данных в ClickHouse читайте в разделе «[Типы данных](../../sql-reference/functions/array-functions.md#data_types)». Может быть равно `NULL`, в этом случае функция добавит элемент `NULL` в массив, а тип элементов массива преобразует в `Nullable`.
|
||||||
|
|
||||||
**Пример**
|
**Пример**
|
||||||
|
|
||||||
|
@ -92,7 +92,7 @@ ClickHouse поддерживает использование секций `DIS
|
|||||||
|
|
||||||
## Обработка NULL {#null-processing}
|
## Обработка NULL {#null-processing}
|
||||||
|
|
||||||
`DISTINCT` работает с [NULL](../../syntax.md#null-literal) как-будто `NULL` — обычное значение и `NULL==NULL`. Другими словами, в результате `DISTINCT`, различные комбинации с `NULL` встретятся только один раз. Это отличается от обработки `NULL` в большинстве других контекстов.
|
`DISTINCT` работает с [NULL](../../syntax.md#null-literal) как будто `NULL` — обычное значение и `NULL==NULL`. Другими словами, в результате `DISTINCT`, различные комбинации с `NULL` встретятся только один раз. Это отличается от обработки `NULL` в большинстве других контекстов.
|
||||||
|
|
||||||
## Альтернативы {#alternatives}
|
## Альтернативы {#alternatives}
|
||||||
|
|
||||||
|
@ -33,7 +33,7 @@ clusterAllReplicas('cluster_name', db, table[, sharding_key])
|
|||||||
|
|
||||||
**Использование макросов**
|
**Использование макросов**
|
||||||
|
|
||||||
`cluster_name` может содержать макрос — подстановку в фигурных скобках. Эта подстановка заменяется на соответствующее значение из секции [macros](../../operations/server-configuration-parameters/settings.md#macros) конфигурационного файла .
|
`cluster_name` может содержать макрос — подстановку в фигурных скобках. Эта подстановка заменяется на соответствующее значение из секции [macros](../../operations/server-configuration-parameters/settings.md#macros) конфигурационного файла.
|
||||||
|
|
||||||
Пример:
|
Пример:
|
||||||
|
|
||||||
|
@ -20,7 +20,7 @@ machine_translated_rev: 5decc73b5dc60054f19087d3690c4eb99446a6c3
|
|||||||
- `LOADED_AND_RELOADING` — Dictionary is loaded successfully, and is being reloaded right now (frequent reasons: [SYSTEM RELOAD DICTIONARY](../../sql-reference/statements/system.md#query_language-system-reload-dictionary) 查询,超时,字典配置已更改)。
|
- `LOADED_AND_RELOADING` — Dictionary is loaded successfully, and is being reloaded right now (frequent reasons: [SYSTEM RELOAD DICTIONARY](../../sql-reference/statements/system.md#query_language-system-reload-dictionary) 查询,超时,字典配置已更改)。
|
||||||
- `FAILED_AND_RELOADING` — Could not load the dictionary as a result of an error and is loading now.
|
- `FAILED_AND_RELOADING` — Could not load the dictionary as a result of an error and is loading now.
|
||||||
- `origin` ([字符串](../../sql-reference/data-types/string.md)) — Path to the configuration file that describes the dictionary.
|
- `origin` ([字符串](../../sql-reference/data-types/string.md)) — Path to the configuration file that describes the dictionary.
|
||||||
- `type` ([字符串](../../sql-reference/data-types/string.md)) — Type of a dictionary allocation. [在内存中存储字典](../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-layout.md).
|
- `type` ([字符串](../../sql-reference/data-types/string.md)) — Type of dictionary allocation. [在内存中存储字典](../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-layout.md).
|
||||||
- `key` — [密钥类型](../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-structure.md#ext_dict_structure-key):数字键 ([UInt64](../../sql-reference/data-types/int-uint.md#uint-ranges)) or Сomposite key ([字符串](../../sql-reference/data-types/string.md)) — form “(type 1, type 2, …, type n)”.
|
- `key` — [密钥类型](../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-structure.md#ext_dict_structure-key):数字键 ([UInt64](../../sql-reference/data-types/int-uint.md#uint-ranges)) or Сomposite key ([字符串](../../sql-reference/data-types/string.md)) — form “(type 1, type 2, …, type n)”.
|
||||||
- `attribute.names` ([阵列](../../sql-reference/data-types/array.md)([字符串](../../sql-reference/data-types/string.md))) — Array of [属性名称](../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-structure.md#ext_dict_structure-attributes) 由字典提供。
|
- `attribute.names` ([阵列](../../sql-reference/data-types/array.md)([字符串](../../sql-reference/data-types/string.md))) — Array of [属性名称](../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-structure.md#ext_dict_structure-attributes) 由字典提供。
|
||||||
- `attribute.types` ([阵列](../../sql-reference/data-types/array.md)([字符串](../../sql-reference/data-types/string.md))) — Corresponding array of [属性类型](../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-structure.md#ext_dict_structure-attributes) 这是由字典提供。
|
- `attribute.types` ([阵列](../../sql-reference/data-types/array.md)([字符串](../../sql-reference/data-types/string.md))) — Corresponding array of [属性类型](../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-structure.md#ext_dict_structure-attributes) 这是由字典提供。
|
||||||
|