Merge branch 'master' of github.com:ClickHouse/ClickHouse

This commit is contained in:
Sergei Shtykov 2019-11-26 17:46:14 +03:00
commit f9a83852ee
7 changed files with 106 additions and 6 deletions

View File

@ -5,13 +5,11 @@
### New Feature ### New Feature
* Add the ability to create dictionaries with DDL queries. [#7360](https://github.com/ClickHouse/ClickHouse/pull/7360) ([alesapin](https://github.com/alesapin)) * Add the ability to create dictionaries with DDL queries. [#7360](https://github.com/ClickHouse/ClickHouse/pull/7360) ([alesapin](https://github.com/alesapin))
* Authentication in S3 table function and storage. Now we have complete support for S3 import/export. [#7623](https://github.com/ClickHouse/ClickHouse/pull/7623) ([Vladimir Chebotarev](https://github.com/excitoon))
* Make `bloom_filter` type of index supporting `LowCardinality` and `Nullable` [#7363](https://github.com/ClickHouse/ClickHouse/issues/7363) [#7561](https://github.com/ClickHouse/ClickHouse/pull/7561) ([Nikolai Kochetov](https://github.com/KochetovNicolai)) * Make `bloom_filter` type of index supporting `LowCardinality` and `Nullable` [#7363](https://github.com/ClickHouse/ClickHouse/issues/7363) [#7561](https://github.com/ClickHouse/ClickHouse/pull/7561) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Add function `isValidJSON` to check that passed string is a valid json. [#5910](https://github.com/ClickHouse/ClickHouse/issues/5910) [#7293](https://github.com/ClickHouse/ClickHouse/pull/7293) ([Vdimir](https://github.com/Vdimir)) * Add function `isValidJSON` to check that passed string is a valid json. [#5910](https://github.com/ClickHouse/ClickHouse/issues/5910) [#7293](https://github.com/ClickHouse/ClickHouse/pull/7293) ([Vdimir](https://github.com/Vdimir))
* Implement `arrayCompact` function [#7328](https://github.com/ClickHouse/ClickHouse/pull/7328) ([Memo](https://github.com/Joeywzr)) * Implement `arrayCompact` function [#7328](https://github.com/ClickHouse/ClickHouse/pull/7328) ([Memo](https://github.com/Joeywzr))
* Created function `hex` for Decimal numbers. It works like `hex(reinterpretAsString())`, but doesn't delete last zero bytes. [#7355](https://github.com/ClickHouse/ClickHouse/pull/7355) ([Mikhail Korotov](https://github.com/millb)) * Created function `hex` for Decimal numbers. It works like `hex(reinterpretAsString())`, but doesn't delete last zero bytes. [#7355](https://github.com/ClickHouse/ClickHouse/pull/7355) ([Mikhail Korotov](https://github.com/millb))
* Add `arrayFill` and `arrayReverseFill` functions, which replace elements by other elements in front/back of them in the array. [#7380](https://github.com/ClickHouse/ClickHouse/pull/7380) ([hcz](https://github.com/hczhcz)) * Add `arrayFill` and `arrayReverseFill` functions, which replace elements by other elements in front/back of them in the array. [#7380](https://github.com/ClickHouse/ClickHouse/pull/7380) ([hcz](https://github.com/hczhcz))
* Up precision of `avg` aggregate function result to max of `Decimal` type [#7446](https://github.com/ClickHouse/ClickHouse/pull/7446) ([Andrey Konyaev](https://github.com/akonyaev90))
* Add `CRC32IEEE()`/`CRC64()` support [#7480](https://github.com/ClickHouse/ClickHouse/pull/7480) ([Azat Khuzhin](https://github.com/azat)) * Add `CRC32IEEE()`/`CRC64()` support [#7480](https://github.com/ClickHouse/ClickHouse/pull/7480) ([Azat Khuzhin](https://github.com/azat))
* Implement `char` function similar to one in [mysql](https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_char) [#7486](https://github.com/ClickHouse/ClickHouse/pull/7486) ([sundyli](https://github.com/sundy-li)) * Implement `char` function similar to one in [mysql](https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_char) [#7486](https://github.com/ClickHouse/ClickHouse/pull/7486) ([sundyli](https://github.com/sundy-li))
* Add `bitmapTransform` function. It transforms an array of values in a bitmap to another array of values, the result is a new bitmap [#7598](https://github.com/ClickHouse/ClickHouse/pull/7598) ([Zhichang Yu](https://github.com/yuzhichang)) * Add `bitmapTransform` function. It transforms an array of values in a bitmap to another array of values, the result is a new bitmap [#7598](https://github.com/ClickHouse/ClickHouse/pull/7598) ([Zhichang Yu](https://github.com/yuzhichang))

View File

@ -261,7 +261,7 @@ std::string getExceptionMessage(const Exception & e, bool with_stacktrace, bool
stream << "Code: " << e.code() << ", e.displayText() = " << text; stream << "Code: " << e.code() << ", e.displayText() = " << text;
if (with_stacktrace && !has_embedded_stack_trace) if (with_stacktrace && !has_embedded_stack_trace)
stream << ", Stack trace:\n\n" << e.getStackTrace().toString(); stream << ", Stack trace (when copying this message, always include the lines below):\n\n" << e.getStackTrace().toString();
} }
catch (...) {} catch (...) {}

View File

@ -143,7 +143,7 @@ static void logException(Context & context, QueryLogElement & elem)
LOG_ERROR(&Logger::get("executeQuery"), elem.exception LOG_ERROR(&Logger::get("executeQuery"), elem.exception
<< " (from " << context.getClientInfo().current_address.toString() << ")" << " (from " << context.getClientInfo().current_address.toString() << ")"
<< " (in query: " << joinLines(elem.query) << ")" << " (in query: " << joinLines(elem.query) << ")"
<< (!elem.stack_trace.empty() ? ", Stack trace:\n\n" + elem.stack_trace : "")); << (!elem.stack_trace.empty() ? ", Stack trace (when copying this message, always include the lines below):\n\n" + elem.stack_trace : ""));
} }

View File

@ -3,5 +3,6 @@
138768 138768
-2143570108 -2143570108
2145564783 2145564783
1258255525
96354 96354
1470786104 1470786104

View File

@ -3,5 +3,6 @@ select javaHash('874293087');
select javaHashUTF16LE(convertCharset('a1가', 'utf-8', 'utf-16le')); select javaHashUTF16LE(convertCharset('a1가', 'utf-8', 'utf-16le'));
select javaHashUTF16LE(convertCharset('가나다라마바사아자차카타파하', 'utf-8', 'utf-16le')); select javaHashUTF16LE(convertCharset('가나다라마바사아자차카타파하', 'utf-8', 'utf-16le'));
select javaHashUTF16LE(convertCharset('FJKLDSJFIOLD_389159837589429', 'utf-8', 'utf-16le')); select javaHashUTF16LE(convertCharset('FJKLDSJFIOLD_389159837589429', 'utf-8', 'utf-16le'));
select javaHashUTF16LE(convertCharset('𐐀𐐁𐐂𐐃𐐄', 'utf-8', 'utf-16le'));
select hiveHash('abc'); select hiveHash('abc');
select hiveHash('874293087'); select hiveHash('874293087');

View File

@ -3,7 +3,7 @@
ClickHouse提供了两个网络接口两者都可以选择包装在TLS中以提高安全性 ClickHouse提供了两个网络接口两者都可以选择包装在TLS中以提高安全性
* [HTTP](http.md),记录在案,易于使用. * [HTTP](http.md),记录在案,易于使用.
* [本地TCP](tcp.md),这有较少的开销. * [本地TCP](tcp.md),这有较少的开销.
在大多数情况下,建议使用适当的工具或库,而不是直接与这些工具或库进行交互。 Yandex的官方支持如下 在大多数情况下,建议使用适当的工具或库,而不是直接与这些工具或库进行交互。 Yandex的官方支持如下
* [命令行客户端](cli.md) * [命令行客户端](cli.md)

View File

@ -70,8 +70,14 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
- `SETTINGS` — 影响 `MergeTree` 性能的额外参数: - `SETTINGS` — 影响 `MergeTree` 性能的额外参数:
- `index_granularity` — 索引粒度。即索引中相邻『标记』间的数据行数。默认值8192 。该列表中所有可用的参数可以从这里查看 [MergeTreeSettings.h](https://github.com/ClickHouse/ClickHouse/blob/master/dbms/src/Storages/MergeTree/MergeTreeSettings.h) 。 - `index_granularity` — 索引粒度。即索引中相邻『标记』间的数据行数。默认值8192 。该列表中所有可用的参数可以从这里查看 [MergeTreeSettings.h](https://github.com/ClickHouse/ClickHouse/blob/master/dbms/src/Storages/MergeTree/MergeTreeSettings.h) 。
- `index_granularity_bytes` — 索引粒度,以字节为单位,默认值: 10Mb。如果仅按数据行数限制索引粒度, 请设置为0(不建议)。
- `enable_mixed_granularity_parts` — 启用或禁用通过 `index_granularity_bytes` 控制索引粒度的大小。在19.11版本之前, 只有 `index_granularity` 配置能够用于限制索引粒度的大小。当从大表(数十或数百兆)中查询数据时候,`index_granularity_bytes` 配置能够提升ClickHouse的性能。如果你的表内数据量很大可以开启这项配置用以提升`SELECT` 查询的性能。
- `use_minimalistic_part_header_in_zookeeper` — 数据片段头在 ZooKeeper 中的存储方式。如果设置了 `use_minimalistic_part_header_in_zookeeper=1` ZooKeeper 会存储更少的数据。更多信息参考『服务配置参数』这章中的 [设置描述](../server_settings/settings.md#server-settings-use_minimalistic_part_header_in_zookeeper) 。 - `use_minimalistic_part_header_in_zookeeper` — 数据片段头在 ZooKeeper 中的存储方式。如果设置了 `use_minimalistic_part_header_in_zookeeper=1` ZooKeeper 会存储更少的数据。更多信息参考『服务配置参数』这章中的 [设置描述](../server_settings/settings.md#server-settings-use_minimalistic_part_header_in_zookeeper) 。
- `min_merge_bytes_to_use_direct_io` — 使用直接 I/O 来操作磁盘的合并操作时要求的最小数据量。合并数据片段时ClickHouse 会计算要被合并的所有数据的总存储空间。如果大小超过了 `min_merge_bytes_to_use_direct_io` 设置的字节数,则 ClickHouse 将使用直接 I/O 接口(`O_DIRECT` 选项)对磁盘读写。如果设置 `min_merge_bytes_to_use_direct_io = 0` ,则会禁用直接 I/O。默认值`10 * 1024 * 1024 * 1024` 字节。 - `min_merge_bytes_to_use_direct_io` — 使用直接 I/O 来操作磁盘的合并操作时要求的最小数据量。合并数据片段时ClickHouse 会计算要被合并的所有数据的总存储空间。如果大小超过了 `min_merge_bytes_to_use_direct_io` 设置的字节数,则 ClickHouse 将使用直接 I/O 接口(`O_DIRECT` 选项)对磁盘读写。如果设置 `min_merge_bytes_to_use_direct_io = 0` ,则会禁用直接 I/O。默认值`10 * 1024 * 1024 * 1024` 字节。
<a name="mergetree_setting-merge_with_ttl_timeout"></a>
- `merge_with_ttl_timeout` — TTL合并频率的最小间隔时间。默认值: 86400 (1 天)。
- `write_final_mark` — 启用或禁用在数据片段尾部写入最终索引标记。默认值: 1不建议更改
- `storage_policy` — 存储策略。 参见 [使用多个区块装置进行数据存储](#table_engine-mergetree-multiple-volumes).
**示例配置** **示例配置**
@ -115,7 +121,7 @@ MergeTree(EventDate, intHash32(UserID), (CounterID, EventDate, intHash32(UserID)
对于主要的配置方法,这里 `MergeTree` 引擎跟前面的例子一样,可以以同样的方式配置。 对于主要的配置方法,这里 `MergeTree` 引擎跟前面的例子一样,可以以同样的方式配置。
</details> </details>
## 数据存储 ## 数据存储 {#mergetree-data-storage}
表由按主键排序的数据 *片段* 组成。 表由按主键排序的数据 *片段* 组成。
@ -296,6 +302,100 @@ INDEX sample_index3 (lower(str), str) TYPE ngrambf_v1(3, 256, 2, 0) GRANULARITY
对表的读操作是自动并行的。 对表的读操作是自动并行的。
## 列和表的TTL {#table_engine-mergetree-ttl}
TTL可以设置值的生命周期它既可以为整张表设置也可以为每个列字段单独设置。如果`TTL`同时作用于表和字段ClickHouse会使用先到期的那个。
被设置TTL的表必须拥有[Date](../../data_types/date.md) 或 [DateTime](../../data_types/datetime.md) 类型的字段。要定义数据的生命周期,需要在这个日期字段上使用操作符,例如:
```sql
TTL time_column
TTL time_column + interval
```
要定义`interval`, 需要使用 [time interval](../../query_language/operators.md#operators-datetime) 操作符。
```sql
TTL date_time + INTERVAL 1 MONTH
TTL date_time + INTERVAL 15 HOUR
```
**列字段 TTL**
当列字段中的值过期时, ClickHouse会将它们替换成数据类型的默认值。如果分区内某一列的所有值均已过期则ClickHouse会从文件系统中删除这个分区目录下的列文件。
`TTL`子句不能被用于主键字段。
示例说明:
创建一张包含 `TTL` 的表
```sql
CREATE TABLE example_table
(
d DateTime,
a Int TTL d + INTERVAL 1 MONTH,
b Int TTL d + INTERVAL 1 MONTH,
c String
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(d)
ORDER BY d;
```
为表中已存在的列字段添加 `TTL`
```sql
ALTER TABLE example_table
MODIFY COLUMN
c String TTL d + INTERVAL 1 DAY;
```
修改列字段的 `TTL`
```sql
ALTER TABLE example_table
MODIFY COLUMN
c String TTL d + INTERVAL 1 MONTH;
```
**表 TTL**
当表内的数据过期时, ClickHouse会删除所有对应的行。
举例说明:
创建一张包含 `TTL` 的表
```sql
CREATE TABLE example_table
(
d DateTime,
a Int
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(d)
ORDER BY d
TTL d + INTERVAL 1 MONTH;
```
修改表的 `TTL`
```sql
ALTER TABLE example_table
MODIFY TTL d + INTERVAL 1 DAY;
```
**删除数据**
当ClickHouse合并数据分区时, 会删除TTL过期的数据。
当ClickHouse发现数据过期时, 它将会执行一个计划外的合并。要控制这类合并的频率, 你可以设置 [merge_with_ttl_timeout](#mergetree_setting-merge_with_ttl_timeout)。如果该值被设置的太低, 它将导致执行许多的计划外合并,这可能会消耗大量资源。
如果在合并的时候执行`SELECT` 查询, 则可能会得到过期的数据。为了避免这种情况,可以在`SELECT`之前使用 [OPTIMIZE](../../query_language/misc.md#misc_operations-optimize) 查询。
## Using Multiple Block Devices for Data Storage {#table_engine-mergetree-multiple-volumes} ## Using Multiple Block Devices for Data Storage {#table_engine-mergetree-multiple-volumes}
### Configuration {#table_engine-mergetree-multiple-volumes_configure} ### Configuration {#table_engine-mergetree-multiple-volumes_configure}