# AggregatingMergeTree {#aggregatingmergetree}

The engine inherits from [MergeTree](mergetree.md) and changes the logic for merging data parts. Within a single data part, ClickHouse replaces all rows that share the same primary key (more precisely, the same [sorting key](../../../engines/table-engines/mergetree-family/mergetree.md)) with one row that stores a combination of aggregate function states.

You can use `AggregatingMergeTree` tables for incremental data aggregation, including for aggregated materialized views.

The engine processes all columns of the following types:

- [AggregateFunction](../../../sql-reference/data-types/aggregatefunction.md)
- [SimpleAggregateFunction](../../../sql-reference/data-types/simpleaggregatefunction.md)

`AggregatingMergeTree` is a good fit when collapsing rows that share a sorting key substantially reduces the number of rows.
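To make "aggregate function state" more concrete, the query below is a minimal sketch that builds a state with the `-State` combinator and inspects it. It is not part of the original page: it assumes only the built-in `numbers()` table function, and the column aliases `state_type` and `final_value` are illustrative.

``` sql
-- Illustrative sketch; needs no user tables.
SELECT
    -- Type of the intermediate state, i.e. what an AggregateFunction column stores.
    toTypeName(uniqState(number)) AS state_type,
    -- The same state converted into a final value.
    finalizeAggregation(uniqState(number)) AS final_value
FROM numbers(10);
```

An `AggregateFunction` column stores states like this one rather than final values, which is what lets the engine combine rows during merges.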
## Creating a Table {#jian-biao}

``` sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
    ...
) ENGINE = AggregatingMergeTree()
[PARTITION BY expr]
[ORDER BY expr]
[SAMPLE BY expr]
[TTL expr]
[SETTINGS name=value, ...]
```

For a description of the statement parameters, see the [CREATE TABLE query description](../../../sql-reference/statements/create.md#create-table-query).

**Query clauses**

When creating an `AggregatingMergeTree` table, the same [clauses](mergetree.md) are required as when creating a `MergeTree` table.

<details markdown="1">

<summary>Deprecated Method for Creating a Table</summary>

!!! attention "Attention"
    Do not use this method in new projects and, if possible, switch old projects to the method described above.

``` sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
    ...
) ENGINE [=] AggregatingMergeTree(date-column [, sampling_expression], (primary, key), index_granularity)
```

All of the parameters have the same meaning as in `MergeTree`.

</details>
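As a hedged illustration of the current (non-deprecated) syntax, a concrete definition might look like the sketch below. The table name `agg_example` and its columns are hypothetical and chosen only for this example. It declares one column of each aggregate type that the engine processes.

``` sql
-- Hypothetical table, for illustration only.
CREATE TABLE IF NOT EXISTS agg_example
(
    CounterID UInt32,
    StartDate Date,
    -- Stores a plain value and is combined with sum() during merges.
    Visits SimpleAggregateFunction(sum, UInt64),
    -- Stores an intermediate uniq() state.
    Users AggregateFunction(uniq, UInt64)
)
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(StartDate)
ORDER BY (CounterID, StartDate);
```

Rows that share the same `(CounterID, StartDate)` sorting key within a data part are the ones the engine collapses into a single row.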
## SELECT and INSERT {#select-he-insert}

To insert data, use an [INSERT SELECT](../../../sql-reference/statements/insert-into.md) query with aggregate `-State` functions.
When selecting data from an `AggregatingMergeTree` table, use a `GROUP BY` clause and the same aggregate functions as when inserting the data, but with the `-Merge` suffix.

In the results of a `SELECT` query, values of the `AggregateFunction` type have an implementation-specific binary representation in all ClickHouse output formats. If you dump data with a `SELECT` query, for example in the `TabSeparated` format, the dump can be loaded back with an `INSERT` query.
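The `-State` / `-Merge` pairing described above can be sketched end to end as follows. This is an assumption-laden example rather than part of the original page: the table `visits_agg`, the generated input from `numbers()`, and the aliases `TotalVisits` and `UniqueUsers` are all hypothetical.

``` sql
-- Hypothetical target table holding aggregate function states.
CREATE TABLE IF NOT EXISTS visits_agg
(
    StartDate Date,
    Visits AggregateFunction(sum, UInt64),
    Users AggregateFunction(uniq, UInt64)
)
ENGINE = AggregatingMergeTree()
ORDER BY StartDate;

-- Insert states, not final values: note the -State suffix.
INSERT INTO visits_agg
SELECT
    toDate('2020-01-01') + intDiv(number, 50) AS StartDate,
    sumState(toUInt64(1)) AS Visits,
    uniqState(number) AS Users
FROM numbers(100)
GROUP BY StartDate;

-- Read final values: same functions, -Merge suffix, plus GROUP BY.
SELECT
    StartDate,
    sumMerge(Visits) AS TotalVisits,
    uniqMerge(Users) AS UniqueUsers
FROM visits_agg
GROUP BY StartDate
ORDER BY StartDate;
```

The `GROUP BY` in the final query matters because merges happen in the background, so several not-yet-merged rows with the same sorting key may exist at query time.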
## Example of an Aggregated Materialized View {#ju-he-wu-hua-shi-tu-de-shi-li}

Create an `AggregatingMergeTree` materialized view that watches the `test.visits` table:

``` sql
CREATE MATERIALIZED VIEW test.basic
ENGINE = AggregatingMergeTree() PARTITION BY toYYYYMM(StartDate) ORDER BY (CounterID, StartDate)
AS SELECT
    CounterID,
    StartDate,
    sumState(Sign) AS Visits,
    uniqState(UserID) AS Users
FROM test.visits
GROUP BY CounterID, StartDate;
```

Insert data into the `test.visits` table.

``` sql
INSERT INTO test.visits ...
```

The data is inserted into both the table and the view `test.basic`, and the view performs the aggregation.

To get the aggregated data, run a query such as `SELECT ... GROUP BY ...` against the `test.basic` view:

``` sql
SELECT
    StartDate,
    sumMerge(Visits) AS Visits,
    uniqMerge(Users) AS Users
FROM test.basic
GROUP BY StartDate
ORDER BY StartDate;
```

[Original article](https://clickhouse.tech/docs/en/operations/table_engines/aggregatingmergetree/) <!--hide-->