---
slug: /en/engines/table-engines/mergetree-family/aggregatingmergetree
sidebar_position: 60
sidebar_label: AggregatingMergeTree
---
# AggregatingMergeTree

The engine inherits from [MergeTree](../../../engines/table-engines/mergetree-family/mergetree.md#table_engines-mergetree), altering the logic for merging data parts. ClickHouse replaces all rows with the same primary key (or, more accurately, with the same [sorting key](../../../engines/table-engines/mergetree-family/mergetree.md)) with a single row (within a single data part) that stores a combination of states of aggregate functions.

You can use `AggregatingMergeTree` tables for incremental data aggregation, including for aggregated materialized views.

The engine processes all columns with the following types:

- [AggregateFunction](../../../sql-reference/data-types/aggregatefunction.md)
- [SimpleAggregateFunction](../../../sql-reference/data-types/simpleaggregatefunction.md)

It is appropriate to use `AggregatingMergeTree` if it reduces the number of rows by orders of magnitude.
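
As a minimal sketch of the second column type, the hypothetical table below uses `SimpleAggregateFunction` columns (the table name `simple_agg_example` and its columns are illustrative assumptions, not part of the original text). Such columns hold the current aggregated value rather than an intermediate state, so plain values can be inserted and the ordinary aggregate function is applied when reading:

```sql
-- Hypothetical table; names and types are illustrative assumptions.
CREATE TABLE simple_agg_example
(
    id UInt64,
    max_value SimpleAggregateFunction(max, UInt64),
    total SimpleAggregateFunction(sum, UInt64)
)
ENGINE = AggregatingMergeTree
ORDER BY id;

-- Plain values are accepted for SimpleAggregateFunction columns.
INSERT INTO simple_agg_example VALUES (1, 10, 5), (1, 30, 7), (2, 4, 1);

-- Parts may not have been merged yet, so aggregate again at query time.
SELECT
    id,
    max(max_value) AS max_seen,
    sum(total) AS total_sum
FROM simple_agg_example
GROUP BY id
ORDER BY id;
```

The `GROUP BY` in the final query is kept deliberately: rows with the same sorting key are only collapsed when parts merge, which happens in the background at an unspecified time.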
## Creating a Table {#creating-a-table}

``` sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
    ...
) ENGINE = AggregatingMergeTree()
[PARTITION BY expr]
[ORDER BY expr]
[SAMPLE BY expr]
[TTL expr]
[SETTINGS name=value, ...]
```

For a description of request parameters, see [request description](../../../sql-reference/statements/create/table.md).

**Query clauses**

When creating an `AggregatingMergeTree` table, the same [clauses](../../../engines/table-engines/mergetree-family/mergetree.md) are required as when creating a `MergeTree` table.
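
To make the clause list concrete, here is a sketch of one possible table definition. The table name `page_stats_agg`, its columns, and the monthly partitioning are assumptions chosen for illustration only:

```sql
-- Hypothetical table; names, types, and the partitioning choice are illustrative assumptions.
CREATE TABLE page_stats_agg
(
    day Date,
    page_id UInt32,
    hits AggregateFunction(sum, UInt64),
    visitors AggregateFunction(uniq, UInt64)
)
ENGINE = AggregatingMergeTree
PARTITION BY toYYYYMM(day)
ORDER BY (day, page_id);
```

Here the `ORDER BY` tuple is the sorting key, so rows that share the same `(day, page_id)` pair are the ones collapsed into a single row during merges.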

<details markdown="1">

<summary>Deprecated Method for Creating a Table</summary>

:::note
Do not use this method in new projects and, if possible, switch the old projects to the method described above.
:::

``` sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
    ...
) ENGINE [=] AggregatingMergeTree(date-column [, sampling_expression], (primary, key), index_granularity)
```

All of the parameters have the same meaning as in `MergeTree`.

</details>

## SELECT and INSERT {#select-and-insert}

To insert data, use an [INSERT SELECT](../../../sql-reference/statements/insert-into.md) query with aggregate `-State` functions.
When selecting data from an `AggregatingMergeTree` table, use a `GROUP BY` clause and the same aggregate functions as when inserting data, but with the `-Merge` suffix.

In the results of a `SELECT` query, values of the `AggregateFunction` type have an implementation-specific binary representation in all ClickHouse output formats. For example, if you dump data into the `TabSeparated` format with a `SELECT` query, this dump can be loaded back using an `INSERT` query.
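
The `-State` / `-Merge` pairing described above can be sketched as follows. The tables `events` and `events_agg` and their columns are hypothetical and exist only for this illustration:

```sql
-- Hypothetical raw table and aggregate table; all names are illustrative assumptions.
CREATE TABLE events
(
    day Date,
    user_id UInt64,
    amount UInt64
)
ENGINE = MergeTree
ORDER BY day;

CREATE TABLE events_agg
(
    day Date,
    total AggregateFunction(sum, UInt64),
    users AggregateFunction(uniq, UInt64)
)
ENGINE = AggregatingMergeTree
ORDER BY day;

-- Write intermediate states with the -State combinator.
INSERT INTO events_agg
SELECT
    day,
    sumState(amount) AS total,
    uniqState(user_id) AS users
FROM events
GROUP BY day;

-- Read final values with the matching -Merge combinator.
SELECT
    day,
    sumMerge(total) AS total_amount,
    uniqMerge(users) AS unique_users
FROM events_agg
GROUP BY day
ORDER BY day;
```

The argument types in `AggregateFunction(sum, UInt64)` and `AggregateFunction(uniq, UInt64)` are chosen to match the types of the columns passed to `sumState` and `uniqState`.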
## Example of an Aggregated Materialized View {#example-of-an-aggregated-materialized-view}

The following example assumes that you have a database named `test`, so create it if it doesn't already exist:

```sql
CREATE DATABASE IF NOT EXISTS test;
```

Now create the table `test.visits` that contains the raw data:

``` sql
CREATE TABLE test.visits
(
    StartDate DateTime64 NOT NULL,
    CounterID UInt64,
    Sign Nullable(Int32),
    UserID Nullable(Int32)
) ENGINE = MergeTree ORDER BY (StartDate, CounterID);
```

Next, you need an `AggregatingMergeTree` table that stores `AggregateFunction` states, which keep track of the total number of visits and the number of unique users. Its columns use the `AggregateFunction` type:

``` sql
CREATE TABLE test.agg_visits (
    StartDate DateTime64 NOT NULL,
    CounterID UInt64,
    Visits AggregateFunction(sum, Nullable(Int32)),
    Users AggregateFunction(uniq, Nullable(Int32))
)
ENGINE = AggregatingMergeTree() ORDER BY (StartDate, CounterID);
```

Create a materialized view that populates `test.agg_visits` from `test.visits`:

```sql
CREATE MATERIALIZED VIEW test.visits_mv TO test.agg_visits
AS SELECT
    StartDate,
    CounterID,
    sumState(Sign) AS Visits,
    uniqState(UserID) AS Users
FROM test.visits
GROUP BY StartDate, CounterID;
```

Insert data into the `test.visits` table:

``` sql
INSERT INTO test.visits (StartDate, CounterID, Sign, UserID)
VALUES (1667446031000, 1, 3, 4), (1667446031000, 1, 6, 3);
```

The data is inserted into both `test.visits` and `test.agg_visits`.

To get the aggregated data, execute a query such as `SELECT ... GROUP BY ...` against `test.agg_visits`, the target table of the materialized view `test.visits_mv`:

```sql
SELECT
    StartDate,
    sumMerge(Visits) AS Visits,
    uniqMerge(Users) AS Users
FROM test.agg_visits
GROUP BY StartDate
ORDER BY StartDate;
```

```text
┌───────────────StartDate─┬─Visits─┬─Users─┐
│ 2022-11-03 03:27:11.000 │      9 │     2 │
└─────────────────────────┴────────┴───────┘
```

Add another couple of records to `test.visits`, but this time use a different timestamp for one of the records:

```sql
INSERT INTO test.visits (StartDate, CounterID, Sign, UserID)
VALUES (1669446031000, 2, 5, 10), (1667446031000, 3, 7, 5);
```

Run the `SELECT` query again, which will return the following output:

```text
┌───────────────StartDate─┬─Visits─┬─Users─┐
│ 2022-11-03 03:27:11.000 │     16 │     3 │
│ 2022-11-26 07:00:31.000 │      5 │     1 │
└─────────────────────────┴────────┴───────┘
```
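
Because the stored values are intermediate states, they can also be merged at a coarser granularity than the sorting key. As a sketch (the aliases `Day`, `TotalVisits`, and `UniqueUsers` are assumptions introduced here), the following query rolls the same data up by calendar day:

```sql
-- Roll up the stored states by calendar day instead of by exact timestamp.
SELECT
    toDate(StartDate) AS Day,
    sumMerge(Visits) AS TotalVisits,
    uniqMerge(Users) AS UniqueUsers
FROM test.agg_visits
GROUP BY Day
ORDER BY Day;
```

The day boundaries produced by `toDate` depend on the time zone in effect, so the grouping may differ between servers.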

## Related Content

- Blog: [Using Aggregate Combinators in ClickHouse](https://clickhouse.com/blog/aggregate-functions-combinators-in-clickhouse-for-arrays-maps-and-states)