mirror of https://github.com/ClickHouse/ClickHouse.git (synced 2024-09-20 08:40:50 +00:00)
Added examples and extra info to projections
I have added two examples about how we can use projections
parent 2a1268a9cf
commit 2a48fb344d
@@ -2,9 +2,134 @@
slug: /en/sql-reference/statements/alter/projection
sidebar_position: 49
sidebar_label: PROJECTION
-title: "Manipulating Projections"
+title: "Projections"
---

Projections store data in a format that optimizes query execution. This feature is useful when:
- You need to run queries on a column that is not part of the primary key.
- You want to pre-aggregate columns, which reduces both computation and I/O.

You can define one or more projections for a table. During query analysis, ClickHouse selects the projection that requires the least data to scan, without modifying the query provided by the user.

## Example filtering without using primary keys

Creating the table:
```
CREATE TABLE visits_order
(
   `user_id` UInt64,
   `user_name` String,
   `pages_visited` Nullable(Float64),
   `user_agent` String
)
ENGINE = MergeTree()
PRIMARY KEY user_agent
```
Using `ALTER TABLE`, we can add the projection to an existing table and then materialize it:
```
ALTER TABLE visits_order ADD PROJECTION user_name_projection (
    SELECT
        *
    ORDER BY user_name
);

ALTER TABLE visits_order MATERIALIZE PROJECTION user_name_projection;
```

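`MATERIALIZE PROJECTION` is carried out as a mutation, so its progress can be checked in `system.mutations`. The query below is a minimal sketch, assuming the table lives in the current database; on a table that is still empty there may be little or nothing to report.

```sql
-- Check whether the mutation started by MATERIALIZE PROJECTION has finished.
-- Assumption: visits_order was created in the current database.
SELECT
    mutation_id,
    command,
    is_done
FROM system.mutations
WHERE database = currentDatabase()
  AND table = 'visits_order'
ORDER BY create_time DESC;
```
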
Inserting the data:
```
INSERT INTO visits_order SELECT
    number,
    'test',
    1.5 * (number / 2),
    'Android'
FROM numbers(1, 100);
```

The projection allows us to filter by `user_name` quickly, even though `user_name` is not part of the `PRIMARY KEY` of the original table.
At query time, ClickHouse determines that less data will be processed if the projection is used, because the projection's data is ordered by `user_name`.
```
SELECT
    *
FROM visits_order
WHERE user_name='test'
LIMIT 2
```

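One way to preview this choice without running the query is to inspect the plan. The statement below is a sketch, assuming a ClickHouse version whose `EXPLAIN` accepts the `indexes` option; when a projection is selected, the read step of the plan usually names it, although the exact output differs between versions.

```sql
-- Show the query plan, including index analysis, for the filter on user_name.
-- The plan text is version-dependent, so no expected output is shown here.
EXPLAIN indexes = 1
SELECT
    *
FROM visits_order
WHERE user_name = 'test'
LIMIT 2;
```
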
To verify that a query uses the projection, we can check the `system.query_log` table. The `projections` field contains the name of the projection used, or is empty if none was used:
```
SELECT query, projections FROM system.query_log WHERE query_id='<query_id>'
```

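If the `query_id` is not at hand, the recent entries for the table can be listed instead. This is a sketch that assumes query logging is enabled on the server; the `ILIKE` pattern is only an illustrative way to narrow the results, and it will also match earlier runs of this lookup query itself.

```sql
-- Force buffered log entries to be written to system.query_log.
SYSTEM FLUSH LOGS;

-- List the most recent finished queries that read from visits_order,
-- together with the projections (if any) that each one used.
SELECT
    query_id,
    query,
    projections
FROM system.query_log
WHERE type = 'QueryFinish'
  AND query ILIKE '%FROM visits_order%'
ORDER BY event_time DESC
LIMIT 5;
```
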
## Example pre-aggregation query

Creating the table with the projection:
```
CREATE TABLE visits
(
   `user_id` UInt64,
   `user_name` String,
   `pages_visited` Nullable(Float64),
   `user_agent` String,
   PROJECTION projection_visits_by_user
   (
       SELECT
           user_agent,
           sum(pages_visited)
       GROUP BY user_id, user_agent
   )
)
ENGINE = MergeTree()
ORDER BY user_agent
```
Inserting the data:
```
INSERT INTO visits SELECT
    number,
    'test',
    1.5 * (number / 2),
    'Android'
FROM numbers(1, 100);
```
```
INSERT INTO visits SELECT
    number,
    'test',
    1. * (number / 2),
   'IOS'
FROM numbers(100, 500);
```
First, we run a query that groups by `user_agent`. This query does not use the projection, because its aggregation (`count(DISTINCT user_id)`) does not match the pre-aggregation defined in the projection.
```
SELECT
    user_agent,
    count(DISTINCT user_id)
FROM visits
GROUP BY user_agent
```

To use the projection, we can run queries that select some or all of the pre-aggregated and `GROUP BY` fields:
```
SELECT
    user_agent
FROM visits
WHERE user_id > 50 AND user_id < 150
GROUP BY user_agent
```
```
SELECT
    user_agent,
    sum(pages_visited)
FROM visits
GROUP BY user_agent
```

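To make a query fail loudly instead of silently falling back to the base table, the `force_optimize_projection` setting can be applied per query. This is a sketch, assuming the setting is available and projection optimization is enabled in your ClickHouse version; with it set, the query raises an exception when no projection can be used.

```sql
-- Require that a projection is used; the query errors out otherwise.
SELECT
    user_agent,
    sum(pages_visited)
FROM visits
GROUP BY user_agent
SETTINGS force_optimize_projection = 1;
```

Running the earlier `count(DISTINCT user_id)` query with the same setting is a quick way to confirm that it does not match the projection.
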
As mentioned before, we can check the `system.query_log` table. The `projections` field contains the name of the projection used, or is empty if none was used:
```
SELECT query, projections FROM system.query_log WHERE query_id='<query_id>'
```

# Manipulating Projections

The following operations with [projections](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#projections) are available:

## ADD PROJECTION
