mirror of
https://github.com/ClickHouse/ClickHouse.git
synced 2024-11-17 05:03:20 +00:00
67c2e50331
* update presentations * CLICKHOUSE-2936: redirect from clickhouse.yandex.ru and clickhouse.yandex.com * update submodule * lost files * CLICKHOUSE-2981: prefer sphinx docs over original reference * CLICKHOUSE-2981: docs styles more similar to main website + add flags to switch language links * update presentations * Less confusing directory structure (docs -> doc/reference/) * Minify sphinx docs too * Website release script: fail fast + pass docker hash on deploy * Do not underline links in docs * shorter * cleanup docker images * tune nginx config * CLICKHOUSE-3043: get rid of habrastorage links * Lost translation * CLICKHOUSE-2936: temporary client-side redirect * behaves weird in test * put redirect back * CLICKHOUSE-3047: copy docs txts to public too * move to proper file * remove old pages to avoid confusion * Remove reference redirect warning for now * Refresh README.md * Yellow buttons in docs * Use svg flags instead of unicode ones in docs * fix test website instance * Put flags to separate files * wrong flag * Copy Yandex.Metrica introduction from main page to docs * Yet another home page structure change, couple new blocks (CLICKHOUSE-3045) * Update Contacts section * CLICKHOUSE-2849: more detailed legal information * CLICKHOUSE-2978 preparation - split by files * More changes in Contacts block * Tune texts on index page * update presentations * One more benchmark * Add usage sections to index page, adapted from slides * Get the roadmap started, based on slides from last ClickHouse Meetup * CLICKHOUSE-2977: some rendering tuning * Get rid of excessive section in the end of getting started * Make headers linkable * CLICKHOUSE-2981: links to editing reference - https://github.com/yandex/ClickHouse/issues/849 * CLICKHOUSE-2981: fix mobile styles in docs * Ban crawling of duplicating docs * Open some external links in new tab * Ban old docs too * Lots of trivial fixes in english docs * Lots of trivial fixes in russian docs * Remove getting started copies in markdown * Add Yandex.Webmaster * Fix some sphinx warnings * More warnings fixed in english docs * More sphinx warnings fixed * Add code-block:: text * More code-block:: text * These headers look not that well * Better switch between documentation languages * merge use_case.rst into ya_metrika_task.rst * Edit the agg_functions.rst texts * Add lost empty lines * Lost blank lines * Add new logo sizes * update presentations * Next step in migrating to new documentation * Fix all warnings in en reference * Fix all warnings in ru reference * Re-arrange existing reference * Move operation tips to main reference * Fix typos noticed by milovidov@ * Get rid of zookeeper.md * Looks like duplicate of tutorial.html * Fix some mess with html tags in tutorial * No idea why nobody noticed this before, but it was completely not clear whet to get the data * Match code block styling between main and tutorial pages (in favor of the latter) * Get rid of some copypaste in tutorial * Normalize header styles * Move example_datasets to sphinx * Move presentations submodule to website * Move and update README.md * No point in duplicating articles from habrahabr here * Move development-related docs as is for now * doc/reference/ -> docs/ (to match the URL on website) * Adapt links to match the previous commit * Adapt development docs to rst (still lacks translation and strikethrough support) * clean on release * blacklist presentations in gulp * strikethrough support in sphinx * just copy development folder for now * fix weird introduction in style article * Style guide translation (WIP) * Finish style guide translation to English * gulp clean separately * Update year in LICENSE * Initial CONTRIBUTING.md * Fix remaining links to old docs in tutorial * Some tutorial fixes * Typo * Another typo * Update list of authors from yandex-team accoding to git log
83 lines
3.7 KiB
ReStructuredText
83 lines
3.7 KiB
ReStructuredText
AggregatingMergeTree
|
|
--------------------
|
|
|
|
This engine differs from ``MergeTree`` in that the merge combines the states of aggregate functions stored in the table for rows with the same primary key value.
|
|
|
|
In order for this to work, it uses the AggregateFunction data type and the -State and -Merge modifiers for aggregate functions. Let's examine it more closely.
|
|
|
|
There is an AggregateFunction data type, which is a parametric data type. As parameters, the name of the aggregate function is passed, then the types of its arguments.
|
|
Examples:
|
|
|
|
.. code-block:: sql
|
|
|
|
CREATE TABLE t
|
|
(
|
|
column1 AggregateFunction(uniq, UInt64),
|
|
column2 AggregateFunction(anyIf, String, UInt8),
|
|
column3 AggregateFunction(quantiles(0.5, 0.9), UInt64)
|
|
) ENGINE = ...
|
|
|
|
This type of column stores the state of an aggregate function.
|
|
|
|
To get this type of value, use aggregate functions with the 'State' suffix.
|
|
Example: uniqState(UserID), quantilesState(0.5, 0.9)(SendTiming) - in contrast to the corresponding 'uniq' and 'quantiles' functions, these functions return the state, rather than the prepared value. In other words, they return an AggregateFunction type value.
|
|
|
|
An AggregateFunction type value can't be output in Pretty formats. In other formats, these types of values are output as implementation-specific binary data. The AggregateFunction type values are not intended for output or saving in a dump.
|
|
|
|
The only useful thing you can do with AggregateFunction type values is combine the states and get a result, which essentially means to finish aggregation. Aggregate functions with the 'Merge' suffix are used for this purpose.
|
|
Example: uniqMerge(UserIDState), where UserIDState has the AggregateFunction type.
|
|
|
|
In other words, an aggregate function with the 'Merge' suffix takes a set of states, combines them, and returns the result.
|
|
As an example, these two queries return the same result:
|
|
|
|
.. code-block:: sql
|
|
|
|
SELECT uniq(UserID) FROM table
|
|
|
|
SELECT uniqMerge(state) FROM (SELECT uniqState(UserID) AS state FROM table GROUP BY RegionID)
|
|
|
|
There is an AggregatingMergeTree engine. Its job during a merge is to combine the states of aggregate functions from different table rows with the same primary key value.
|
|
|
|
You can't use a normal INSERT to insert a row in a table containing AggregateFunction columns, because you can't explicitly define the AggregateFunction value. Instead, use INSERT SELECT with '-State' aggregate functions for inserting data.
|
|
|
|
With SELECT from an AggregatingMergeTree table, use GROUP BY and aggregate functions with the '-Merge' modifier in order to complete data aggregation.
|
|
|
|
You can use AggregatingMergeTree tables for incremental data aggregation, including for aggregated materialized views.
|
|
|
|
Example:
|
|
Creating a materialized AggregatingMergeTree view that tracks the 'test.visits' table:
|
|
|
|
.. code-block:: sql
|
|
|
|
CREATE MATERIALIZED VIEW test.basic
|
|
ENGINE = AggregatingMergeTree(StartDate, (CounterID, StartDate), 8192)
|
|
AS SELECT
|
|
CounterID,
|
|
StartDate,
|
|
sumState(Sign) AS Visits,
|
|
uniqState(UserID) AS Users
|
|
FROM test.visits
|
|
GROUP BY CounterID, StartDate;
|
|
|
|
Inserting data in the 'test.visits' table. Data will also be inserted in the view, where it will be aggregated:
|
|
|
|
.. code-block:: sql
|
|
|
|
INSERT INTO test.visits ...
|
|
|
|
Performing SELECT from the view using GROUP BY to finish data aggregation:
|
|
|
|
.. code-block:: sql
|
|
|
|
SELECT
|
|
StartDate,
|
|
sumMerge(Visits) AS Visits,
|
|
uniqMerge(Users) AS Users
|
|
FROM test.basic
|
|
GROUP BY StartDate
|
|
ORDER BY StartDate;
|
|
|
|
You can create a materialized view like this and assign a normal view to it that finishes data aggregation.
|
|
|
|
Note that in most cases, using AggregatingMergeTree is not justified, since queries can be run efficiently enough on non-aggregated data.
|