From c43f9030da6fa0464b08a57972048f1a3f31d2f7 Mon Sep 17 00:00:00 2001
From: BayoNet
Date: Tue, 20 Aug 2019 18:36:08 +0300
Subject: [PATCH] DOCAPI-7460: Clarifications.

---
 .../agg_functions/parametric_functions.md     | 43 ++++++++++++++++---
 .../functions/other_functions.md              |  2 +-
 2 files changed, 38 insertions(+), 7 deletions(-)

diff --git a/docs/en/query_language/agg_functions/parametric_functions.md b/docs/en/query_language/agg_functions/parametric_functions.md
index d27cb5d9431..84898a61133 100644
--- a/docs/en/query_language/agg_functions/parametric_functions.md
+++ b/docs/en/query_language/agg_functions/parametric_functions.md
@@ -4,18 +4,18 @@ Some aggregate functions can accept not only argument columns (used for compress
 
 ## histogram
 
-Calculates a histogram.
+Calculates an adaptive histogram. It doesn't guarantee precise results.
 
 ```
 histogram(number_of_bins)(values)
 ```
- 
-The functions uses [A Streaming Parallel Decision Tree Algorithm](http://jmlr.org/papers/volume11/ben-haim10a/ben-haim10a.pdf). It calculates the borders of histogram bins automatically, and in common case the widths of bins are not equal.
+
+The functions uses [A Streaming Parallel Decision Tree Algorithm](http://jmlr.org/papers/volume11/ben-haim10a/ben-haim10a.pdf). The borders of histogram bins are adjusted as a new data enters a function, and in common case the widths of bins are not equal.
 
 **Parameters**
 
-`number_of_bins` — Number of bins for the histogram.
-`values` — [Expression](../syntax.md#syntax-expressions) resulting in a data sample.
+`number_of_bins` — Upper limit for a number of bins for the histogram. Function automatically calculates the number of bins. It tries to reach the specified number of bins, but if it fails, it uses less number of bins.
+`values` — [Expression](../syntax.md#syntax-expressions) resulting in input values.
 
 **Returned values**
 
@@ -32,7 +32,12 @@ The functions uses [A Streaming Parallel Decision Tree Algorithm](http://jmlr.or
 **Example**
 
 ```sql
-SELECT histogram(5)(number + 1) FROM (SELECT * FROM system.numbers LIMIT 20)
+SELECT histogram(5)(number + 1)
+FROM (
+    SELECT *
+    FROM system.numbers
+    LIMIT 20
+)
 ```
 ```text
 ┌─histogram(5)(plus(number, 1))───────────────────────────────────────────┐
@@ -40,6 +45,32 @@ SELECT histogram(5)(number + 1) FROM (SELECT * FROM system.numbers LIMIT 20)
 └─────────────────────────────────────────────────────────────────────────┘
 ```
 
+You can visualize a histogram with the [bar](../other_functions.md#function-bar) function, for example:
+
+```sql
+WITH histogram(5)(rand() % 100) AS hist
+SELECT
+    arrayJoin(hist).3 AS height,
+    bar(height, 0, 6, 5) AS bar
+FROM
+(
+    SELECT *
+    FROM system.numbers
+    LIMIT 20
+)
+```
+```text
+┌─height─┬─bar───┐
+│  2.125 │ █▋    │
+│   3.25 │ ██▌   │
+│  5.625 │ ████▏ │
+│  5.625 │ ████▏ │
+│  3.375 │ ██▌   │
+└────────┴───────┘
+```
+
+In this case you should remember, that you don't know the borders of histogram bins.
+
 ## sequenceMatch(pattern)(time, cond1, cond2, ...)
 
 Pattern matching for event chains.
diff --git a/docs/en/query_language/functions/other_functions.md b/docs/en/query_language/functions/other_functions.md
index 007f1352775..268c245d24f 100644
--- a/docs/en/query_language/functions/other_functions.md
+++ b/docs/en/query_language/functions/other_functions.md
@@ -120,7 +120,7 @@ Accepts constant strings: database name, table name, and column name. Returns a
 The function throws an exception if the table does not exist.
 For elements in a nested data structure, the function checks for the existence of a column. For the nested data structure itself, the function returns 0.
 
-## bar
+## bar {#function-bar}
 
 Allows building a unicode-art diagram.
 