Update Annoy docs

2024-11-10 09:32:06 +00:00 · 2023-06-12 20:06:57 +00:00
commit 4f39ee51ae
parent 33ab8ee95c
4 changed files with 17 additions and 14 deletions
--- a/docs/en/engines/table-engines/mergetree-family/annindexes.md
+++ b/docs/en/engines/table-engines/mergetree-family/annindexes.md
@@ -100,7 +100,7 @@ ANN indexes support two types of queries:

 :::tip
 To avoid writing out large vectors, you can use [query
-parameters](/docs/en//interfaces/cli.md#queries-with-parameters-cli-queries-with-parameters), e.g.
+parameters](/docs/en/interfaces/cli.md#queries-with-parameters-cli-queries-with-parameters), e.g.

 ```bash
 clickhouse-client --param_vec='hello' --query="SELECT * FROM table WHERE L2Distance(vectors, {vec: Array(Float32)}) < 1.0"
@@ -128,14 +128,14 @@ granularity of granules, sub-indexes extrapolate matching rows to granule granul
 skip data at the granularity of index blocks.

 The `GRANULARITY` parameter determines how many ANN sub-indexes are created. Bigger `GRANULARITY` values mean fewer but larger ANN
-sub-indexes, up to the point where a column (or a column part) has only a single sub-index. In that case, the sub-index has a "global" view of
-all column rows and can directly return all granules of the column (part) with relevant rows (there are at at most `LIMIT <N>`-many
-such granules). In a second step, ClickHouse will load these granules and identify the actually best rows by performing a brute-force distance
-calculation over all rows of the granules. With a small `GRANULARITY` value, each of the sub-indexes returns up to `LIMIT N`-many granules.
-As a result, more granules need to be loaded and post-filtered. Note that the search accuracy is with both cases equally good, only the
-processing performance differs. It is generally recommended to use a large `GRANULARITY` for ANN indexes and fall back to a smaller
-`GRANULARITY` values only in case of problems like excessive memory consumption of the ANN structures. If no `GRANULARITY` was specified for
-ANN indexes, the default value is 100 million.
+sub-indexes, up to the point where a column (or a column's data part) has only a single sub-index. In that case, the sub-index has a
+"global" view of all column rows and can directly return all granules of the column (part) with relevant rows (there are at most `LIMIT
+<N>`-many such granules). In a second step, ClickHouse will load these granules and identify the actually best rows by performing a
+brute-force distance calculation over all rows of the granules. With a small `GRANULARITY` value, each of the sub-indexes returns up to
+`LIMIT N`-many granules. As a result, more granules need to be loaded and post-filtered. Note that the search accuracy is with both cases
+equally good, only the processing performance differs. It is generally recommended to use a large `GRANULARITY` for ANN indexes and fall
+back to a smaller `GRANULARITY` values only in case of problems like excessive memory consumption of the ANN structures. If no `GRANULARITY`
+was specified for ANN indexes, the default value is 100 million.


 # Available ANN Indexes
@@ -204,7 +204,7 @@ values mean more accurate results at the cost of longer query runtime:

 ``` sql
 SELECT *
-FROM table_name [WHERE ...]
+FROM table_name
 ORDER BY L2Distance(vectors, Point)
 LIMIT N
 SETTINGS annoy_index_search_k_nodes=100
--- a/src/Parsers/ASTIndexDeclaration.h
+++ b/src/Parsers/ASTIndexDeclaration.h
@@ -12,6 +12,9 @@ class ASTFunction;
 class ASTIndexDeclaration : public IAST
 {
 public:
+    static const auto DEFAULT_INDEX_GRANULARITY = 1uz;
+    static const auto DEFAULT_ANNOY_INDEX_GRANULARITY = 100'000'000uz;
+
     String name;
     IAST * expr;
     ASTFunction * type;
--- a/src/Parsers/ParserCreateIndexQuery.cpp
+++ b/src/Parsers/ParserCreateIndexQuery.cpp
@@ -52,9 +52,9 @@ bool ParserCreateIndexDeclaration::parseImpl(Pos & pos, ASTPtr & node, Expected
     else
     {
         if (index->type->name == "annoy")
-            index->granularity = 100'000'000;
+            index->granularity = ASTIndexDeclaration::DEFAULT_ANNOY_INDEX_GRANULARITY;
         else
-            index->granularity = 1;
+            index->granularity = ASTIndexDeclaration::DEFAULT_INDEX_GRANULARITY;
     }
     node = index;

--- a/src/Parsers/ParserCreateQuery.cpp
+++ b/src/Parsers/ParserCreateQuery.cpp
@@ -147,9 +147,9 @@ bool ParserIndexDeclaration::parseImpl(Pos & pos, ASTPtr & node, Expected & expe
     else
     {
         if (index->type->name == "annoy")
-            index->granularity = 100'000'000;
+            index->granularity = ASTIndexDeclaration::DEFAULT_ANNOY_INDEX_GRANULARITY;
         else
-            index->granularity = 1;
+            index->granularity = ASTIndexDeclaration::DEFAULT_INDEX_GRANULARITY;
     }

     node = index;
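
Both parser hunks above repeat the same default-selection branch. As a minimal sketch of that logic in isolation (not part of this commit), the choice could be expressed as one function. The helper name `defaultGranularityFor` and the free-standing constants are hypothetical stand-ins for the `ASTIndexDeclaration` members added here; plain `std::size_t` values replace the `uz` literals so the sketch compiles before C++23.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-ins for ASTIndexDeclaration::DEFAULT_INDEX_GRANULARITY and
// ASTIndexDeclaration::DEFAULT_ANNOY_INDEX_GRANULARITY (values taken from the diff).
inline constexpr std::size_t DEFAULT_INDEX_GRANULARITY = 1;
inline constexpr std::size_t DEFAULT_ANNOY_INDEX_GRANULARITY = 100'000'000;

// Hypothetical helper: returns the granularity a parser would assign when the
// index declaration has no explicit GRANULARITY clause.
constexpr std::size_t defaultGranularityFor(std::string_view index_type_name)
{
    return index_type_name == "annoy" ? DEFAULT_ANNOY_INDEX_GRANULARITY
                                      : DEFAULT_INDEX_GRANULARITY;
}

// Compile-time checks of the two branches shown in the diff.
static_assert(defaultGranularityFor("annoy") == 100'000'000);
static_assert(defaultGranularityFor("minmax") == 1);
```

With such a helper, each `else` branch in the two parsers would reduce to a single assignment. Whether the project would want that refactoring is a separate question from this commit, which only names the constants.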