Merge pull request #63414 from rschu1ze/docs-update

Docs: Various minor docs updates
2024-11-22 23:52:03 +00:00 · 2024-05-06 14:40:14 +00:00 · 2024-05-06 14:40:14 +00:00 · a65e208892
commit a65e208892
parent 21b512c603 b00c64fe9d
4 changed files with 37 additions and 78 deletions
--- a/docs/en/development/build-cross-s390x.md
+++ b/docs/en/development/build-cross-s390x.md
@@ -5,22 +5,13 @@ title: How to Build, Run and Debug ClickHouse on Linux for s390x (zLinux)
 sidebar_label: Build on Linux for s390x (zLinux)
 ---
-As of writing (2023/3/10) building for s390x considered to be experimental. Not all features can be enabled, has broken features and is currently under active development. 
+At the time of writing (2024 May), support for the s390x platform is considered experimental, i.e. some features are disabled or broken on s390x.
 ## Building ClickHouse for s390x
-## Building
+s390x has two OpenSSL-related build options:
-
+- By default, OpenSSL is built on s390x as a shared library. This is different from all other platforms, where OpenSSL is built as a static library.
-s390x has two OpenSSL-related build options. 
+- To build OpenSSL as a static library regardless, pass `-DENABLE_OPENSSL_DYNAMIC=0` to CMake.
 - By default, the s390x build will dynamically link to OpenSSL libraries. It will build OpenSSL shared objects, so it's not necessary to install OpenSSL beforehand. (This option is recommended in all cases.)
 - Another option is to build OpenSSL in-tree. In this case two build flags need to be supplied to cmake
 ```bash
 -DENABLE_OPENSSL_DYNAMIC=0
 ```
 :::note
 s390x builds are temporarily disabled in CI.
 :::
 These instructions assume that the host machine is x86_64 and has all the tooling required to build natively based on the [build instructions](../development/build.md). They also assume that the host is Ubuntu 22.04, but the following instructions should also work on Ubuntu 20.04.
@@ -31,11 +22,16 @@ apt-get install binutils-s390x-linux-gnu libc6-dev-s390x-cross gcc-s390x-linux-g
 ```
 If you wish to cross-compile Rust code, install the Rust cross-compile target for s390x:
 ```bash
 rustup target add s390x-unknown-linux-gnu
 ```
 The s390x build uses the mold linker. Download it from https://github.com/rui314/mold/releases/download/v2.0.0/mold-2.0.0-x86_64-linux.tar.gz
 and place it into your `$PATH`.
 To build for s390x:
 ```bash
 cmake -DCMAKE_TOOLCHAIN_FILE=cmake/linux/toolchain-s390x.cmake ..
 ninja
--- a/docs/en/engines/table-engines/mergetree-family/annindexes.md
+++ b/docs/en/engines/table-engines/mergetree-family/annindexes.md
@@ -22,9 +22,8 @@ ORDER BY Distance(vectors, Point)
 LIMIT N
 ```
-`vectors` contains N-dimensional values of type [Array](../../../sql-reference/data-types/array.md) or
+`vectors` contains N-dimensional values of type [Array(Float32)](../../../sql-reference/data-types/array.md), for example embeddings.
-[Tuple](../../../sql-reference/data-types/tuple.md), for example embeddings. Function `Distance` computes the distance between two vectors.
+Function `Distance` computes the distance between two vectors. Often, the Euclidean (L2) distance is chosen as distance function but [other
 Often, the Euclidean (L2) distance is chosen as distance function but [other
 distance functions](/docs/en/sql-reference/functions/distance-functions.md) are also possible. `Point` is the reference point, e.g. `(0.17,
 0.33, ...)`, and `N` limits the number of search results.
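As a rough illustration of the query pattern in this hunk, the following Python sketch performs the exact (brute-force) search that `ORDER BY Distance(vectors, Point) LIMIT N` describes, using the Euclidean (L2) distance. The names `l2_distance` and `nearest_neighbours` are hypothetical and exist only for this example. This is not how ClickHouse or its ANN indexes work internally; the indexes approximate this result without scanning every vector.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbours(vectors, point, n):
    """Return the n vectors closest to point, nearest first.

    Exact search: every vector is compared against the reference point,
    which mirrors ORDER BY Distance(vectors, Point) LIMIT N.
    """
    return sorted(vectors, key=lambda v: l2_distance(v, point))[:n]


if __name__ == "__main__":
    data = [[0.0, 0.0], [1.0, 1.0], [0.2, 0.3]]
    print(nearest_neighbours(data, [0.17, 0.33], 2))
```

The exact search costs one distance computation per row, which is the cost that an ANN index trades against accuracy.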
@@ -47,7 +46,7 @@ of the search space (using clustering, search trees, etc.) which allows to compu
 # Creating and Using ANN Indexes {#creating_using_ann_indexes}
-Syntax to create an ANN index over an [Array](../../../sql-reference/data-types/array.md) column:
+Syntax to create an ANN index over an [Array(Float32)](../../../sql-reference/data-types/array.md) column:
 ```sql
 CREATE TABLE table_with_ann_index
@@ -60,19 +59,6 @@ ENGINE = MergeTree
 ORDER BY id;
 ```
 Syntax to create an ANN index over a [Tuple](../../../sql-reference/data-types/tuple.md) column:
 ```sql
 CREATE TABLE table_with_ann_index
 (
  `id` Int64,
  `vectors` Tuple(Float32[, Float32[, ...]]),
  INDEX [ann_index_name] vectors TYPE [ann_index_type]([ann_index_parameters]) [GRANULARITY [N]]
 )
 ENGINE = MergeTree
 ORDER BY id;
 ```
 ANN indexes are built during column insertion and merge. As a result, `INSERT` and `OPTIMIZE` statements will be slower than for ordinary
 tables. ANN indexes are ideally used only with immutable or rarely changed data, i.e. when there are far more read requests than write
 requests.
@@ -164,7 +150,7 @@ linear surfaces (lines in 2D, planes in 3D etc.).
  </iframe>
 </div>
-Syntax to create an Annoy index over an [Array](../../../sql-reference/data-types/array.md) column:
+Syntax to create an Annoy index over an [Array(Float32)](../../../sql-reference/data-types/array.md) column:
 ```sql
 CREATE TABLE table_with_annoy_index
@@ -177,19 +163,6 @@ ENGINE = MergeTree
 ORDER BY id;
 ```
 Syntax to create an ANN index over a [Tuple](../../../sql-reference/data-types/tuple.md) column:
 ```sql
 CREATE TABLE table_with_annoy_index
 (
  id Int64,
  vectors Tuple(Float32[, Float32[, ...]]),
  INDEX [ann_index_name] vectors TYPE annoy([Distance[, NumTrees]]) [GRANULARITY N]
 )
 ENGINE = MergeTree
 ORDER BY id;
 ```
 Annoy currently supports two distance functions:
 - `L2Distance`, also called Euclidean distance, is the length of a line segment between two points in Euclidean space
  ([Wikipedia](https://en.wikipedia.org/wiki/Euclidean_distance)).
@@ -203,10 +176,9 @@ Parameter `NumTrees` is the number of trees which the algorithm creates (default
 more accurate search results but slower index creation / query times (approximately linearly) as well as larger index sizes.
 :::note
-Indexes over columns of type `Array` will generally work faster than indexes on `Tuple` columns. All arrays must have same length. To avoid
+All arrays must have the same length. To avoid errors, you can use a
-errors, you can use a [CONSTRAINT](/docs/en/sql-reference/statements/create/table.md#constraints), for example, `CONSTRAINT
+[CONSTRAINT](/docs/en/sql-reference/statements/create/table.md#constraints), for example, `CONSTRAINT constraint_name_1 CHECK
-constraint_name_1 CHECK length(vectors) = 256`. Also, empty `Arrays` and unspecified `Array` values in INSERT statements (i.e. default
+length(vectors) = 256`. Also, empty `Arrays` and unspecified `Array` values in INSERT statements (i.e. default values) are not supported.
 values) are not supported.
 :::
 The creation of Annoy indexes (whenever a new part is built, e.g. at the end of a merge) is a relatively slow process. You can increase
@@ -264,19 +236,6 @@ ENGINE = MergeTree
 ORDER BY id;
 ```
 Syntax to create an ANN index over a [Tuple](../../../sql-reference/data-types/tuple.md) column:
 ```sql
 CREATE TABLE table_with_usearch_index
 (
  id Int64,
  vectors Tuple(Float32[, Float32[, ...]]),
  INDEX [ann_index_name] vectors TYPE usearch([Distance[, ScalarKind]]) [GRANULARITY N]
 )
 ENGINE = MergeTree
 ORDER BY id;
 ```
 USearch currently supports two distance functions:
 - `L2Distance`, also called Euclidean distance, is the length of a line segment between two points in Euclidean space
  ([Wikipedia](https://en.wikipedia.org/wiki/Euclidean_distance)).
--- a/docs/en/engines/table-engines/mergetree-family/invertedindexes.md
+++ b/docs/en/engines/table-engines/mergetree-family/invertedindexes.md
@@ -53,6 +53,10 @@ ENGINE = MergeTree
 ORDER BY key
 ```
 :::note
 In earlier versions of ClickHouse, the corresponding index type name was `inverted`.
 :::
 where `N` specifies the tokenizer:
 - `full_text(0)` (or shorter: `full_text()`) sets the tokenizer to "tokens", i.e. splits strings along spaces,
--- a/docs/en/sql-reference/functions/date-time-functions.md
+++ b/docs/en/sql-reference/functions/date-time-functions.md
@@ -1417,31 +1417,31 @@ toStartOfFifteenMinutes(toDateTime('2023-04-21 10:23:00')): 2023-04-21 10:15:00
 This function generalizes other `toStartOf*()` functions with `toStartOfInterval(date_or_date_with_time, INTERVAL x unit [, time_zone])` syntax.
 For example,
- `toStartOfInterval(t, INTERVAL 1 year)` returns the same as `toStartOfYear(t)`,
+- `toStartOfInterval(t, INTERVAL 1 YEAR)` returns the same as `toStartOfYear(t)`,
- `toStartOfInterval(t, INTERVAL 1 month)` returns the same as `toStartOfMonth(t)`,
+- `toStartOfInterval(t, INTERVAL 1 MONTH)` returns the same as `toStartOfMonth(t)`,
- `toStartOfInterval(t, INTERVAL 1 day)` returns the same as `toStartOfDay(t)`,
+- `toStartOfInterval(t, INTERVAL 1 DAY)` returns the same as `toStartOfDay(t)`,
- `toStartOfInterval(t, INTERVAL 15 minute)` returns the same as `toStartOfFifteenMinutes(t)`.
+- `toStartOfInterval(t, INTERVAL 15 MINUTE)` returns the same as `toStartOfFifteenMinutes(t)`.
 The calculation is performed relative to specific points in time:
 | Interval    | Start                  |
 |-------------|------------------------|
-| year        | year 0                 |
+| YEAR        | year 0                 |
-| quarter     | 1900 Q1                |
+| QUARTER     | 1900 Q1                |
-| month       | 1900 January           |
+| MONTH       | 1900 January           |
-| week        | 1970, 1st week (01-05) |
+| WEEK        | 1970, 1st week (01-05) |
-| day         | 1970-01-01             |
+| DAY         | 1970-01-01             |
-| hour        | (*)                    |
+| HOUR        | (*)                    |
-| minute      | 1970-01-01 00:00:00    |
+| MINUTE      | 1970-01-01 00:00:00    |
-| second      | 1970-01-01 00:00:00    |
+| SECOND      | 1970-01-01 00:00:00    |
-| millisecond | 1970-01-01 00:00:00    |
+| MILLISECOND | 1970-01-01 00:00:00    |
-| microsecond | 1970-01-01 00:00:00    |
+| MICROSECOND | 1970-01-01 00:00:00    |
-| nanosecond  | 1970-01-01 00:00:00    |
+| NANOSECOND  | 1970-01-01 00:00:00    |
 (*) hour intervals are special: the calculation is always performed relative to 00:00:00 (midnight) of the current day. As a result, only
    hour values between 1 and 23 are useful.
-If unit `week` was specified, `toStartOfInterval` assumes that weeks start on Monday. Note that this behavior is different from that of function `toStartOfWeek` in which weeks start by default on Sunday.
+If unit `WEEK` was specified, `toStartOfInterval` assumes that weeks start on Monday. Note that this behavior is different from that of function `toStartOfWeek` in which weeks start by default on Sunday.
 **See Also**
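
The rounding rules in the `toStartOfInterval` hunk above can be sketched in Python. This is a minimal illustration under stated assumptions, not ClickHouse's implementation: the function name `start_of_interval` is hypothetical, time zones are ignored, and only the YEAR, MONTH, WEEK, DAY, HOUR and MINUTE units are covered. Each unit rounds down to a multiple of the interval counted from the start point listed in the table.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)         # start point for DAY and MINUTE
FIRST_MONDAY = datetime(1970, 1, 5)  # start point for WEEK (a Monday)


def start_of_interval(t, n, unit):
    """Round t down to the start of an n-unit interval (naive datetimes only)."""
    unit = unit.upper()
    if unit == "YEAR":
        # Counted from year 0. Python's datetime starts at year 1,
        # so this sketch fails if the result would be year 0.
        return datetime(t.year - t.year % n, 1, 1)
    if unit == "MONTH":
        # Counted from 1900 January.
        months = (t.year - 1900) * 12 + (t.month - 1)
        months -= months % n
        return datetime(1900 + months // 12, months % 12 + 1, 1)
    if unit == "WEEK":
        # Weeks start on Monday, counted from 1970-01-05.
        weeks = (t - FIRST_MONDAY).days // 7
        return FIRST_MONDAY + timedelta(weeks=weeks - weeks % n)
    if unit == "DAY":
        days = (t - EPOCH).days
        return EPOCH + timedelta(days=days - days % n)
    if unit == "HOUR":
        # Special case: counted from midnight of the current day.
        return t.replace(hour=t.hour - t.hour % n, minute=0, second=0, microsecond=0)
    if unit == "MINUTE":
        delta = t - EPOCH
        minutes = delta.days * 1440 + delta.seconds // 60
        return EPOCH + timedelta(minutes=minutes - minutes % n)
    raise ValueError(f"unit not covered by this sketch: {unit}")


if __name__ == "__main__":
    t = datetime(2023, 4, 21, 10, 23)
    print(start_of_interval(t, 15, "MINUTE"))  # 2023-04-21 10:15:00
```

With `n = 1` the sketch reduces to the corresponding `toStartOf*()` behaviour named in the list, for example `start_of_interval(t, 1, "MONTH")` yields the first day of the month.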