mirror of
https://github.com/ClickHouse/ClickHouse.git
synced 2024-11-10 01:25:21 +00:00
Merge pull request #63414 from rschu1ze/docs-update
Docs: Various minor docs updates
This commit is contained in:
commit
a65e208892
@ -5,22 +5,13 @@ title: How to Build, Run and Debug ClickHouse on Linux for s390x (zLinux)
|
||||
sidebar_label: Build on Linux for s390x (zLinux)
|
||||
---
|
||||
|
||||
As of writing (2023/3/10) building for s390x considered to be experimental. Not all features can be enabled, has broken features and is currently under active development.
|
||||
At the time of writing (2024 May), support for the s390x platform is considered experimental, i.e. some features are disabled or broken on s390x.
|
||||
|
||||
## Building ClickHouse for s390x
|
||||
|
||||
## Building
|
||||
|
||||
s390x has two OpenSSL-related build options.
|
||||
- By default, the s390x build will dynamically link to OpenSSL libraries. It will build OpenSSL shared objects, so it's not necessary to install OpenSSL beforehand. (This option is recommended in all cases.)
|
||||
- Another option is to build OpenSSL in-tree. In this case two build flags need to be supplied to cmake
|
||||
```bash
|
||||
-DENABLE_OPENSSL_DYNAMIC=0
|
||||
```
|
||||
|
||||
:::note
|
||||
s390x builds are temporarily disabled in CI.
|
||||
:::
|
||||
|
||||
s390x has two OpenSSL-related build options:
|
||||
- By default, OpenSSL is build on s390x as a shared library. This is different from all other platforms, where OpenSSL is build as static library.
|
||||
- To build OpenSSL as a static library regardless, pass `-DENABLE_OPENSSL_DYNAMIC=0` to CMake.
|
||||
|
||||
These instructions assume that the host machine is x86_64 and has all the tooling required to build natively based on the [build instructions](../development/build.md). It also assumes that the host is Ubuntu 22.04 but the following instructions should also work on Ubuntu 20.04.
|
||||
|
||||
@ -31,11 +22,16 @@ apt-get install binutils-s390x-linux-gnu libc6-dev-s390x-cross gcc-s390x-linux-g
|
||||
```
|
||||
|
||||
If you wish to cross compile rust code install the rust cross compile target for s390x:
|
||||
|
||||
```bash
|
||||
rustup target add s390x-unknown-linux-gnu
|
||||
```
|
||||
|
||||
The s390x build uses the mold linker, download it from https://github.com/rui314/mold/releases/download/v2.0.0/mold-2.0.0-x86_64-linux.tar.gz
|
||||
and place it into your `$PATH`.
|
||||
|
||||
To build for s390x:
|
||||
|
||||
```bash
|
||||
cmake -DCMAKE_TOOLCHAIN_FILE=cmake/linux/toolchain-s390x.cmake ..
|
||||
ninja
|
||||
|
@ -22,9 +22,8 @@ ORDER BY Distance(vectors, Point)
|
||||
LIMIT N
|
||||
```
|
||||
|
||||
`vectors` contains N-dimensional values of type [Array](../../../sql-reference/data-types/array.md) or
|
||||
[Tuple](../../../sql-reference/data-types/tuple.md), for example embeddings. Function `Distance` computes the distance between two vectors.
|
||||
Often, the Euclidean (L2) distance is chosen as distance function but [other
|
||||
`vectors` contains N-dimensional values of type [Array(Float32)](../../../sql-reference/data-types/array.md), for example embeddings.
|
||||
Function `Distance` computes the distance between two vectors. Often, the Euclidean (L2) distance is chosen as distance function but [other
|
||||
distance functions](/docs/en/sql-reference/functions/distance-functions.md) are also possible. `Point` is the reference point, e.g. `(0.17,
|
||||
0.33, ...)`, and `N` limits the number of search results.
|
||||
|
||||
@ -47,7 +46,7 @@ of the search space (using clustering, search trees, etc.) which allows to compu
|
||||
|
||||
# Creating and Using ANN Indexes {#creating_using_ann_indexes}
|
||||
|
||||
Syntax to create an ANN index over an [Array](../../../sql-reference/data-types/array.md) column:
|
||||
Syntax to create an ANN index over an [Array(Float32)](../../../sql-reference/data-types/array.md) column:
|
||||
|
||||
```sql
|
||||
CREATE TABLE table_with_ann_index
|
||||
@ -60,19 +59,6 @@ ENGINE = MergeTree
|
||||
ORDER BY id;
|
||||
```
|
||||
|
||||
Syntax to create an ANN index over a [Tuple](../../../sql-reference/data-types/tuple.md) column:
|
||||
|
||||
```sql
|
||||
CREATE TABLE table_with_ann_index
|
||||
(
|
||||
`id` Int64,
|
||||
`vectors` Tuple(Float32[, Float32[, ...]]),
|
||||
INDEX [ann_index_name] vectors TYPE [ann_index_type]([ann_index_parameters]) [GRANULARITY [N]]
|
||||
)
|
||||
ENGINE = MergeTree
|
||||
ORDER BY id;
|
||||
```
|
||||
|
||||
ANN indexes are built during column insertion and merge. As a result, `INSERT` and `OPTIMIZE` statements will be slower than for ordinary
|
||||
tables. ANNIndexes are ideally used only with immutable or rarely changed data, respectively when are far more read requests than write
|
||||
requests.
|
||||
@ -164,7 +150,7 @@ linear surfaces (lines in 2D, planes in 3D etc.).
|
||||
</iframe>
|
||||
</div>
|
||||
|
||||
Syntax to create an Annoy index over an [Array](../../../sql-reference/data-types/array.md) column:
|
||||
Syntax to create an Annoy index over an [Array(Float32)](../../../sql-reference/data-types/array.md) column:
|
||||
|
||||
```sql
|
||||
CREATE TABLE table_with_annoy_index
|
||||
@ -177,19 +163,6 @@ ENGINE = MergeTree
|
||||
ORDER BY id;
|
||||
```
|
||||
|
||||
Syntax to create an ANN index over a [Tuple](../../../sql-reference/data-types/tuple.md) column:
|
||||
|
||||
```sql
|
||||
CREATE TABLE table_with_annoy_index
|
||||
(
|
||||
id Int64,
|
||||
vectors Tuple(Float32[, Float32[, ...]]),
|
||||
INDEX [ann_index_name] vectors TYPE annoy([Distance[, NumTrees]]) [GRANULARITY N]
|
||||
)
|
||||
ENGINE = MergeTree
|
||||
ORDER BY id;
|
||||
```
|
||||
|
||||
Annoy currently supports two distance functions:
|
||||
- `L2Distance`, also called Euclidean distance, is the length of a line segment between two points in Euclidean space
|
||||
([Wikipedia](https://en.wikipedia.org/wiki/Euclidean_distance)).
|
||||
@ -203,10 +176,9 @@ Parameter `NumTrees` is the number of trees which the algorithm creates (default
|
||||
more accurate search results but slower index creation / query times (approximately linearly) as well as larger index sizes.
|
||||
|
||||
:::note
|
||||
Indexes over columns of type `Array` will generally work faster than indexes on `Tuple` columns. All arrays must have same length. To avoid
|
||||
errors, you can use a [CONSTRAINT](/docs/en/sql-reference/statements/create/table.md#constraints), for example, `CONSTRAINT
|
||||
constraint_name_1 CHECK length(vectors) = 256`. Also, empty `Arrays` and unspecified `Array` values in INSERT statements (i.e. default
|
||||
values) are not supported.
|
||||
All arrays must have same length. To avoid errors, you can use a
|
||||
[CONSTRAINT](/docs/en/sql-reference/statements/create/table.md#constraints), for example, `CONSTRAINT constraint_name_1 CHECK
|
||||
length(vectors) = 256`. Also, empty `Arrays` and unspecified `Array` values in INSERT statements (i.e. default values) are not supported.
|
||||
:::
|
||||
|
||||
The creation of Annoy indexes (whenever a new part is build, e.g. at the end of a merge) is a relatively slow process. You can increase
|
||||
@ -264,19 +236,6 @@ ENGINE = MergeTree
|
||||
ORDER BY id;
|
||||
```
|
||||
|
||||
Syntax to create an ANN index over a [Tuple](../../../sql-reference/data-types/tuple.md) column:
|
||||
|
||||
```sql
|
||||
CREATE TABLE table_with_usearch_index
|
||||
(
|
||||
id Int64,
|
||||
vectors Tuple(Float32[, Float32[, ...]]),
|
||||
INDEX [ann_index_name] vectors TYPE usearch([Distance[, ScalarKind]]) [GRANULARITY N]
|
||||
)
|
||||
ENGINE = MergeTree
|
||||
ORDER BY id;
|
||||
```
|
||||
|
||||
USearch currently supports two distance functions:
|
||||
- `L2Distance`, also called Euclidean distance, is the length of a line segment between two points in Euclidean space
|
||||
([Wikipedia](https://en.wikipedia.org/wiki/Euclidean_distance)).
|
||||
|
@ -53,6 +53,10 @@ ENGINE = MergeTree
|
||||
ORDER BY key
|
||||
```
|
||||
|
||||
:::note
|
||||
In earlier versions of ClickHouse, the corresponding index type name was `inverted`.
|
||||
:::
|
||||
|
||||
where `N` specifies the tokenizer:
|
||||
|
||||
- `full_text(0)` (or shorter: `full_text()`) set the tokenizer to "tokens", i.e. split strings along spaces,
|
||||
|
@ -1417,31 +1417,31 @@ toStartOfFifteenMinutes(toDateTime('2023-04-21 10:23:00')): 2023-04-21 10:15:00
|
||||
|
||||
This function generalizes other `toStartOf*()` functions with `toStartOfInterval(date_or_date_with_time, INTERVAL x unit [, time_zone])` syntax.
|
||||
For example,
|
||||
- `toStartOfInterval(t, INTERVAL 1 year)` returns the same as `toStartOfYear(t)`,
|
||||
- `toStartOfInterval(t, INTERVAL 1 month)` returns the same as `toStartOfMonth(t)`,
|
||||
- `toStartOfInterval(t, INTERVAL 1 day)` returns the same as `toStartOfDay(t)`,
|
||||
- `toStartOfInterval(t, INTERVAL 15 minute)` returns the same as `toStartOfFifteenMinutes(t)`.
|
||||
- `toStartOfInterval(t, INTERVAL 1 YEAR)` returns the same as `toStartOfYear(t)`,
|
||||
- `toStartOfInterval(t, INTERVAL 1 MONTH)` returns the same as `toStartOfMonth(t)`,
|
||||
- `toStartOfInterval(t, INTERVAL 1 DAY)` returns the same as `toStartOfDay(t)`,
|
||||
- `toStartOfInterval(t, INTERVAL 15 MINUTE)` returns the same as `toStartOfFifteenMinutes(t)`.
|
||||
|
||||
The calculation is performed relative to specific points in time:
|
||||
|
||||
| Interval | Start |
|
||||
|-------------|------------------------|
|
||||
| year | year 0 |
|
||||
| quarter | 1900 Q1 |
|
||||
| month | 1900 January |
|
||||
| week | 1970, 1st week (01-05) |
|
||||
| day | 1970-01-01 |
|
||||
| hour | (*) |
|
||||
| minute | 1970-01-01 00:00:00 |
|
||||
| second | 1970-01-01 00:00:00 |
|
||||
| millisecond | 1970-01-01 00:00:00 |
|
||||
| microsecond | 1970-01-01 00:00:00 |
|
||||
| nanosecond | 1970-01-01 00:00:00 |
|
||||
| YEAR | year 0 |
|
||||
| QUARTER | 1900 Q1 |
|
||||
| MONTH | 1900 January |
|
||||
| WEEK | 1970, 1st week (01-05) |
|
||||
| DAY | 1970-01-01 |
|
||||
| HOUR | (*) |
|
||||
| MINUTE | 1970-01-01 00:00:00 |
|
||||
| SECOND | 1970-01-01 00:00:00 |
|
||||
| MILLISECOND | 1970-01-01 00:00:00 |
|
||||
| MICROSECOND | 1970-01-01 00:00:00 |
|
||||
| NANOSECOND | 1970-01-01 00:00:00 |
|
||||
|
||||
(*) hour intervals are special: the calculation is always performed relative to 00:00:00 (midnight) of the current day. As a result, only
|
||||
hour values between 1 and 23 are useful.
|
||||
|
||||
If unit `week` was specified, `toStartOfInterval` assumes that weeks start on Monday. Note that this behavior is different from that of function `toStartOfWeek` in which weeks start by default on Sunday.
|
||||
If unit `WEEK` was specified, `toStartOfInterval` assumes that weeks start on Monday. Note that this behavior is different from that of function `toStartOfWeek` in which weeks start by default on Sunday.
|
||||
|
||||
**See Also**
|
||||
|
||||
|
Loading…
Reference in New Issue
Block a user