Commit Graph

62 Commits

Author SHA1 Message Date
Robert Schulze
38b5ea9066
Fix docs 2024-09-12 12:43:27 +00:00
Robert Schulze
fe5e061fff
Some fixups 2024-09-12 10:38:14 +00:00
Robert Schulze
7a98f7fecc
Add testcase for ANN index usage with subquery 2024-09-04 10:21:19 +00:00
Robert Schulze
c40c8b7adb
Enable bf16 + f64 quantization, make bf16 the default 2024-08-23 07:32:34 +00:00
Robert Schulze
38a2b0dcc7
Allow Array(Float64) as type of underlying column 2024-08-15 10:47:55 +00:00
Robert Schulze
6170a8663f
Bump usearch to 2.13.2 2024-08-14 08:04:00 +00:00
Robert Schulze
fb76cb90b1
Allow un-quoted skip index parameters
Previously, only this syntax to create a skip index worked:

   INDEX index_name column_name TYPE vector_similarity('hnsw', 'L2Distance')

Now, this syntax will work as well:

   INDEX index_name column_name TYPE vector_similarity(hnsw, L2Distance)
2024-08-12 15:32:25 +00:00
Robert Schulze
d2e79f0b92
Rework vector index parameters
USearch (similar to FAISS) allows specifying the distance function,
quantization, and various HNSW meta-parameters for index creation and
search. Some users wished for greater configurability, so let's expose
them.

Index creation now requires either
- 2 parameters (with the other 4 parameters taking on default values), or
- 6 parameters for full control

This commit also removes quantization `f64` (that would be upsampling).
2024-08-12 15:32:19 +00:00
Robert Schulze
785b6637fa
Rename index type "usearch" to "vector_similarity"
First, index type "vector_similarity" is more descriptive and
user-friendly than "usearch". Second, we should not expose the name of
the library doing the job (usearch). Of course, the docs will continue
to mention usearch (credit where credit is due).

The existing setting `allow_experimental_usearch_index` was marked obsolete.
A new setting `allow_experimental_vector_similarity_index` was added.
2024-08-12 15:30:45 +00:00
Robert Schulze
40bed3e20f
Remove support for WHERE-type queries
These kinds of vector similarity search queries are rather obscure and
rare in practice. They require the user to specify a maximum distance,
which is not intuitive to obtain. Furthermore, these queries are not
natively supported in USearch, so the vector search index had to emulate
them.

Therefore, simplify the code base and restrict vector search to
ORDER-BY queries only.
2024-08-12 15:25:52 +00:00
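The difference between the two query shapes described in the commit above can be illustrated with a minimal brute-force sketch in Python. This is not how USearch or the ClickHouse index works internally; the function names (`order_by_search`, `where_search`) and the sample rows are hypothetical and serve only to show why a WHERE-type query needs a hand-picked maximum distance while an ORDER-BY-type query only needs a result count.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equally sized vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def order_by_search(rows, reference, limit):
    """Mimics: ORDER BY L2Distance(vec, reference) LIMIT limit.

    The caller only chooses how many neighbours to return.
    """
    return sorted(rows, key=lambda row: l2_distance(row[1], reference))[:limit]


def where_search(rows, reference, max_distance, limit):
    """Mimics: WHERE L2Distance(vec, reference) < max_distance LIMIT limit.

    The caller must know a sensible max_distance in advance.
    """
    matches = [row for row in rows if l2_distance(row[1], reference) < max_distance]
    return matches[:limit]


# Hypothetical sample data: (id, embedding) pairs.
rows = [(1, [0.0, 0.0]), (2, [1.0, 0.0]), (3, [3.0, 4.0]), (4, [0.5, 0.5])]
reference = [0.0, 0.0]

nearest = order_by_search(rows, reference, limit=2)
within = where_search(rows, reference, max_distance=0.1, limit=2)
```

With this data, the ORDER-BY form returns the two closest rows regardless of scale, whereas the WHERE form returns fewer rows than requested because the chosen threshold happens to be too tight, which is the usability problem the commit message points at.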
Robert Schulze
218421c255
Remove Annoy indexes
Annoy indexes fell out of favor in the community, at least when it comes
to vector databases. Such indexes work okay-ish in low dimensions but
they suffer badly from the curse of dimensionality, which makes them
unsuitable for a high number of dimensions.

Now that Annoy is gone, issue (*) also disappears and we can drop
'no-ubsan', 'no-cpu-aarch64', and 'no-asan' from tests.

(*) spotify/annoy#456
2024-08-12 15:24:49 +00:00
Robert Schulze
b00c64fe9d
Docs: Remove tuple support from ANN indexes
Indexes for approximate nearest neighbor (ANN) search (Annoy,
USearch) can currently be built on columns of type Array(Float32) or
Tuple(Float32[, Float32[, ...]]). In practice, only arrays are relevant,
which makes sense as arrays store high-dimensional embeddings
consecutively and the additional flexibility of different data types in
a tuple is not needed.

Therefore, remove support for ANN indexes over tuple columns to
simplify the code, tests, and docs.
2024-05-06 14:18:30 +00:00
Nikolai Fedorovskikh
a98af159b5 [Docs] fix some typos and missing commas 2024-02-13 02:10:41 +01:00
Robert Schulze
10cee14bce
Fix docs 2023-09-25 17:09:58 +00:00
Robert Schulze
3562204850
+ docs 2023-09-25 17:08:37 +00:00
Robert Schulze
c2a593baf9
Split tests 2023-09-21 09:17:38 +00:00
Robert Schulze
945179be46
Annoy: Fix LOGICAL_ERROR with default values #52258 2023-09-14 19:23:32 +00:00
Robert Schulze
e018f1d913
Fix spelling 2023-09-14 12:42:22 +00:00
Michael Kolupaev
8997464867
Small usearch index improvements: metrics and f16 2023-09-14 11:24:47 +00:00
Tian Xinhui
76016d9593
Merge branch 'master' into fix-annoy-index-update 2023-08-22 16:58:29 +08:00
Robert Schulze
066ec559ad
Do not reset Annoy index during build-up with > 1 mark 2023-08-21 10:05:43 +00:00
Robert Schulze
e09098c5a1
Update annindexes.md 2023-08-18 15:01:13 +02:00
Ash Vardanian
9a4254616a
Docs: Extend USearch implementation details 2023-08-18 12:45:34 +01:00
Robert Schulze
ac7a27b7ff
Update docs 2023-08-17 10:03:58 +00:00
Davit Vardanyan
c685ac3df9 Docs: Update USearch docs 2023-08-16 17:31:46 +04:00
Davit Vardanyan
48c62fd75e Add: USearch 2023-08-15 16:00:27 +04:00
Robert Schulze
2d3bf55d45
Docs: Update table name in ANN docs 2023-08-14 08:50:20 +00:00
Robert Schulze
1c3f4d3719
+ , 2023-08-14 07:46:15 +00:00
Robert Schulze
f71ce2641c
Fix copyright issues in ANN docs 2023-08-14 07:36:27 +00:00
Robert Schulze
385332a554
Docs: Update anchors in ANN indexes docs 2023-08-14 07:10:50 +00:00
daviddhc20120601
10662b6425
Update annindexes.md: explain more about L2 distance and cosine distance
2023-08-01 15:35:25 +08:00
Robert Schulze
f4cfb9e370
Fix line break 2023-06-14 07:50:26 +00:00
Robert Schulze
b1f0a91b78
Docs: Fix embedded video link 2023-06-14 07:48:08 +00:00
Robert Schulze
22670aebf8
Merge pull request #50942 from ClickHouse/fix-ann-page
Fixes to ANN docs page
2023-06-13 17:57:00 +02:00
Dan Roscigno
5d1ea9aa18
Merge branch 'master' into fix-ann-page 2023-06-13 11:32:58 -04:00
Dan Roscigno
ecedceea1e
Merge branch 'master' into Fix_heading_sidebar_azureBlobStorage 2023-06-13 11:27:47 -04:00
DanRoscigno
b850f1b999 fix broken line 2023-06-13 11:26:12 -04:00
Dan Roscigno
20ea87e527
Update annindexes.md
Don't break code snippets across lines.
2023-06-13 11:17:33 -04:00
rfraposa
57cdd3a25d Update annindexes.md 2023-06-13 09:13:13 -06:00
Smita Kulkarni
c253c70510 Fix for MDXContent 2023-06-13 16:33:36 +02:00
Robert Schulze
4f39ee51ae
Update Annoy docs 2023-06-12 20:06:57 +00:00
Robert Schulze
b8178088d0
Misc Annoy fixes 2023-06-08 15:06:17 +00:00
Robert Schulze
ce8b39487e
Update docs/en/engines/table-engines/mergetree-family/annindexes.md
Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>
2023-06-06 09:55:50 +02:00
Robert Schulze
4c88b7bbb7
Further improve ANN index docs 2023-06-05 13:13:49 +00:00
Robert Schulze
660760782a
Rewrite ANN docs 2023-06-05 09:30:55 +00:00
Robert Schulze
a973ac5dbb
Replace weird generic ANN setting by Annoy-specific parameter 2023-06-05 09:30:35 +00:00
Robert Schulze
65cc92a78d
CI: Fix aspell on nested docs 2023-06-02 12:24:41 +00:00
Dan Roscigno
0286b43a73
Fix 404 on approx nearest neighbor page
closes https://github.com/ClickHouse/clickhouse-docs/issues/846
2023-02-28 09:32:06 -05:00
Robert Schulze
8655d1d4d3
Update annindexes.md 2022-11-28 11:23:08 +01:00
Robert Schulze
d587b3f978
Minor fixes in annoy index documentation 2022-11-28 09:09:09 +00:00