Previously, only this syntax to create a skip index worked:
INDEX index_name column_name TYPE vector_similarity('hnsw', 'L2Distance')
Now, this syntax will work as well:
INDEX index_name column_name TYPE vector_similarity(hnsw, L2Distance)
USearch (similar to FAISS) allows to specify the distance function,
quantization, and various HNSW meta-parameters for index creation and
sarch. Some users wished for greater configurability, so let's expose
them.
Index creation now requires either
- 2 parameters (with the other 4 parameters taking on default values), or
- 6 parameters for full control
This commit also remove quantization `f64` (that would be upsampling).
First, index type "vector_similarity" is more speaking and user-friendly
than "usearch". Second, we should not expose the name of the library
doing the job (usearch). Of course, the docs will continue to mention
usearch (credit where credit is due).
Existing setting `allow_experimental_usearch_index` was marked obsolete.
A new settings `allow_experimental_vector_similarity_index` was added.
These kind of vector search similarity queries are rather obscure and
rare in practice. They require the user to specify a maximum distance
which is not intuitive to obtain. Furthermore, these queries are not
natively supported in USearch, so the vector search index had to emulate
these queries.
Therefore simplifying the code base and restricting vector search to
ORDER-BY queries only.
Annoy indexes fell out of favor in the community, at least when it comes
to vector databases. Such indexes work okay-ish low dimensions but they
suffers badly from a curse of dimensionality which makes them inapt for
a high number of dimensions.
Now that Annoy is gone, issue (*) also disappears and we can drop
'no-ubsan', 'no-cpu-aarch64', and 'no-asan' from tests.
(*) spotify/annoy#456
According to the documentation, the default mode was 'ordered' before version 24.6. Starting from version 24.6, there is no default value for mode. Using mode = 'unordered' can be confusing.