Commit Graph

522 Commits

Author SHA1 Message Date
Robert Schulze
46a3e3e795
Docs: Update vector search docs 2024-11-25 12:37:43 +00:00
kellytoole
cad22e7a2d Documenting MergeTree's ttl_only_drop_parts setting, and updating reference to it. 2024-11-18 15:06:08 -08:00
Linh Giang
864f35dd11
Update aggregatingmergetree.md to include video
Added relevant video to the page.
2024-11-15 11:30:01 -07:00
Robert Schulze
e7ad525e00
Re-introduce support for legacy index creation syntax 2024-11-07 10:44:00 +00:00
Raúl Marín
88dec86bd4
Update replacingmergetree.md 2024-10-25 23:08:58 +02:00
Dale Mcdiarmid
ddd6eea267 call out guide 2024-10-25 17:56:49 +01:00
Dale Mcdiarmid
67ba0433d9 fix comment 2024-10-25 17:52:30 +01:00
Dale Mcdiarmid
2b9d59c086 note on final + link to guide 2024-10-25 17:05:05 +01:00
Diana Carroll
b6c959846d
Minor cleanup of aggregatingmergetree.md
Fix several small typos and formatting inconsistencies, and clean up wording.
2024-10-24 14:59:03 -04:00
Robert Schulze
5f94239f99
Update default HNSW parameter settings 2024-10-21 07:30:25 +00:00
Robert Schulze
c69cd58ec8
Docs cosmetics 2024-10-17 12:16:46 +00:00
Robert Schulze
4c3de5ceb4
Update docs 2024-10-17 12:15:14 +00:00
Robert Schulze
d8ea2198b7
Better 2024-10-14 17:50:47 +00:00
Robert Schulze
395ad883af
Better 2024-10-14 17:35:42 +00:00
Robert Schulze
b701a94750
Query-time ef_search 2024-10-14 09:04:24 +00:00
alsu
d680e52d00 fix some typos 2024-10-04 16:10:06 +02:00
Denny Crane
a05273ef02 fix broken link 2024-09-22 12:02:46 -03:00
Robert Schulze
38b5ea9066
Fix docs 2024-09-12 12:43:27 +00:00
Robert Schulze
fe5e061fff
Some fixups 2024-09-12 10:38:14 +00:00
Robert Schulze
c4720d9728
Minor doc fixups 2024-09-09 11:18:42 +00:00
JackyWoo
6539cbd1ce Add CountMin to dictionary file and a little fixup 2024-09-09 17:34:15 +08:00
JackyWoo
eb800d9e39 Rename statistics of type count_min to countmin 2024-09-09 11:45:39 +08:00
Robert Schulze
a73eb1c177
Merge pull request #67013 from JackyWoo/add_statistics_minmax
Add `min_max` statistics type
2024-09-06 16:14:04 +00:00
JackyWoo
3974e9060a Fix docs and some fixups 2024-09-06 18:50:02 +08:00
Robert Schulze
7a98f7fecc
Add testcase for ANN index usage with subquery 2024-09-04 10:21:19 +00:00
JackyWoo
a9b6c04705 Merge branch 'master' into add_statistics_minmax
# Conflicts:
#	src/Storages/Statistics/Statistics.cpp
2024-08-29 14:35:33 +08:00
Robert Schulze
9fb4c23c06
Merge pull request #68678 from rschu1ze/usearch-2.14
Vector similarity index: make `bf16` the default quantization
2024-08-28 08:45:02 +00:00
Robert Schulze
c40c8b7adb
Enable bf16 + f64 quantization, make bf16 the default 2024-08-23 07:32:34 +00:00
JackyWoo
2502ca766f Merge branch 'master' into add_statistics_minmax
# Conflicts:
#	src/Storages/Statistics/ConditionSelectivityEstimator.cpp
#	src/Storages/Statistics/Statistics.cpp
#	src/Storages/Statistics/Statistics.h
#	src/Storages/Statistics/StatisticsCountMinSketch.cpp
#	src/Storages/Statistics/StatisticsCountMinSketch.h
#	src/Storages/Statistics/StatisticsTDigest.cpp
#	src/Storages/Statistics/StatisticsTDigest.h
#	src/Storages/Statistics/StatisticsUniq.cpp
#	src/Storages/Statistics/StatisticsUniq.h
2024-08-21 10:56:23 +08:00
leonkozlowski
e416a2b3d2 patch: fix reference to sorting key in primary key docs 2024-08-20 09:42:19 -04:00
Robert Schulze
38a2b0dcc7
Allow Array(Float64) as type of underlying column 2024-08-15 10:47:55 +00:00
Robert Schulze
6170a8663f
Bump usearch to 2.13.2 2024-08-14 08:04:00 +00:00
Robert Schulze
fb76cb90b1
Allow un-quoted skip index parameters
Previously, only this syntax to create a skip index worked:

   INDEX index_name column_name TYPE vector_similarity('hnsw', 'L2Distance')

Now, this syntax will work as well:

   INDEX index_name column_name TYPE vector_similarity(hnsw, L2Distance)
2024-08-12 15:32:25 +00:00
Robert Schulze
d2e79f0b92
Rework vector index parameters
USearch (similar to FAISS) allows specifying the distance function,
quantization, and various HNSW meta-parameters for index creation and
search. Some users wished for greater configurability, so let's expose
them.

Index creation now requires either
- 2 parameters (with the other 4 parameters taking on default values), or
- 6 parameters for full control

This commit also removes quantization `f64` (that would be upsampling).
2024-08-12 15:32:19 +00:00
Robert Schulze
785b6637fa
Rename index type "usearch" to "vector_similarity"
First, index type "vector_similarity" is more descriptive and user-friendly
than "usearch". Second, we should not expose the name of the library
doing the job (usearch). Of course, the docs will continue to mention
usearch (credit where credit is due).

The existing setting `allow_experimental_usearch_index` was marked obsolete.
A new setting `allow_experimental_vector_similarity_index` was added.
2024-08-12 15:30:45 +00:00
Robert Schulze
40bed3e20f
Remove support for WHERE-type queries
These kinds of vector search similarity queries are rather obscure and
rare in practice. They require the user to specify a maximum distance,
which is not intuitive to obtain. Furthermore, these queries are not
natively supported in USearch, so the vector search index had to emulate
them.

Therefore, simplify the code base and restrict vector search to
ORDER-BY queries only.
2024-08-12 15:25:52 +00:00
Robert Schulze
218421c255
Remove Annoy indexes
Annoy indexes fell out of favor in the community, at least when it comes
to vector databases. Such indexes work okay-ish in low dimensions but they
suffer badly from the curse of dimensionality, which makes them unsuitable
for a high number of dimensions.

Now that Annoy is gone, issue (*) also disappears and we can drop
'no-ubsan', 'no-cpu-aarch64', and 'no-asan' from tests.

(*) spotify/annoy#456
2024-08-12 15:24:49 +00:00
Robert Schulze
d09c82ff76
Cosmetics II 2024-08-06 12:36:09 +00:00
JackyWoo
8259a9827e Update reference file for tests 2024-08-06 10:29:42 +08:00
JackyWoo
46da03030c Add test for implicit type conversion 2024-08-06 10:05:45 +08:00
JackyWoo
4fa30da118 Fix docs 2024-08-06 09:56:38 +08:00
JackyWoo
a36424fc8c Add supported data types to documents 2024-08-05 18:59:09 +08:00
Robert Schulze
7765ff6d52
Minor fixups 2024-08-05 07:51:58 +00:00
JackyWoo
d1305d9fad Some fixups and split tests 2024-08-05 11:26:57 +08:00
JackyWoo
5ae356e6df Add document for min_max statistics 2024-07-24 17:54:48 +08:00
JackyWoo
9036ce9725 Some fixups after merging 2024-07-04 15:38:33 +08:00
JackyWoo
0c5821e5b8 Merge branch 'master' into add_statistics_cmsketch
# Conflicts:
#	docs/en/engines/table-engines/mergetree-family/mergetree.md
#	src/Storages/Statistics/Statistics.cpp
#	src/Storages/Statistics/Statistics.h
#	src/Storages/Statistics/StatisticsTDigest.h
#	src/Storages/Statistics/StatisticsUniq.h
#	src/Storages/Statistics/TDigestStatistics.cpp
#	tests/queries/0_stateless/02864_statistics_uniq.sql
2024-07-04 10:25:53 +08:00
Robert Schulze
2cefa56f9b
Update docs 2024-07-03 10:13:15 +00:00
Robert Schulze
cc67efd789
Some fixups 2024-06-26 12:39:50 +00:00
Robert Schulze
d59a170144
Docs for MergeTree: Capitalized SETTINGS 2024-06-10 07:05:36 +00:00