Commit Graph

1105 Commits

Author SHA1 Message Date
Robert Schulze
fb76cb90b1
Allow un-quoted skip index parameters
Previously, only this syntax to create a skip index worked:

   INDEX index_name column_name TYPE vector_similarity('hnsw', 'L2Distance')

Now, this syntax will work as well:

   INDEX index_name column_name TYPE vector_similarity(hnsw, L2Distance)
2024-08-12 15:32:25 +00:00
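For context, the index clause from commit fb76cb90b1 might sit inside a table definition roughly as follows. This is only a sketch: the table, column, and index names (`tab`, `id`, `vec`, `idx`) are made up for illustration, and the choice of `Array(Float32)` for the vector column is an assumption. The setting name is taken from the rename commit listed further down this log.

```sql
-- Sketch only: tab, id, vec and idx are hypothetical names.
-- The experimental setting is the one introduced by commit 785b6637fa.
SET allow_experimental_vector_similarity_index = 1;

CREATE TABLE tab
(
    id Int32,
    vec Array(Float32),  -- assumed column type for the vectors
    -- Un-quoted parameters, the form this commit makes valid.
    -- The quoted form vector_similarity('hnsw', 'L2Distance') keeps working.
    INDEX idx vec TYPE vector_similarity(hnsw, L2Distance)
)
ENGINE = MergeTree
ORDER BY id;
```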
Robert Schulze
d2e79f0b92
Rework vector index parameters
USearch (similar to FAISS) allows specifying the distance function,
quantization, and various HNSW meta-parameters for index creation and
search. Some users wished for greater configurability, so let's expose
them.

Index creation now requires either
- 2 parameters (with the other 4 parameters taking on default values), or
- 6 parameters for full control

This commit also removes quantization `f64` (that would be upsampling).
2024-08-12 15:32:19 +00:00
Robert Schulze
785b6637fa
Rename index type "usearch" to "vector_similarity"
First, index type "vector_similarity" is more descriptive and user-friendly
than "usearch". Second, we should not expose the name of the library
doing the job (usearch). Of course, the docs will continue to mention
usearch (credit where credit is due).

Existing setting `allow_experimental_usearch_index` was marked obsolete.
A new setting `allow_experimental_vector_similarity_index` was added.
2024-08-12 15:30:45 +00:00
Robert Schulze
40bed3e20f
Remove support for WHERE-type queries
This kind of vector similarity search query is rather obscure and
rare in practice. Such queries require the user to specify a maximum
distance, which is not intuitive to obtain. Furthermore, these queries are
not natively supported in USearch, so the vector search index had to
emulate them.

Therefore, simplify the code base and restrict vector search to
ORDER-BY queries only.
2024-08-12 15:25:52 +00:00
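To illustrate the two query shapes that commit 40bed3e20f distinguishes, here is a rough sketch. The table and column names (`tab`, `id`, `vec`), the reference vector, and the distance threshold are all hypothetical. Whether a given query actually uses the index depends on details not covered in this log.

```sql
-- Sketch only: tab, id, vec and all literal values are hypothetical.

-- ORDER-BY-type query: return the 10 rows nearest to a reference vector.
-- This is the shape the vector index remains restricted to.
SELECT id
FROM tab
ORDER BY L2Distance(vec, [0.1, 0.2, 0.3])
LIMIT 10;

-- WHERE-type query: requires picking a maximum distance (0.5 here) up front.
-- This is the shape whose index support the commit removes.
-- SELECT id
-- FROM tab
-- WHERE L2Distance(vec, [0.1, 0.2, 0.3]) < 0.5
-- LIMIT 10;
```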
Robert Schulze
218421c255
Remove Annoy indexes
Annoy indexes fell out of favor in the community, at least when it comes
to vector databases. Such indexes work okay-ish in low dimensions but they
suffer badly from the curse of dimensionality, which makes them unsuitable
for a high number of dimensions.

Now that Annoy is gone, issue (*) also disappears and we can drop
'no-ubsan', 'no-cpu-aarch64', and 'no-asan' from tests.

(*) spotify/annoy#456
2024-08-12 15:24:49 +00:00
Justin de Guzman
0071765138
Merge pull request #67940 from ClickHouse/prometheus-documentation
Add documentation for Prometheus protocols and TimeSeries engine.
2024-08-09 01:03:26 +00:00
Robert Schulze
076c4a9ce9
Merge pull request #67930 from rschu1ze/fix-stat-assert
Fix stress test error with TDigest statistics
2024-08-08 16:34:58 +00:00
Robert Schulze
37641a0b4b
Merge remote-tracking branch 'ClickHouse/master' into fix-stat-assert 2024-08-08 08:57:22 +00:00
János Benjamin Antal
92be2db5b6 Merge remote-tracking branch 'origin/master' into kafka-zookeeper 2024-08-08 08:01:43 +00:00
Kseniia Sumarokova
315fd5496a
Merge pull request #65386 from skyoct/feat-s3-field
Feat add _etag for object storage
2024-08-07 17:35:43 +00:00
Vitaly Baranov
bf33aabec4 Add documentation.
(cherry picked from commit 083fff6ed6)
2024-08-06 20:15:51 +02:00
Robert Schulze
d09c82ff76
Cosmetics II 2024-08-06 12:36:09 +00:00
János Benjamin Antal
7aff8748b0 Address small review comments 2024-07-31 18:08:19 +00:00
János Benjamin Antal
23fa85e3ff
Apply suggestions from code review
Co-authored-by: Kseniia Sumarokova <54203879+kssenii@users.noreply.github.com>
2024-07-31 19:30:58 +02:00
János Benjamin Antal
b9670c782f Merge remote-tracking branch 'origin/master' into kafka-zookeeper 2024-07-29 14:42:41 +00:00
Shri Bodas
d0c4c4151c
Update keepermap.md
Needs quotes around keeper path
2024-07-25 14:24:28 -07:00
JackyWoo
245359e536 Merge branch 'master' into add_statistics_cmsketch 2024-07-16 09:45:31 +08:00
János Benjamin Antal
1ecfba837e Rename experimental flag to allow_experimental_kafka_offsets_storage_in_keeper 2024-07-15 09:03:05 +00:00
János Benjamin Antal
4f98df7f49 Merge remote-tracking branch 'origin/master' into kafka-zookeeper 2024-07-15 08:32:29 +00:00
János Benjamin Antal
b5b944b4e6
Improve wording of docs based on review comments
Co-authored-by: Kseniia Sumarokova <54203879+kssenii@users.noreply.github.com>
2024-07-15 10:15:32 +02:00
Alexander Gololobov
74cc20b286
Make spellcheck happy 2024-07-10 12:18:50 +02:00
Alexander Gololobov
56c751a10a
Update docs/en/engines/table-engines/integrations/s3queue.md
Co-authored-by: Kseniia Sumarokova <54203879+kssenii@users.noreply.github.com>
2024-07-10 12:17:39 +02:00
Alexander Gololobov
73c4eaa0f2
Clarify ordered mode description for s3Queue 2024-07-10 12:09:32 +02:00
Kseniia Sumarokova
292113e32b
Merge branch 'master' into feat-s3-field 2024-07-04 12:21:33 +02:00
JackyWoo
9036ce9725 Some fixups after merging 2024-07-04 15:38:33 +08:00
JackyWoo
0c5821e5b8 Merge branch 'master' into add_statistics_cmsketch
# Conflicts:
#	docs/en/engines/table-engines/mergetree-family/mergetree.md
#	src/Storages/Statistics/Statistics.cpp
#	src/Storages/Statistics/Statistics.h
#	src/Storages/Statistics/StatisticsTDigest.h
#	src/Storages/Statistics/StatisticsUniq.h
#	src/Storages/Statistics/TDigestStatistics.cpp
#	tests/queries/0_stateless/02864_statistics_uniq.sql
2024-07-04 10:25:53 +08:00
Robert Schulze
2cefa56f9b
Update docs 2024-07-03 10:13:15 +00:00
Nathan Clevenger
7a5bf2fc24
Fixed types
Replaced RabbitMQ with NATS
2024-07-01 15:14:37 -05:00
Smita Kulkarni
1f768f4dd7 Basic docs for azure blob storage authentication 2024-06-28 21:54:33 +02:00
Robert Schulze
cc67efd789
Some fixups 2024-06-26 12:39:50 +00:00
Kseniia Sumarokova
d6a351bf0d
Merge branch 'master' into feat-s3-field 2024-06-26 13:42:11 +02:00
kssenii
2ba697dcae Merge remote-tracking branch 'origin' into add-azure-queue-storage 2024-06-26 13:05:56 +02:00
János Benjamin Antal
8cc25827ed Extend known limitations 2024-06-24 08:30:37 +00:00
kssenii
5904847316 Fix tests 2024-06-20 18:32:00 +02:00
skyoct
7523d8b1aa Feat add docs 2024-06-19 21:24:26 +08:00
Kseniia Sumarokova
8827fc7c21
Merge pull request #65430 from allegrinisante/patch-1
Mode value = 'unordered' may lead to confusion
2024-06-19 11:20:02 +00:00
Raúl Marín
02f1946fc0
Merge pull request #64820 from lsj4401/patch-1
fix typo
2024-06-19 10:42:40 +00:00
allegrinisante
619333b356
Mode value = 'unordered' may lead to confusion
According to the documentation, the default mode was 'ordered' before version 24.6. Starting from version 24.6, there is no default value for mode. Using mode = 'unordered' can be confusing.
2024-06-19 11:44:38 +02:00
János Benjamin Antal
a8ba4f3d1b
Merge branch 'master' into kafka-zookeeper 2024-06-18 23:07:25 +02:00
János Benjamin Antal
29546d1655 Add minimal docs 2024-06-18 20:17:32 +00:00
Kseniia Sumarokova
3f0211a2f8
Update s3queue.md 2024-06-13 13:49:47 +02:00
Kseniia Sumarokova
00e58f522f
Update s3queue.md 2024-06-11 10:35:55 +02:00
Kseniia Sumarokova
f27e92c97b
Update s3queue.md 2024-06-11 10:34:19 +02:00
Kseniia Sumarokova
9a55cdf77c
Merge pull request #64947 from ilejn/time_virtual_col
Add _time virtual column to file alike storages
2024-06-10 12:14:46 +00:00
Robert Schulze
d59a170144
Docs for MergeTree: Capitalized SETTINGS 2024-06-10 07:05:36 +00:00
Robert Schulze
cdd2957a31
Move MergeTree setting docs into MergeTree settings docs page 2024-06-09 19:09:33 +00:00
Nikita Mikhaylov
756ac16170 Merge branch 'master' of github.com:ClickHouse/ClickHouse into show-create-system-tables 2024-06-07 18:00:33 +02:00
Ilya Golshtein
258b1f9559 time_virtual_col: tests, doc, small refactoring 2024-06-06 21:00:47 +00:00
Han Fei
ac430bb01d
Merge pull request #59357 from hanfei1991/hanfei/stats_uniq
Introduce statistics of type "number of distinct values"
2024-06-05 12:56:52 +00:00
Yarik Briukhovetskyi
fbe29c0737
remove trailing whitespace 2024-06-05 13:16:38 +02:00