Robert Schulze
2ffcc97af2
Merge pull request #63675 from rschu1ze/vector-search
...
Initial implementation of vector similarity index
2024-08-13 15:06:20 +00:00
Yakov Olkhovskiy
3e8a177622
Merge pull request #61908 from ClickHouse/ci-fuzzer-enable
...
CI: enable libfuzzer (fixing build and docker)
2024-08-13 14:22:09 +00:00
Yarik Briukhovetskyi
39c25663ae
Merge pull request #67879 from bigo-sg/opt_orc_writer
...
Avoid allocating unnecessary capacity for array column while writing orc & some minor refactors
2024-08-13 12:51:11 +00:00
Robert Schulze
99282e526a
Merge pull request #68235 from sakulali/query_cache_tag
...
QueryCache: Add tagging
2024-08-13 10:44:10 +00:00
Yarik Briukhovetskyi
086c0f03a6
Merge pull request #65997 from yariks5s/hive_style_partitioning
...
Implementing Hive-style partitioning
2024-08-13 10:04:21 +00:00
vdimir
dfb892ba5f
Merge pull request #66616 from Blargian/docs_getXYZ
...
add documentation for `getSubcolumn` and `getTypeSerializationStreams`
2024-08-13 09:35:44 +00:00
Pablo Marcos
d28e2d7546
Merge pull request #68203 from pamarcos/fix-test-01903_correct_block_size_prediction_with_default
...
[Green CI] Fix test 01903_correct_block_size_prediction_with_default
2024-08-13 08:11:35 +00:00
pufit
ae5223854f
Merge pull request #67653 from ClickHouse/pufit/inconsistent-formating-grant-current-grants
...
Fix inconsistent formatting for `GRANT CURRENT GRANTS`
2024-08-13 03:21:26 +00:00
Yakov Olkhovskiy
a517bc90cd
Update PULL_REQUEST_TEMPLATE.md
2024-08-12 21:42:47 -04:00
Alexey Milovidov
203857020f
Merge pull request #68178 from ClickHouse/fix-68177
...
Fix `test_cluster_all_replicas`
2024-08-12 21:52:13 +00:00
János Benjamin Antal
ac6826392d
Merge pull request #67554 from ClickHouse/fix-message-queue-sink-from-http-interface
...
Fix message queue sink from http interface
2024-08-12 21:29:14 +00:00
János Benjamin Antal
6eb4a71ad3
Merge pull request #68163 from azat/backups-processes
...
[RFC] Fix settings/current_database in system.processes for async BACKUP/RESTORE
2024-08-12 21:07:55 +00:00
János Benjamin Antal
eaa5715a02
Merge pull request #68200 from ClickHouse/remove-log-engine-from-kafka-integration-tests
...
Remove Log engine from Kafka integration tests
2024-08-12 20:44:53 +00:00
Han Fei
40382451a2
Merge pull request #68186 from rschu1ze/stats-tests-refactoring
...
Refactor tests for (experimental) statistics
2024-08-12 18:58:19 +00:00
Robert Schulze
45a14fa0ce
Fix spelling
2024-08-12 18:54:06 +00:00
Robert Schulze
d03b354550
Merge pull request #67964 from rschu1ze/multiquery-followup-new2
...
Remove obsolete `--multiquery` parameter (follow-up to #63898), pt. III
2024-08-12 18:42:53 +00:00
Shaun Struwig
eab8594570
Update aspell-dict.txt
2024-08-12 20:35:33 +02:00
Shaun Struwig
aa7a2bcb02
Fix typo
2024-08-12 20:34:02 +02:00
Robert Schulze
c22265b889
Some fixups
2024-08-12 17:45:38 +00:00
Kruglov Pavel
ba85cc8d59
Merge pull request #67043 from Avogar/improve-squashing
...
Improve columns squashing for String/Array/Map/Variant/Dynamic types
2024-08-12 17:14:15 +00:00
Antonio Andelic
ea9b7d4c27
Merge pull request #67998 from ClickHouse/minio-audit-logs
...
Collect minio audit logs in stateless tests
2024-08-12 17:03:17 +00:00
Alexey Milovidov
74fb08198c
Merge branch 'master' into fix-68177
2024-08-12 18:39:06 +02:00
Alexey Milovidov
58ed71bf11
Merge pull request #68181 from ClickHouse/fix-leftovers
...
Fix leftovers
2024-08-12 16:22:17 +00:00
Alexey Milovidov
c95a40cdf0
Merge pull request #68182 from ClickHouse/fix-transactions
...
Fix test `01172_transaction_counters`
2024-08-12 16:20:26 +00:00
Pablo Marcos
858b7e55d0
Improve condition in case the default column consumes slightly more memory
...
It never happened in the few hundred tests I ran successfully,
but we'd rather be safe than sorry.
2024-08-12 16:16:57 +00:00
Robert Schulze
fe537045c9
Merge remote-tracking branch 'ClickHouse/master' into query_cache_tag
2024-08-12 16:16:32 +00:00
Yarik Briukhovetskyi
3a6e05eb43
try to fix includes
2024-08-12 18:03:42 +02:00
Yarik Briukhovetskyi
ea1cd66575
fix tidy
2024-08-12 17:32:43 +02:00
Robert Schulze
fb76cb90b1
Allow un-quoted skip index parameters
...
Previously, only this syntax to create a skip index worked:
INDEX index_name column_name TYPE vector_similarity('hnsw', 'L2Distance')
Now, this syntax will work as well:
INDEX index_name column_name TYPE vector_similarity(hnsw, L2Distance)
2024-08-12 15:32:25 +00:00
Robert Schulze
d2e79f0b92
Rework vector index parameters
...
USearch (similar to FAISS) allows specifying the distance function,
quantization, and various HNSW meta-parameters for index creation and
search. Some users wished for greater configurability, so let's expose
them.
Index creation now requires either
- 2 parameters (with the other 4 parameters taking on default values), or
- 6 parameters for full control
This commit also removes quantization `f64` (that would be upsampling).
2024-08-12 15:32:19 +00:00
Robert Schulze
cc5c64e1ed
Add migration helper for legacy 'annoy' and 'usearch' indexes types
...
Index types 'annoy' and 'usearch' were removed and replaced by
'vector_similarity' indexes in an earlier commit.
This means, unfortunately, that if customers have tables with these
indexes and upgrade, their database might not start anymore - the
system loads the metadata at startup, thinks something is wrong with
such tables, and halts immediately.
This commit adds support for loading and attaching such indexes back.
Inserting data or using (searching) them returns an error which recommends a migration
to 'vector_similarity' indexes. The implementation is generally similar
to what has recently been implemented for 'full_text' indexes [1, 2].
[1] https://github.com/ClickHouse/ClickHouse/pull/64656
[2] https://github.com/ClickHouse/ClickHouse/pull/64846
2024-08-12 15:31:27 +00:00
Robert Schulze
785b6637fa
Rename index type "usearch" to "vector_similarity"
...
First, index type "vector_similarity" is more descriptive and user-friendly
than "usearch". Second, we should not expose the name of the library
doing the job (usearch). Of course, the docs will continue to mention
usearch (credit where credit is due).
The existing setting `allow_experimental_usearch_index` was marked obsolete.
A new setting `allow_experimental_vector_similarity_index` was added.
2024-08-12 15:30:45 +00:00
Robert Schulze
021fad920e
Cosmetics: minor stuff
2024-08-12 15:30:41 +00:00
Robert Schulze
2aa037985b
Cosmetics: simplify inheritance hierarchy
2024-08-12 15:30:38 +00:00
Robert Schulze
901906159d
Cosmetics: ApproximateNearestNeighborInformation --> Info + nest in class
2024-08-12 15:30:35 +00:00
Robert Schulze
6170aad43e
Cosmetics: ApproximateNearestNeighborIndexesCommon --> VectorSimilarityCondition
2024-08-12 15:30:30 +00:00
Robert Schulze
e20eff635e
Cosmetics: variable naming
2024-08-12 15:30:27 +00:00
Robert Schulze
1bf320a1a8
Cosmetics: metric --> distance_function (for consistent terminology)
2024-08-12 15:30:24 +00:00
Robert Schulze
3f47b42d71
Remove funny typedef
2024-08-12 15:30:21 +00:00
Robert Schulze
fb26a9e6d4
Cosmetics: whitespaces
2024-08-12 15:30:18 +00:00
Robert Schulze
0f1765a273
Cosmetics: function naming
2024-08-12 15:30:14 +00:00
Robert Schulze
a8167abca2
Cosmetics: use native types/functions
2024-08-12 15:30:10 +00:00
Robert Schulze
9ad890e399
Cosmetics: whitespaces
2024-08-12 15:30:07 +00:00
Robert Schulze
27a6931a35
Cosmetics: variable naming
2024-08-12 15:29:59 +00:00
Robert Schulze
289c27c804
Introduce version for index files in persistence
2024-08-12 15:29:02 +00:00
Robert Schulze
4ad624cb7e
Cosmetics
2024-08-12 15:28:58 +00:00
Robert Schulze
74de79e52b
Add logging of basic statistics
2024-08-12 15:28:46 +00:00
Robert Schulze
8853b3359b
Remove useless templatization
...
Makes the code cleaner, compile faster, and the binary smaller.
2024-08-12 15:27:06 +00:00
Robert Schulze
4f23f7754b
Cosmetics
2024-08-12 15:26:05 +00:00
Robert Schulze
7f611681df
Add a similar sanity check as in other skipping indexes
2024-08-12 15:26:01 +00:00