Commit Graph

11981 Commits

Author SHA1 Message Date
Robert Schulze
912663b719
Revert "Move CatBoost evaluation into clickhouse-library-bridge" 2022-08-31 20:54:43 +02:00
robot-clickhouse
5309831a1a Update version_date.tsv and changelogs after v22.8.4.7-lts 2022-08-31 14:32:54 +00:00
Robert Schulze
ca01286028
Merge pull request #39629 from ClickHouse/catboost-bridge
Move CatBoost evaluation into clickhouse-library-bridge
2022-08-31 16:16:11 +02:00
DanRoscigno
50f9b12af8 review feedback 2022-08-31 09:12:48 -04:00
Dan Roscigno
ddac1b3f11
Merge pull request #40560 from DanRoscigno/add-backup
Add backup restore docs
2022-08-31 08:19:00 -04:00
Robert Schulze
40468d3304
Fix typo in docs 2022-08-30 20:45:31 +00:00
Denny Crane
0d7cc82267
Update string-functions.md 2022-08-30 11:08:23 -03:00
DanRoscigno
f72b341e8b add status info 2022-08-30 09:34:08 -04:00
Alexander Tokmakov
6fdfb964d0
Revert "Add Annoy index" 2022-08-30 15:10:10 +03:00
Vladimir C
c8c8428052
Apply suggestions from code review 2022-08-30 13:49:21 +02:00
Andrey Zvonov
93f9abf130
upd2 2022-08-30 14:41:40 +03:00
Andrey Zvonov
14adea8792
fix error in docs 2022-08-30 14:40:26 +03:00
Kseniia Sumarokova
c88db2ef97
Merge pull request #40751 from kssenii/fix-mysql-timeouts
Fix issue with mysql db / table function timeouts
2022-08-30 11:59:01 +02:00
Robert Schulze
cc4225109f
Merge pull request #37215 from Vector-Similarity-Search-for-ClickHouse/annoy-2
Test failures are unrelated, merging.
2022-08-30 09:25:57 +02:00
DanRoscigno
d712a91a20 add alternatives 2022-08-29 19:36:20 -04:00
DanRoscigno
0abeebd3ca updated with dev help 2022-08-29 19:29:10 -04:00
Alexey Milovidov
d5fae3d16b
Merge pull request #40752 from ClickHouse/auto/v22.8.3.13-lts
Update version_date.tsv and changelogs after v22.8.3.13-lts
2022-08-30 02:21:55 +03:00
Alexey Milovidov
c4e664a899
Merge pull request #40753 from ClickHouse/auto/v22.7.5.13-stable
Update version_date.tsv and changelogs after v22.7.5.13-stable
2022-08-30 02:21:44 +03:00
Alexey Milovidov
0af90273f9
Merge pull request #40757 from ClickHouse/auto/v22.3.12.19-lts
Update version_date.tsv and changelogs after v22.3.12.19-lts
2022-08-30 02:21:34 +03:00
Alexey Milovidov
bddbbafea0
Merge pull request #40755 from ClickHouse/auto/v22.6.7.7-stable
Update version_date.tsv and changelogs after v22.6.7.7-stable
2022-08-30 02:21:12 +03:00
Alexey Milovidov
6bc8983756
Merge pull request #40771 from den-crane/patch-41
Doc. fix description min_bytes_to_rebalance_partition_over_jbod
2022-08-30 02:20:07 +03:00
Alexey Milovidov
0190c56faf
Merge pull request #40770 from den-crane/patch-40
Doc. Fix cache dictionaries doc.
2022-08-30 02:19:32 +03:00
DanRoscigno
3a65d58c13 updated with dev help 2022-08-29 18:33:26 -04:00
Robert Schulze
64a6aa328e
fix: broken links in documentation (hopefully) 2022-08-29 20:27:06 +00:00
Robert Schulze
4d511332c4
chore: delete obsolete modelEvaluate() function
- superseded by catboostEvaluate() which no longer uses the internal
  repository for external models

- also removed were the statement SYSTEM RELOAD MODELS and the monitoring
  view SYSTEM.SYSTEMMODELS
2022-08-29 20:27:06 +00:00
Robert Schulze
6b2b3c1eb3
feat: implement catboost in library-bridge
This commit moves the catboost model evaluation out of the server
process into the library-bridge binary. This serves two goals: On the
one hand, crashes / memory corruptions of the catboost library no longer
affect the server. On the other hand, we can forbid loading dynamic
libraries in the server (catboost was the last consumer of this
functionality), thus improving security.

SQL syntax:

  SELECT
    catboostEvaluate('/path/to/model.bin', FEAT_1, ..., FEAT_N) > 0 AS prediction,
    ACTION AS target
  FROM amazon_train
  LIMIT 10

Required configuration:

  <catboost_lib_path>/path/to/libcatboostmodel.so</catboost_lib_path>

*** Implementation Details ***

The internal protocol between the server and the library-bridge is
simple:

- HTTP GET on path "/extdict_ping":
  A ping, used during the handshake to check if the library-bridge runs.

- HTTP POST on path "/extdict_request":
  (1) Send a "catboost_GetTreeCount" request from the server to the
      bridge, containing a library path (e.g. /home/user/libcatboost.so) and
      a model path (e.g. /home/user/model.bin). First, this unloads the
      catboost library handler associated with the model path (if it was
      loaded), then loads the catboost library handler associated with the
      model path, then executes GetTreeCount() on the library handler and
      finally sends the result back to the server. Step (1) is called once
      by the server from FunctionCatBoostEvaluate::getReturnTypeImpl(). The
      library path handler is unloaded in the beginning because it contains
      state which may no longer be valid if the user runs
      catboostEvaluate("/path/to/model.bin", ...) more than once and if "model.bin"
      was updated in between.
  (2) Send "catboost_Evaluate" from the server to the bridge, containing
      the model path and the features to run the inference on. Step (2)
      is called multiple times (once per chunk) by the server from function
      FunctionCatBoostEvaluate::executeImpl(). The library handler for the
      given model path is expected to be already loaded by Step (1).
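
The two-step exchange above can be illustrated with a toy in-memory model.
This is only a sketch under stated assumptions: it does not use HTTP, it
does not load any real library, and every class, method and parameter name
in it (ToyLibraryBridge, FakeCatBoostHandler, run_query, model_tree_counts)
is hypothetical and not taken from the ClickHouse sources. It only shows why
Step (1) reloads the handler and why Step (2) depends on Step (1).

```python
# Toy in-memory model of the ping / GetTreeCount / Evaluate exchange.
# All names are hypothetical; this is not ClickHouse code and uses no HTTP.


class FakeCatBoostHandler:
    """Stands in for a loaded catboost library plus model (hypothetical)."""

    def __init__(self, library_path, model_path, tree_count):
        self.library_path = library_path
        self.model_path = model_path
        self.tree_count = tree_count

    def evaluate(self, rows):
        # Placeholder "inference": the sum of the features of each row.
        return [float(sum(row)) for row in rows]


class ToyLibraryBridge:
    def __init__(self, model_tree_counts):
        # Maps model path -> tree count, simulating model files on disk.
        self.model_tree_counts = model_tree_counts
        self.handlers = {}   # model path -> loaded handler
        self.load_count = 0  # how many times a handler was (re)loaded

    def ping(self):
        # Handshake: the bridge is considered running if it answers.
        return True

    def get_tree_count(self, library_path, model_path):
        # Step (1): unload a possibly stale handler, load a fresh one,
        # then report its tree count.
        self.handlers.pop(model_path, None)
        handler = FakeCatBoostHandler(
            library_path, model_path, self.model_tree_counts[model_path]
        )
        self.handlers[model_path] = handler
        self.load_count += 1
        return handler.tree_count

    def evaluate(self, model_path, rows):
        # Step (2): the handler must already have been loaded by Step (1).
        handler = self.handlers.get(model_path)
        if handler is None:
            raise RuntimeError("no handler loaded for " + model_path)
        return handler.evaluate(rows)


def run_query(bridge, library_path, model_path, chunks):
    """Server side: ping, Step (1) once, then Step (2) once per chunk."""
    if not bridge.ping():
        raise RuntimeError("library-bridge is not running")
    tree_count = bridge.get_tree_count(library_path, model_path)
    predictions = []
    for chunk in chunks:
        predictions.extend(bridge.evaluate(model_path, chunk))
    return tree_count, predictions
```

Calling run_query() twice on the same model path reloads the handler each
time, so a "model.bin" that changed between the two calls is picked up by
the second call.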

Fixes #27870
2022-08-29 20:26:45 +00:00
Dan Roscigno
76a45aa750
Merge branch 'master' into add-backup 2022-08-29 16:23:53 -04:00
DanRoscigno
d37029dd82 updates for filename changes 2022-08-29 15:20:28 -04:00
DanRoscigno
576b7ea604 updates for filename changes 2022-08-29 14:39:15 -04:00
Denny Crane
29e7414697
Update merge-tree-settings.md 2022-08-29 15:25:46 -03:00
Denny Crane
19c3a9c6bf
Update external-dicts-dict-layout.md 2022-08-29 15:20:46 -03:00
Denny Crane
fe0f18f21d
Update external-dicts-dict-layout.md 2022-08-29 15:19:15 -03:00
DanRoscigno
687ac1805a updates for filename changes 2022-08-29 13:59:51 -04:00
Kseniia Sumarokova
c5c48e44ea
Merge branch 'master' into fix-mysql-timeouts 2022-08-29 19:33:29 +02:00
DanRoscigno
76a3212fc8 replace symlinks 2022-08-29 12:26:17 -04:00
DanRoscigno
c4b8137d31 replace symlinks 2022-08-29 12:19:50 -04:00
robot-clickhouse
92c14e80f1 Update version_date.tsv and changelogs after v22.3.12.19-lts 2022-08-29 14:52:19 +00:00
robot-clickhouse
57980161c9 Update version_date.tsv and changelogs after v22.6.7.7-stable 2022-08-29 14:44:03 +00:00
Filatenkov Artur
d73f661732
Merge branch 'master' into annoy-2 2022-08-29 17:33:13 +03:00
robot-clickhouse
4a229ad08c Update version_date.tsv and changelogs after v22.7.5.13-stable 2022-08-29 14:29:06 +00:00
kssenii
0a6c4b9265 Fix 2022-08-29 16:20:53 +02:00
robot-clickhouse
764e2e5ac8 Update version_date.tsv and changelogs after v22.8.3.13-lts 2022-08-29 14:05:36 +00:00
alesapin
7ce0afc0df
Merge pull request #40670 from Avogar/kafka
Add setting to disable limit on kafka_num_consumers
2022-08-29 10:53:35 +02:00
DanRoscigno
753afd0584 update links 2022-08-28 20:41:29 -04:00
DanRoscigno
b50fa8b5a9 replace symlinks 2022-08-28 17:34:50 -04:00
DanRoscigno
3c36660488 replace symlinks 2022-08-28 17:27:24 -04:00
DanRoscigno
71891938ae replace symlinks with includes 2022-08-28 14:08:07 -04:00
Dan Roscigno
96cd94196e
Merge branch 'ClickHouse:master' into add-more-slugs 2022-08-28 12:06:37 -04:00
DanRoscigno
fad2e071eb replace symlinks with includes 2022-08-28 11:58:59 -04:00
DanRoscigno
37127c683c remove symlinks 2022-08-28 11:35:03 -04:00