12091 Commits

Author SHA1 Message Date
Robert Schulze
60f9f6855d
feat: implement catboost in library-bridge
This commit moves the catboost model evaluation out of the server
process into the library-bridge binary. This serves two goals: On the
one hand, crashes / memory corruptions of the catboost library no longer
affect the server. On the other hand, we can forbid loading dynamic
libraries in the server (catboost was the last consumer of this
functionality), thus improving security.

SQL syntax:

  SELECT
    catboostEvaluate('/path/to/model.bin', FEAT_1, ..., FEAT_N) > 0 AS prediction,
    ACTION AS target
  FROM amazon_train
  LIMIT 10

Required configuration:

  <catboost_lib_path>/path/to/libcatboostmodel.so</catboost_lib_path>

*** Implementation Details ***

The internal protocol between the server and the library-bridge is
simple:

- HTTP GET on path "/extdict_ping":
  A ping, used during the handshake to check if the library-bridge runs.

- HTTP POST on path "/extdict_request":
  (1) Send a "catboost_GetTreeCount" request from the server to the
      bridge, containing a library path (e.g. /home/user/libcatboost.so)
      and a model path (e.g. /home/user/model.bin). First, this unloads
      the catboost library handler associated with the model path (if it
      was loaded), then loads the catboost library handler associated
      with the model path, then executes GetTreeCount() on the library
      handler and finally sends the result back to the server. Step (1)
      is called once by the server from
      FunctionCatBoostEvaluate::getReturnTypeImpl(). The library handler
      is unloaded at the beginning because it contains state which may
      no longer be valid if the user runs
      catboostEvaluate("/path/to/model.bin", ...) more than once and
      "model.bin" was updated in between.
  (2) Send a "catboost_Evaluate" request from the server to the bridge,
      containing the model path and the features to run the inference
      on. Step (2) is called multiple times (once per chunk) by the
      server from FunctionCatBoostEvaluate::executeImpl(). The library
      handler for the given model path is expected to have been loaded
      already by Step (1).
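
The two-step handshake described above can be sketched roughly as
follows. This is only an illustration of the handler lifecycle on the
bridge side, not the actual library-bridge code: the names BridgeSketch
and FakeHandler, the placeholder tree count and the placeholder scoring
are all hypothetical, and no HTTP or catboost library is involved.

```python
from dataclasses import dataclass, field


@dataclass
class FakeHandler:
    """Hypothetical stand-in for a loaded catboost library handler."""
    library_path: str
    model_path: str

    def get_tree_count(self) -> int:
        # Placeholder value; a real handler would ask the library.
        return 3

    def evaluate(self, rows):
        # Placeholder scoring: sum of the features of each row.
        return [float(sum(row)) for row in rows]


@dataclass
class BridgeSketch:
    """Hypothetical model of the bridge's per-model handler table."""
    handlers: dict = field(default_factory=dict)

    def get_tree_count(self, library_path, model_path):
        # Step (1): drop any stale handler, then load a fresh one, so
        # that an updated model file is picked up on the next query.
        self.handlers.pop(model_path, None)
        handler = FakeHandler(library_path, model_path)
        self.handlers[model_path] = handler
        return handler.get_tree_count()

    def evaluate(self, model_path, rows):
        # Step (2): the handler must already exist from step (1).
        handler = self.handlers.get(model_path)
        if handler is None:
            raise KeyError(f"no handler loaded for {model_path}")
        return handler.evaluate(rows)
```

In this sketch, calling get_tree_count() twice for the same model path
replaces the handler object, which mirrors the unload-then-load
behaviour of Step (1), while evaluate() fails if Step (1) never ran.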

Fixes #27870
2022-09-08 09:01:32 +00:00
Alexey Milovidov
9544b8fdd6
Merge pull request #40996 from ClickHouse/vdimir/issue-40994
Minor update doc for mysql_port
2022-09-08 02:39:12 +03:00
Alexey Milovidov
84a00e3992
Merge pull request #41087 from peter279k/improve_clickhouse_start
Improve clickhouse start command
2022-09-08 02:35:02 +03:00
Kseniia Sumarokova
eb53df48d1
Update storing-data.md 2022-09-07 22:26:52 +02:00
Kseniia Sumarokova
3af51f4340
Update storing-data.md 2022-09-07 22:21:46 +02:00
Denny Crane
a75eb5ad84
Update date-time-functions.md 2022-09-07 15:59:23 -03:00
Denny Crane
0071ef9e38
Update date-time-functions.md 2022-09-07 15:56:31 -03:00
peter279k
945299de99 Remove strange release trains 2022-09-08 01:24:12 +08:00
peter279k
1ae54d3d16 Improve clickhouse start command 2022-09-08 01:18:27 +08:00
Kseniia Sumarokova
4cb07bd48d
Merge branch 'master' into add-documentation-for-cache 2022-09-07 17:52:53 +02:00
Kseniia Sumarokova
6f06633df6
Update storing-data.md 2022-09-07 13:59:39 +02:00
vdimir
000ba8a60d
Use insert_quorum = auto instead of majority_insert_quorum 2022-09-07 11:19:27 +00:00
Sachin
ade4337978
Add majority_insert_quorum setting
majority_insert_quorum is defined as (number_of_replicas/2)+1. An insert
succeeds only if a majority of the replicas have applied it. If both
insert_quorum and majority_insert_quorum are specified, the larger of
the two is used.
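
As a rough worked example of the rule above, assuming the division in
(number_of_replicas/2)+1 is integer division (the commit message does
not say so explicitly). The function names are hypothetical and only
illustrate the arithmetic:

```python
def majority_quorum(number_of_replicas: int) -> int:
    # (number_of_replicas / 2) + 1, assuming integer division.
    return number_of_replicas // 2 + 1


def effective_quorum(number_of_replicas: int, insert_quorum: int = 0) -> int:
    # If both values are specified, the larger of the two is used.
    return max(insert_quorum, majority_quorum(number_of_replicas))
```

Under that assumption, three replicas need two acknowledgements and four
replicas need three; an explicit insert_quorum only matters when it
exceeds the majority value.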
2022-09-07 11:19:24 +00:00
kssenii
613c688331 Add documentation for cache 2022-09-07 13:12:19 +02:00
peter279k
b716988991 Remove non-existed released trains 2022-09-07 18:39:38 +08:00
DanRoscigno
3073da9ba5 move doc 2022-09-06 18:39:06 -04:00
DanRoscigno
7032a1b267 move title to frontmatter 2022-09-06 11:14:55 -04:00
Kruglov Pavel
582216a3ca
Merge pull request #39919 from pzhdfy/UniqSketch
UniqThetaSketch support set operation such as union/intersect/not
2022-09-05 13:42:14 +02:00
Vladimir C
8c3c3e7667
Minor update doc for mysql_port 2022-09-05 12:39:39 +02:00
Antonio Andelic
3a0581e990
Merge pull request #40543 from Lloyd-Pottiger/feat/support-read-only-for-embeddedrocksdb
Add read-only support for EmbeddedRocksDB
2022-09-05 09:31:07 +02:00
Alexey Milovidov
8ea7b9c978
Merge pull request #40980 from ClickHouse/alexey-milovidov-patch-5
Update replicated.md
2022-09-05 03:18:25 +03:00
Alexey Milovidov
3071e80fbc
Merge branch 'master' into alexey-milovidov-patch-4 2022-09-05 03:01:55 +03:00
Alexey Milovidov
f8e72eb7cb
Update replicated.md 2022-09-05 01:02:42 +03:00
Alexey Milovidov
053a2186b5
Update replicated.md 2022-09-05 01:01:23 +03:00
Alexey Milovidov
6a7593ec53
Merge branch 'master' into alexey-milovidov-patch-4 2022-09-04 20:30:15 +03:00
Alexey Milovidov
c4adc9ed8f Remove trash 2022-09-04 04:28:08 +02:00
Alexey Milovidov
deeaea004d
Merge pull request #40295 from cyber-moon/patch-1
Add description of {condition}-keyword
2022-09-04 04:57:30 +03:00
Alexey Milovidov
7c3c367e01
Update tips.md 2022-09-04 04:18:57 +03:00
Alexey Milovidov
001733b6d0
Merge pull request #40921 from FrankChen021/mac_doc
Update doc about building on macOS
2022-09-04 04:05:34 +03:00
Alexey Milovidov
46f40c0c41
Merge pull request #40949 from AVMusorin/doc_generate_random_fix
Doc. optional params for GenerateRandom table
2022-09-04 03:54:51 +03:00
Alexey Milovidov
1963baa11e
Merge pull request #40964 from ClickHouse/fix-docs-formatting
Fix formatting of notes box in documentation
2022-09-04 03:39:39 +03:00
Denny Crane
dd19b0856e
Doc. mapApply, mapFilter, mapUpdate (#40961)
* Update tuple-map-functions.md

* Update tuple-map-functions.md
2022-09-04 00:43:39 +02:00
Denny Crane
f2de8ff8ff
Doc. compressions http.md (#40959) 2022-09-04 00:42:47 +02:00
Robert Schulze
385427ade8
Fix formatting of notes box in documentation
Follow-up to PR #38435
2022-09-03 17:25:14 +00:00
Robert Schulze
5cff329439
Merge branch 'master' into typo-role-grants 2022-09-03 18:32:41 +02:00
Alexey Milovidov
1768a82a53 git checkout c4b8137d31 docs/en/getting-started/example-datasets/uk-price-paid.md 2022-09-03 06:45:18 +02:00
Aleksandr Musorin
69b9d34b10
docs. optional params for GenerateRandom table 2022-09-02 17:25:10 +02:00
Frank Chen
0f3003e37b Add ccache brew install list 2022-09-02 16:01:31 +02:00
Robert Schulze
53836bbeeb
Fix typo
The system view is called 'role*_*grants', documented on page
'role*-*grants.md'.
2022-09-02 12:49:50 +00:00
Lloyd-Pottiger
5a6b2106b5 Merge branch 'master' of github.com:ClickHouse/ClickHouse into feat/support-read-only-for-embeddedrocksdb 2022-09-02 18:10:00 +08:00
Lloyd-Pottiger
59dccd6e49 fix test
Signed-off-by: Lloyd-Pottiger <yan1579196623@gamil.com>
2022-09-02 11:14:49 +08:00
DanRoscigno
6efd07cbc8 correct statement 2022-09-01 20:32:06 -04:00
DanRoscigno
662eed214f correct statements 2022-09-01 20:28:25 -04:00
Dan Roscigno
a733381821
Merge branch 'master' into update-nav 2022-09-01 16:56:43 -04:00
DanRoscigno
3e9225aafa move title to frontmatter, remove orig article link 2022-09-01 12:24:03 -04:00
Ivan Blinkov
a947487afe
[docs] fix link markdown 2022-09-01 18:16:13 +03:00
Александр
97e55ca2f5
Merge branch 'master' into docs_table_function_update 2022-09-01 14:01:32 +02:00
Dan Roscigno
29ac78a92b
Update uk price paid (#40828) 2022-09-01 13:32:33 +02:00
Fangyuan Deng
bc7d661668
Merge branch 'master' into UniqSketch 2022-09-01 19:31:53 +08:00
Denny Crane
e3af5a7a11
Doc. Added ON CLUSTER cluster in couple places (#40874) 2022-09-01 13:31:22 +02:00
pzhdfy
acec516271 add docs 2022-09-01 19:31:01 +08:00
Aleksandr Musorin
310e1484f6
docs - updated optional parameters for table functions 2022-09-01 13:28:56 +02:00
MaceWindu
21ab72365a
Update integrations.md
Add linq2db to list of third-party libraries
2022-09-01 11:39:27 +02:00
Robert Schulze
56eece40ec
Merge pull request #40736 from LevyCory/add-offset-to-formatDateTime
Add timezone offset support to `formatDateTime`
2022-09-01 09:50:17 +02:00
Robert Schulze
912663b719
Revert "Move CatBoost evaluation into clickhouse-library-bridge" 2022-08-31 20:54:43 +02:00
robot-clickhouse
5309831a1a Update version_date.tsv and changelogs after v22.8.4.7-lts 2022-08-31 14:32:54 +00:00
Robert Schulze
ca01286028
Merge pull request #39629 from ClickHouse/catboost-bridge
Move CatBoost evaluation into clickhouse-library-bridge
2022-08-31 16:16:11 +02:00
DanRoscigno
50f9b12af8 review feedback 2022-08-31 09:12:48 -04:00
Dan Roscigno
ddac1b3f11
Merge pull request #40560 from DanRoscigno/add-backup
Add backup restore docs
2022-08-31 08:19:00 -04:00
Robert Schulze
40468d3304
Fix typo in docs 2022-08-30 20:45:31 +00:00
Denny Crane
0d7cc82267
Update string-functions.md 2022-08-30 11:08:23 -03:00
DanRoscigno
f72b341e8b add status info 2022-08-30 09:34:08 -04:00
Alexander Tokmakov
6fdfb964d0
Revert "Add Annoy index" 2022-08-30 15:10:10 +03:00
Vladimir C
c8c8428052
Apply suggestions from code review 2022-08-30 13:49:21 +02:00
Andrey Zvonov
93f9abf130
upd2 2022-08-30 14:41:40 +03:00
Andrey Zvonov
14adea8792
fix error in docs 2022-08-30 14:40:26 +03:00
Kseniia Sumarokova
c88db2ef97
Merge pull request #40751 from kssenii/fix-mysql-timeouts
Fix issue with mysql db / table function timeouts
2022-08-30 11:59:01 +02:00
Robert Schulze
cc4225109f
Merge pull request #37215 from Vector-Similarity-Search-for-ClickHouse/annoy-2
Test failures are unrelated, merging.
2022-08-30 09:25:57 +02:00
DanRoscigno
d712a91a20 add alternatives 2022-08-29 19:36:20 -04:00
DanRoscigno
0abeebd3ca updated with dev help 2022-08-29 19:29:10 -04:00
Alexey Milovidov
d5fae3d16b
Merge pull request #40752 from ClickHouse/auto/v22.8.3.13-lts
Update version_date.tsv and changelogs after v22.8.3.13-lts
2022-08-30 02:21:55 +03:00
Alexey Milovidov
c4e664a899
Merge pull request #40753 from ClickHouse/auto/v22.7.5.13-stable
Update version_date.tsv and changelogs after v22.7.5.13-stable
2022-08-30 02:21:44 +03:00
Alexey Milovidov
0af90273f9
Merge pull request #40757 from ClickHouse/auto/v22.3.12.19-lts
Update version_date.tsv and changelogs after v22.3.12.19-lts
2022-08-30 02:21:34 +03:00
Alexey Milovidov
bddbbafea0
Merge pull request #40755 from ClickHouse/auto/v22.6.7.7-stable
Update version_date.tsv and changelogs after v22.6.7.7-stable
2022-08-30 02:21:12 +03:00
Alexey Milovidov
6bc8983756
Merge pull request #40771 from den-crane/patch-41
Doc. fix description min_bytes_to_rebalance_partition_over_jbod
2022-08-30 02:20:07 +03:00
Alexey Milovidov
0190c56faf
Merge pull request #40770 from den-crane/patch-40
Doc. Fix cache dictionaries doc.
2022-08-30 02:19:32 +03:00
DanRoscigno
3a65d58c13 updated with dev help 2022-08-29 18:33:26 -04:00
Cory Levy
1e2eee7146 Remove unnecessary backslashes in markdown sql blocks 2022-08-29 16:31:31 -04:00
Robert Schulze
64a6aa328e
fix: broken links in documentation (hopefully) 2022-08-29 20:27:06 +00:00
Robert Schulze
4d511332c4
chore: delete obsolete modelEvaluate() function
- superseded by catboostEvaluate() which no longer uses the internal
  repository for external models

- also removed was statement SYSTEM RELOAD MODELS and the monitoring view
  SYSTEM.SYSTEMMODELS
2022-08-29 20:27:06 +00:00
Robert Schulze
6b2b3c1eb3
feat: implement catboost in library-bridge
This commit moves the catboost model evaluation out of the server
process into the library-bridge binary. This serves two goals: On the
one hand, crashes / memory corruptions of the catboost library no longer
affect the server. On the other hand, we can forbid loading dynamic
libraries in the server (catboost was the last consumer of this
functionality), thus improving security.

SQL syntax:

  SELECT
    catboostEvaluate('/path/to/model.bin', FEAT_1, ..., FEAT_N) > 0 AS prediction,
    ACTION AS target
  FROM amazon_train
  LIMIT 10

Required configuration:

  <catboost_lib_path>/path/to/libcatboostmodel.so</catboost_lib_path>

*** Implementation Details ***

The internal protocol between the server and the library-bridge is
simple:

- HTTP GET on path "/extdict_ping":
  A ping, used during the handshake to check if the library-bridge runs.

- HTTP POST on path "/extdict_request":
  (1) Send a "catboost_GetTreeCount" request from the server to the
      bridge, containing a library path (e.g. /home/user/libcatboost.so)
      and a model path (e.g. /home/user/model.bin). First, this unloads
      the catboost library handler associated with the model path (if it
      was loaded), then loads the catboost library handler associated
      with the model path, then executes GetTreeCount() on the library
      handler and finally sends the result back to the server. Step (1)
      is called once by the server from
      FunctionCatBoostEvaluate::getReturnTypeImpl(). The library handler
      is unloaded at the beginning because it contains state which may
      no longer be valid if the user runs
      catboostEvaluate("/path/to/model.bin", ...) more than once and
      "model.bin" was updated in between.
  (2) Send a "catboost_Evaluate" request from the server to the bridge,
      containing the model path and the features to run the inference
      on. Step (2) is called multiple times (once per chunk) by the
      server from FunctionCatBoostEvaluate::executeImpl(). The library
      handler for the given model path is expected to have been loaded
      already by Step (1).

Fixes #27870
2022-08-29 20:26:45 +00:00
Dan Roscigno
76a45aa750
Merge branch 'master' into add-backup 2022-08-29 16:23:53 -04:00
DanRoscigno
d37029dd82 updates for filename changes 2022-08-29 15:20:28 -04:00
DanRoscigno
576b7ea604 updates for filename changes 2022-08-29 14:39:15 -04:00
Denny Crane
29e7414697
Update merge-tree-settings.md 2022-08-29 15:25:46 -03:00
Denny Crane
19c3a9c6bf
Update external-dicts-dict-layout.md 2022-08-29 15:20:46 -03:00
Denny Crane
fe0f18f21d
Update external-dicts-dict-layout.md 2022-08-29 15:19:15 -03:00
DanRoscigno
687ac1805a updates for filename changes 2022-08-29 13:59:51 -04:00
Kseniia Sumarokova
c5c48e44ea
Merge branch 'master' into fix-mysql-timeouts 2022-08-29 19:33:29 +02:00
DanRoscigno
76a3212fc8 replace symlinks 2022-08-29 12:26:17 -04:00
DanRoscigno
c4b8137d31 replace symlinks 2022-08-29 12:19:50 -04:00
robot-clickhouse
92c14e80f1 Update version_date.tsv and changelogs after v22.3.12.19-lts 2022-08-29 14:52:19 +00:00
robot-clickhouse
57980161c9 Update version_date.tsv and changelogs after v22.6.7.7-stable 2022-08-29 14:44:03 +00:00
Filatenkov Artur
d73f661732
Merge branch 'master' into annoy-2 2022-08-29 17:33:13 +03:00
robot-clickhouse
4a229ad08c Update version_date.tsv and changelogs after v22.7.5.13-stable 2022-08-29 14:29:06 +00:00
kssenii
0a6c4b9265 Fix 2022-08-29 16:20:53 +02:00
robot-clickhouse
764e2e5ac8 Update version_date.tsv and changelogs after v22.8.3.13-lts 2022-08-29 14:05:36 +00:00
alesapin
7ce0afc0df
Merge pull request #40670 from Avogar/kafka
Add setting to disable limit on kafka_num_consumers
2022-08-29 10:53:35 +02:00
Cory Levy
deb2a97020 Add docs 2022-08-28 22:28:34 -04:00
DanRoscigno
753afd0584 update links 2022-08-28 20:41:29 -04:00