Commit Graph

97058 Commits

Author SHA1 Message Date
Robert Schulze
60f9f6855d
feat: implement catboost in library-bridge
This commit moves the catboost model evaluation out of the server
process into the library-bridge binary. This serves two goals: On the
one hand, crashes / memory corruptions of the catboost library no longer
affect the server. On the other hand, we can forbid loading dynamic
libraries in the server (catboost was the last consumer of this
functionality), thus improving security.

SQL syntax:

  SELECT
    catboostEvaluate('/path/to/model.bin', FEAT_1, ..., FEAT_N) > 0 AS prediction,
    ACTION AS target
  FROM amazon_train
  LIMIT 10

Required configuration:

  <catboost_lib_path>/path/to/libcatboostmodel.so</catboost_lib_path>

*** Implementation Details ***

The internal protocol between the server and the library-bridge is
simple:

- HTTP GET on path "/extdict_ping":
  A ping, used during the handshake to check if the library-bridge runs.

- HTTP POST on path "/extdict_request":
  (1) Send a "catboost_GetTreeCount" request from the server to the
      bridge, containing a library path (e.g. /home/user/libcatboost.so) and
      a model path (e.g. /home/user/model.bin). First, this unloads the
      catboost library handler associated with the model path (if it was
      loaded), then loads the catboost library handler associated with the
      model path, then executes GetTreeCount() on the library handler and
      finally sends the result back to the server. Step (1) is called once
      by the server from FunctionCatBoostEvaluate::getReturnTypeImpl(). The
      library handler is unloaded at the beginning because it contains
      state which may no longer be valid if the user runs
      catboostEvaluate("/path/to/model.bin", ...) more than once and if
      "model.bin" was updated in between.
  (2) Send "catboost_Evaluate" from the server to the bridge, containing
      the model path and the features to run the inference on. Step (2)
      is called multiple times (once per chunk) by the server from function
      FunctionCatBoostEvaluate::executeImpl(). The library handler for the
      given model path is expected to be already loaded by Step (1).
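
The two-step request flow described above can be sketched as a small in-memory model. This is only an illustration in Python, not the bridge's actual C++ code: the class names (BridgeSketch, FakeCatBoostHandler), the load_count counter, the fixed tree count of 3 and the "sum of features" evaluation are all hypothetical stand-ins, and no HTTP transport or real catboost library is involved.

```python
from typing import Dict, List


class FakeCatBoostHandler:
    """Hypothetical stand-in for a loaded catboost library handler."""

    def __init__(self, library_path: str, model_path: str) -> None:
        # A real handler would load the shared library and the model file here.
        self.library_path = library_path
        self.model_path = model_path
        self.tree_count = 3  # placeholder value, not read from any model

    def get_tree_count(self) -> int:
        return self.tree_count

    def evaluate(self, rows: List[List[float]]) -> List[float]:
        # Placeholder "inference": one number per feature row.
        return [float(sum(row)) for row in rows]


class BridgeSketch:
    """Hypothetical model of the bridge side of the protocol."""

    def __init__(self) -> None:
        self._handlers: Dict[str, FakeCatBoostHandler] = {}
        self.load_count = 0  # illustration only: counts handler (re)loads

    def ping(self) -> bool:
        # Handshake: the server only checks that the bridge answers.
        return True

    def get_tree_count(self, library_path: str, model_path: str) -> int:
        # Step (1): drop any handler cached for this model path, because
        # its state may be stale if the model file changed in between.
        self._handlers.pop(model_path, None)
        handler = FakeCatBoostHandler(library_path, model_path)
        self._handlers[model_path] = handler
        self.load_count += 1
        return handler.get_tree_count()

    def evaluate(self, model_path: str, rows: List[List[float]]) -> List[float]:
        # Step (2): called once per chunk; the handler must already have
        # been loaded by step (1).
        handler = self._handlers.get(model_path)
        if handler is None:
            raise KeyError(f"no handler loaded for {model_path}")
        return handler.evaluate(rows)
```

In this sketch, calling get_tree_count() twice for the same model path reloads the handler each time, which mirrors the "unload first, then load" behaviour of step (1), while evaluate() fails if step (1) was never run for that path.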

Fixes #27870
2022-09-08 09:01:32 +00:00
Robert Schulze
68808858a5
Merge pull request #41050 from FrankChen021/exception_safe
Fix failed stress test (OpenTelemetry)
2022-09-08 09:19:54 +02:00
Robert Schulze
9d4de0cbaa
Merge pull request #40999 from ClickHouse/sse2-special-build
Add special x86-SSE2-only build
2022-09-08 09:06:29 +02:00
Alexey Milovidov
9544b8fdd6
Merge pull request #40996 from ClickHouse/vdimir/issue-40994
Minor update doc for mysql_port
2022-09-08 02:39:12 +03:00
Alexey Milovidov
84a00e3992
Merge pull request #41087 from peter279k/improve_clickhouse_start
Improve clickhouse start command
2022-09-08 02:35:02 +03:00
Nikolay Degterinsky
5f6699ab1e
Merge pull request #41093 from den-crane/patch-46
Doc. update date_diff
2022-09-07 21:23:02 +02:00
Denny Crane
a75eb5ad84
Update date-time-functions.md 2022-09-07 15:59:23 -03:00
Yuko Takagi
fb6b26c7a4
Update README.md (#41091) 2022-09-07 20:58:36 +02:00
Denny Crane
0071ef9e38
Update date-time-functions.md 2022-09-07 15:56:31 -03:00
peter279k
1ae54d3d16 Improve clickhouse start command 2022-09-08 01:18:27 +08:00
Kseniia Sumarokova
a270eeef91
Merge pull request #41008 from kssenii/refactor-merge-tree-read
Small refactoring around merge tree readers (get rid of data part ptr)
2022-09-07 18:27:33 +02:00
Dmitry Novik
499e479892
Merge pull request #40873 from azat/build/fix-debug-symbols-quirk
Fix debug symbols
2022-09-07 17:31:35 +02:00
alesapin
365438d617
Merge pull request #41016 from ClickHouse/one_more_logging
Slightly improve diagnostics and remove assertions
2022-09-07 15:23:17 +02:00
Vitaly Baranov
31ed722572
Merge pull request #41044 from vitlibar/more-conventional-conversion-yaml-to-xmk
More conventional conversion yaml to xml
2022-09-07 13:46:32 +02:00
Mikhail f. Shiryaev
0fcee94835
Merge pull request #41071 from peter279k/remove_non_existed_trains
Remove non-existed released trains
2022-09-07 13:18:18 +02:00
Kruglov Pavel
8513af5c1b
Merge pull request #40638 from azat/mergetree/insert-perf
Do not obtain storage snapshot for each INSERT block (slightly improves performance)
2022-09-07 12:53:09 +02:00
Robert Schulze
c07f234f09
fix: disable ENABLE_MULTITARGET_CODE for SSE2 builds 2022-09-07 10:52:31 +00:00
peter279k
b716988991 Remove non-existed released trains 2022-09-07 18:39:38 +08:00
Robert Schulze
bf8fed8be8
Revert "fix: don't force-inline SSE3 code into generic code"
This reverts commit d054ffd110.
2022-09-07 09:45:07 +00:00
Mikhail f. Shiryaev
7f9d9f6ec7
Merge pull request #41074 from ClickHouse/can-be-tested
CheckLabels: Print message that 'can be tested' is missing
2022-09-07 11:28:53 +02:00
Antonio Andelic
d15fdae8dc
Merge pull request #40853 from ClickHouse/embeddedrocksdb-delete-update-support
Support for `UPDATE` and `DELETE` in `EmbeddedRocksDB`
2022-09-07 11:03:49 +02:00
alesapin
81c98dadd2 Remove redundant change 2022-09-07 11:01:06 +02:00
robot-clickhouse
01139d9d28 Automatic style fix 2022-09-07 08:51:02 +00:00
Robert Schulze
497b65f41d
CheckLabels: Print message that 'can be tested' is missing 2022-09-07 08:42:53 +00:00
Frank Chen
fc05b05be3 Fix style and typo
Signed-off-by: Frank Chen <frank.chen021@outlook.com>
2022-09-07 15:14:43 +08:00
Frank Chen
de8f6bdce7 More safe
Signed-off-by: Frank Chen <frank.chen021@outlook.com>
2022-09-07 13:39:12 +08:00
Dan Roscigno
3735a960c3
Merge pull request #41064 from DanRoscigno/rehome-NY-crime-data
move doc
2022-09-06 18:58:17 -04:00
DanRoscigno
3073da9ba5 move doc 2022-09-06 18:39:06 -04:00
Nikolay Degterinsky
981e9dbce2
Merge pull request #40997 from canhld94/ch_canh_fix_grouping_set
Fix grouping set with group_by_use_nulls
2022-09-07 00:08:31 +02:00
alesapin
6ded03c000 Disable fetch shortcut for zero copy replication 2022-09-07 00:00:10 +02:00
Robert Schulze
d054ffd110
fix: don't force-inline SSE3 code into generic code
Force-inlining code compiled for SSE3 into "generic"
(non-platform-specific) code works for standard x86 builds where
everything is compiled with SSE 4.2 (and lower). It no longer works
if we compile everything only with SSE 2.
2022-09-06 18:23:19 +00:00
alesapin
43493389b9 Merge branch 'master' into one_more_logging 2022-09-06 19:40:24 +02:00
alesapin
b778b9f37f Improve logging better 2022-09-06 19:25:58 +02:00
alesapin
09e97a6381 Fix style 2022-09-06 18:38:34 +02:00
alesapin
ceed9f418b Return better errors handling 2022-09-06 18:22:44 +02:00
Robert Schulze
cc1bd3ac36
fix: disable vectorscan when building w/o SSE >=3 2022-09-06 16:15:50 +00:00
Dan Roscigno
2587ba96c3
Merge pull request #41055 from DanRoscigno/update-backup-frontmatter
move title to frontmatter in Backup docs
2022-09-06 11:53:40 -04:00
DanRoscigno
7032a1b267 move title to frontmatter 2022-09-06 11:14:55 -04:00
Vitaly Baranov
63e992d52d Edit test configs. 2022-09-06 17:09:26 +02:00
Frank Chen
6ced4131ca exception safe
Signed-off-by: Frank Chen <frank.chen021@outlook.com>
2022-09-06 22:11:47 +08:00
Kseniia Sumarokova
3558361a05
Merge branch 'master' into refactor-merge-tree-read 2022-09-06 16:00:43 +02:00
Vitaly Baranov
6b55f4dd68 Use pretty-print to output preprocessed configs in readable form. 2022-09-06 15:01:47 +02:00
Vitaly Baranov
e9b75deeba Make conversion YAML->XML more conventional. 2022-09-06 15:01:41 +02:00
alesapin
422b1658eb Review fix 2022-09-06 14:42:48 +02:00
Robert Schulze
652f1bfd19
fix: pass -DNO_SSE3_OR_HIGHER=1 from packager 2022-09-06 12:18:11 +00:00
alesapin
6ea7f1e011 Better exception handling for ReadBufferFromS3 2022-09-06 13:59:55 +02:00
Nikita Taranov
7c4f42d014
Skip empty literals in lz4 decompression (#40142) 2022-09-06 13:58:26 +02:00
Sergei Trifonov
f77809ddbc
Merge pull request #40900 from ClickHouse/s3-detailed-metrics
S3 detailed metrics
2022-09-06 13:20:57 +02:00
Robert Schulze
1ebcc3a14e
fix: endswidth --> endswith 2022-09-06 07:41:37 +00:00
Alexey Milovidov
7776512b04
Merge pull request #41002 from azat/ci/fix-oom-check
ci/stress: clear dmesg before run to fix "OOM in dmesg" check
2022-09-06 06:41:36 +03:00