Commit Graph

114 Commits

Author SHA1 Message Date
Nikita Taranov
053285dc1c Merge branch 'master' into keep_alive_max_reqs 2024-08-06 20:55:48 +01:00
Raúl Marín
e559899193 Random missing headers 2024-07-12 00:00:47 +02:00
Raúl Marín
9e599576ab Hide Settings object from Context.h 2024-07-11 14:00:05 +02:00
Nikita Taranov
b8e803218c remote shart 2024-06-27 13:32:23 +01:00
Nikita Taranov
89a8925f80 impl 2024-06-26 12:02:15 +01:00
Robert Schulze
a14e58ab88
Merge remote-tracking branch 'rschu1ze/master' into redundant-inline 2024-05-21 05:15:35 +00:00
Robert Schulze
113bb00005
Fix clang-tidy "-readability-redundant-inline-specifier" 2024-05-19 10:23:58 +00:00
Robert Schulze
2909e6451b
Move StringUtils.h/cpp back to Common/ 2024-05-19 09:39:36 +00:00
Alexey Milovidov
0d14a2c67e Useless changes 2024-05-09 03:11:02 +02:00
Nikita Taranov
a2147b8ded
Merge branch 'master' into keep_alive_max_reqs 2024-04-09 21:32:36 +02:00
Alexey Milovidov
c8da569459 Merge branch 'master' into bridges-are-separate 2024-04-04 20:04:31 +02:00
Robert Schulze
de2a0be025
Don't access static members through instance
- clang-tidy rightfully complains (-readability-static-accessed-through-instance)
- not going to enable the warning for now to avoid breaking the build
2024-04-03 18:50:33 +00:00
Alexey Milovidov
89cee0a3d6 Move bridges to separate packages 2024-03-31 01:59:36 +01:00
Nikita Taranov
64e6c6a2fc fix tidy 2024-03-26 22:32:22 +00:00
Nikita Taranov
b93f483a0e fix build 2024-03-22 20:07:12 +00:00
Nikita Taranov
d7b34a80bb stash 2024-03-22 16:09:14 +00:00
Alexey Milovidov
4373d5ba16 Merge branch 'master' into split-cast-overload-resolver 2024-03-11 03:01:50 +01:00
Alexey Milovidov
3ac4f56cfa Fix tests 2024-03-09 18:53:31 +01:00
Alexey Milovidov
e9ab3ed2dd Even better 2024-03-09 09:27:11 +01:00
Alexey Milovidov
47b308d234 Simplify bridges 2024-03-09 08:42:33 +01:00
Alexey Milovidov
70796e497f Miscellaneous 2024-03-09 08:32:13 +01:00
Alexey Milovidov
574d486322 Something 2024-03-09 07:55:59 +01:00
Alexey Milovidov
ea54ac3cb4 Remove garbage 2024-03-09 06:12:22 +01:00
Alexey Milovidov
2be09581dd Split CastOverloadResolver translation unit 2024-03-09 05:48:52 +01:00
Maksim Kita
2a327107b6 Updated implementation 2024-01-25 14:31:49 +03:00
Yakov Olkhovskiy
feab812712 read body of HTTP POST request to prevent 'connection reset by peer' error on client side 2024-01-07 14:40:12 +00:00
Yakov Olkhovskiy
85f03478ef
Revert "Revert "Use CH Buffer for HTTP out stream, add metrics for interfaces"" 2024-01-03 11:47:15 -05:00
Raúl Marín
d491758939
Revert "Use CH Buffer for HTTP out stream, add metrics for interfaces" 2024-01-03 10:42:15 +01:00
Yakov Olkhovskiy
db97764e98 fix tests, some refactoring 2023-12-31 12:56:37 +00:00
Yakov Olkhovskiy
001a38048f use ProfileEvents instead of CurrentMetrics 2023-12-15 19:17:42 +00:00
Yakov Olkhovskiy
4f11132ea2 fix clang tidy 2023-11-04 00:00:14 +00:00
Yakov Olkhovskiy
0cf851316c use CH Buffer for HTTP out stream, add metrics for interfaces 2023-10-27 02:38:36 +00:00
Alexey Milovidov
d3c3d8b8e4 Remove export of dynamic symbols 2023-05-06 23:52:16 +02:00
alex filatov
bafd9773bc
fix Unknown library method 'extDict_libClone'
We have an issue when using an external dictionary. Occasionally the library bridge is called with extDict_libClone and fails with Unknown library method 'extDict_libClone'. This appears to be because at some point `else if (method == "extDict_libNew")` was changed to `if (lib_new)` without handling extDict_libClone inside the new if/else statement, so extDict_libClone is reported as an unknown method.

So there is a two-line fix to handle extDict_libClone properly.
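
The failure mode described above can be sketched as follows. This is not the bridge's actual handler: `dispatch`, the `handle_clone` switch and the returned strings are hypothetical, and the assumption that `lib_new` covers both methods is inferred from the description. Only the method names and the error text come from this message.

```cpp
#include <string>

// Hypothetical stand-in for the bridge's method dispatch.
// handle_clone = false models the broken state, true models the fix.
std::string dispatch(const std::string & method, bool handle_clone)
{
    // Assumption: both methods are routed into the same "create" branch.
    const bool lib_new = (method == "extDict_libNew" || method == "extDict_libClone");
    if (lib_new)
    {
        if (method == "extDict_libNew")
            return "created";
        // The fix: handle the clone method inside the same branch.
        if (handle_clone && method == "extDict_libClone")
            return "cloned";
        // Without the branch above, clone requests end up here.
        return "Unknown library method '" + method + "'";
    }
    return "other method";
}
```

With `handle_clone` set to false, a clone request produces the same error text as in the logs below; with it set to true, the request is served.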

Error logs that we have:

```
2022.12.16 14:17:44.285088 [ 393573 ] {} <Error> ExternalDictionaries: Could not update cache dictionary 'dict.vhash_s', next update is scheduled at 2022-12-16 14:18:00: Code: 86. DB::Exception: Received error from remote server /extdict_request?version=1&dictionary_id=be2b2cd1-ba57-4658-8d1b-35ef40ab005b&method=extDict_libClone&from_dictionary_id=c3537142-eaa9-4deb-9b65-47eb8ea1dee6. HTTP status code: 500 Internal Server Error, body: Unknown library method 'extDict_libClone'
2022.12.16 14:17:44.387049 [ 399133 ] {} <Error> ExternalDictionaries: Could not update cache dictionary 'dict.vhash_s', next update is scheduled at 2022-12-16 14:17:51: Code: 86. DB::Exception: Received error from remote server /extdict_request?version=1&dictionary_id=0df866ac-6c94-4974-a76c-3940522091b9&method=extDict_libClone&from_dictionary_id=c3537142-eaa9-4deb-9b65-47eb8ea1dee6. HTTP status code: 500 Internal Server Error, body: Unknown library method 'extDict_libClone'
2022.12.16 14:17:44.488468 [ 397769 ] {} <Error> ExternalDictionaries: Could not update cache dictionary 'dict.vhash_s', next update is scheduled at 2022-12-16 14:19:38: Code: 86. DB::Exception: Received error from remote server /extdict_request?version=1&dictionary_id=2d8af321-b669-4526-982b-42c0fabf0e8d&method=extDict_libClone&from_dictionary_id=c3537142-eaa9-4deb-9b65-47eb8ea1dee6. HTTP status code: 500 Internal Server Error, body: Unknown library method 'extDict_libClone'
2022.12.16 14:17:44.489935 [ 398226 ] {datamarts_v_dwh_node0032-241534:0x552da2_1_11} <Error> executeQuery: Code: 510. DB::Exception: Update failed for dictionary 'dict.vhash_s': Code: 510. DB::Exception: Update failed for dictionary dict.vhash_s : Code: 86. DB::Exception: Received error from remote server /extdict_request?version=1&dictionary_id=be2b2cd1-ba57-4658-8d1b-35ef40ab005b&method=extDict_libClone&from_dictionary_id=c3537142-eaa9-4deb-9b65-47eb8ea1dee6. HTTP status code: 500 Internal Server Error, body: Unknown library method 'extDict_libClone'
```
2023-03-02 15:53:09 +03:00
Alexander Tokmakov
31e16c4b4d fix 2023-01-24 15:29:19 +01:00
Alexander Tokmakov
70d1adfe4b
Better formatting for exception messages (#45449)
* save format string for NetException

* format exceptions

* format exceptions 2

* format exceptions 3

* format exceptions 4

* format exceptions 5

* format exceptions 6

* fix

* format exceptions 7

* format exceptions 8

* Update MergeTreeIndexGin.cpp

* Update AggregateFunctionMap.cpp

* Update AggregateFunctionMap.cpp

* fix
2023-01-24 00:13:58 +03:00
Robert Schulze
9c066e964d
Less use of CH-specific bit_cast()
Converted usage of CH-custom bit_cast to std::bit_cast if possible, i.e.
when
  sizeof(From) == sizeof(To).
(The CH-custom bit_cast is able to deal with sizeof(From) != sizeof(To).)

Motivation for this came from #42847 where it is not clear how the
internal bit_cast should behave on big endian systems, so we better
avoid that situation as much as possible.
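
A minimal sketch of the distinction, under stated assumptions: `floatBits` shows the equal-size case that std::bit_cast accepts, and `looseBitCast` is a hypothetical size-tolerant cast written for illustration only (it is not the CH-custom bit_cast). The second function copies the leading bytes of the source, so its result for differently sized types depends on byte order, which is the ambiguity mentioned above.

```cpp
#include <bit>
#include <cstdint>
#include <cstring>

// Equal-size case: std::bit_cast (C++20) refuses to compile when sizes differ.
inline std::uint32_t floatBits(float value)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "sizes must match");
#if defined(__cpp_lib_bit_cast)
    return std::bit_cast<std::uint32_t>(value);
#else
    // Pre-C++20 fallback producing the same bits.
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
#endif
}

// Hypothetical size-tolerant cast: copies min(sizeof(From), sizeof(To))
// bytes into a zero-initialized To. Which bytes of a wider source are kept
// depends on the platform's endianness.
template <typename To, typename From>
To looseBitCast(const From & from)
{
    To to{};
    std::memcpy(&to, &from, sizeof(From) < sizeof(To) ? sizeof(From) : sizeof(To));
    return to;
}
```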
2022-11-04 15:52:48 +00:00
Alexey Milovidov
bac578b23a
Merge pull request #41428 from ClickHouse/remove-dlopen
Remove `dlopen`
2022-09-18 00:09:57 +03:00
Alexey Milovidov
ada7a44ae4 Remove -WithTerminatingZero methods 2022-09-17 05:34:18 +02:00
Alexey Milovidov
35cce03125 Remove dlopen 2022-09-17 03:02:34 +02:00
Robert Schulze
fd97058e45
fix: incorporate review comments 2022-09-14 15:21:24 +00:00
Robert Schulze
fac1be9700
chore: restore SYSTEM RELOAD MODEL(S) and monitoring view SYSTEM.MODELS
- This commit restores statements "SYSTEM RELOAD MODEL(S)" which provide
  a mechanism to update a model explicitly. It also saves potentially
  unnecessary reloads of a model from disk after its initial load.

  To keep the complexity low, the semantics of "SYSTEM RELOAD MODEL(S)"
  were changed from eager to lazy. This means that both statements
  previously immediately reloaded the specified/all models, whereas now
  the statements only trigger an unload and the first call to
  catboostEvaluate() does the actual load.

- Monitoring view SYSTEM.MODELS is also restored but with some obsolete
  fields removed. The view was not documented in the past and for now it
  remains undocumented. The commit is thus not considered a breach of
  ClickHouse's public interface.
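
The lazy semantics described in the first item can be sketched as below. `ModelCache` and its members are hypothetical names chosen for illustration, and the "load" is only a counter; this is not the ClickHouse implementation.

```cpp
#include <set>
#include <string>

// Hypothetical model cache illustrating lazy reload.
struct ModelCache
{
    std::set<std::string> loaded;  // model paths currently held in memory
    int loads_from_disk = 0;       // counts simulated disk loads

    // Lazy "SYSTEM RELOAD MODEL": only unload; no disk access happens here.
    void reload(const std::string & path) { loaded.erase(path); }

    // Stand-in for catboostEvaluate(): loads the model on first use
    // after startup or after an unload.
    void evaluate(const std::string & path)
    {
        if (loaded.insert(path).second)
            ++loads_from_disk;
    }
};
```

In this sketch, issuing several reloads in a row costs nothing until the next evaluation, which then performs a single load.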
2022-09-12 19:33:02 +00:00
Robert Schulze
60f9f6855d
feat: implement catboost in library-bridge
This commit moves the catboost model evaluation out of the server
process into the library-bridge binary. This serves two goals: On the
one hand, crashes / memory corruptions of the catboost library no longer
affect the server. On the other hand, we can forbid loading dynamic
libraries in the server (catboost was the last consumer of this
functionality), thus improving security.

SQL syntax:

  SELECT
    catboostEvaluate('/path/to/model.bin', FEAT_1, ..., FEAT_N) > 0 AS prediction,
    ACTION AS target
  FROM amazon_train
  LIMIT 10

Required configuration:

  <catboost_lib_path>/path/to/libcatboostmodel.so</catboost_lib_path>

*** Implementation Details ***

The internal protocol between the server and the library-bridge is
simple:

- HTTP GET on path "/extdict_ping":
  A ping, used during the handshake to check if the library-bridge runs.

- HTTP POST on path "/extdict_request":
  (1) Send a "catboost_GetTreeCount" request from the server to the
      bridge, containing a library path (e.g. /home/user/libcatboost.so) and
      a model path (e.g. /home/user/model.bin). First, this unloads the
      catboost library handler associated to the model path (if it was
      loaded), then loads the catboost library handler associated to the
      model path, then executes GetTreeCount() on the library handler and
      finally sends the result back to the server. Step (1) is called once
      by the server from FunctionCatBoostEvaluate::getReturnTypeImpl(). The
      library path handler is unloaded in the beginning because it contains
      state which may no longer be valid if the user runs
      catboost("/path/to/model.bin", ...) more than once and if "model.bin"
      was updated in between.
  (2) Send "catboost_Evaluate" from the server to the bridge, containing
      the model path and the features to run the inference on. Step (2)
      is called multiple times (once per chunk) by the server from function
      FunctionCatBoostEvaluate::executeImpl(). The library handler for the
      given model path is expected to be already loaded by Step (1).
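
The two steps above can be sketched from the bridge's point of view. Everything here is a hypothetical stand-in: `BridgeSketch`, `FakeHandler`, the fixed tree count and the placeholder predictions are assumptions for illustration, and no HTTP or catboost library calls are made.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical replacement for the catboost library handler.
struct FakeHandler { std::size_t tree_count; };

class BridgeSketch
{
public:
    int loads = 0;  // number of simulated handler loads

    // Step (1): drop any stale handler for the model path, load a fresh
    // one, and return its tree count.
    std::size_t getTreeCount(const std::string & library_path, const std::string & model_path)
    {
        handlers.erase(model_path);
        handlers.emplace(model_path, load(library_path, model_path));
        return handlers.at(model_path).tree_count;
    }

    // Step (2): evaluate one chunk of rows. The handler is expected to
    // have been loaded by step (1).
    std::vector<double> evaluate(const std::string & model_path,
                                 const std::vector<std::vector<double>> & rows)
    {
        if (handlers.find(model_path) == handlers.end())
            throw std::runtime_error("model not loaded: " + model_path);
        return std::vector<double>(rows.size(), 0.0);  // placeholder predictions
    }

private:
    FakeHandler load(const std::string &, const std::string &)
    {
        ++loads;
        return FakeHandler{3};  // arbitrary tree count for the sketch
    }

    std::map<std::string, FakeHandler> handlers;
};
```

Calling `getTreeCount` twice for the same model path loads the handler twice, which mirrors the unload-then-load behavior described in step (1).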

Fixes #27870
2022-09-08 09:01:32 +00:00
Robert Schulze
912663b719
Revert "Move CatBoost evaluation into clickhouse-library-bridge" 2022-08-31 20:54:43 +02:00
Robert Schulze
4ed1e1a5be
perf: don't copy data around unnecessarily 2022-08-29 20:27:06 +00:00
Robert Schulze
35a37c91f8
chore: incorporate review feedback 2022-08-29 20:27:06 +00:00
robot-clickhouse
64fa077148
style: fix style 2022-08-29 20:27:06 +00:00
Robert Schulze
6b2b3c1eb3
feat: implement catboost in library-bridge
This commit moves the catboost model evaluation out of the server
process into the library-bridge binary. This serves two goals: On the
one hand, crashes / memory corruptions of the catboost library no longer
affect the server. On the other hand, we can forbid loading dynamic
libraries in the server (catboost was the last consumer of this
functionality), thus improving security.

SQL syntax:

  SELECT
    catboostEvaluate('/path/to/model.bin', FEAT_1, ..., FEAT_N) > 0 AS prediction,
    ACTION AS target
  FROM amazon_train
  LIMIT 10

Required configuration:

  <catboost_lib_path>/path/to/libcatboostmodel.so</catboost_lib_path>

*** Implementation Details ***

The internal protocol between the server and the library-bridge is
simple:

- HTTP GET on path "/extdict_ping":
  A ping, used during the handshake to check if the library-bridge runs.

- HTTP POST on path "/extdict_request":
  (1) Send a "catboost_GetTreeCount" request from the server to the
      bridge, containing a library path (e.g. /home/user/libcatboost.so) and
      a model path (e.g. /home/user/model.bin). First, this unloads the
      catboost library handler associated to the model path (if it was
      loaded), then loads the catboost library handler associated to the
      model path, then executes GetTreeCount() on the library handler and
      finally sends the result back to the server. Step (1) is called once
      by the server from FunctionCatBoostEvaluate::getReturnTypeImpl(). The
      library path handler is unloaded in the beginning because it contains
      state which may no longer be valid if the user runs
      catboost("/path/to/model.bin", ...) more than once and if "model.bin"
      was updated in between.
  (2) Send "catboost_Evaluate" from the server to the bridge, containing
      the model path and the features to run the inference on. Step (2)
      is called multiple times (once per chunk) by the server from function
      FunctionCatBoostEvaluate::executeImpl(). The library handler for the
      given model path is expected to be already loaded by Step (1).

Fixes #27870
2022-08-29 20:26:45 +00:00
Robert Schulze
810221baf2
Assume unversioned server has version=0 and use tryParse() instead of from_chars() 2022-08-10 07:39:32 +00:00
Robert Schulze
e0d5020a92
Add simple versioning to the *-bridge-to-server protocol
- In general, it is expected that clickhouse-*-bridges and
  clickhouse-server were built from the same source version (e.g. are
  upgraded "atomically"). If that is not the case, we should at least
  be able to detect the mismatch and abort.

- This commit adds a URL parameter "version", defined in a header shared
  by the server and bridges. The bridge returns an error in case of
  mismatch.

- The version is *not* sent and checked for "ping" requests (used for
  handshake), only for regular requests sent after handshake. This is
  because the internally thrown server-side exception due to HTTP
  failure does not propagate the exact HTTP error (it only stores the
  error as text), and as a result, the server-side handshake code
  simply retries in case of error with exponential backoff and finally
  fails with a "timeout error". This is reasonable as pings typically
  fail due to timeouts. However, without a rework of HTTP exceptions,
  version mismatch during ping would also appear as "timeout" which is
  too misleading. The behavior may be changed later if needed.

- Note that introducing a version parameter does not represent a
  protocol upgrade itself. Bridges older than the server will simply
  ignore the field. Only servers older than the bridges receive an error
  but such a situation should never occur in practice.
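
A bridge-side check along these lines can be sketched as follows. The constant `BRIDGE_PROTOCOL_VERSION`, its value and the function `versionMatches` are hypothetical; treating a missing parameter as version 0 follows the later commit 810221baf2 listed above, not this one.

```cpp
#include <charconv>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical shared constant; in the real code it lives in a header
// shared by the server and the bridges.
constexpr unsigned BRIDGE_PROTOCOL_VERSION = 1;

// Returns true when the "version" URL parameter matches the bridge's
// version. A missing parameter is treated as version 0 (assumption taken
// from the follow-up commit); unparsable values count as a mismatch.
bool versionMatches(const std::optional<std::string> & version_param)
{
    unsigned version = 0;
    if (version_param)
    {
        const char * begin = version_param->data();
        const char * end = begin + version_param->size();
        auto [ptr, ec] = std::from_chars(begin, end, version);
        if (ec != std::errc() || ptr != end)
            return false;
    }
    return version == BRIDGE_PROTOCOL_VERSION;
}
```

In this sketch the caller would answer a regular request with an error when the function returns false, and would skip the check entirely for ping requests.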
2022-08-08 19:40:37 +00:00