Commit Graph

101 Commits

Author SHA1 Message Date
Mikhail f. Shiryaev
08ffb8f93d
Install only "programs" directory during build 2023-02-15 11:49:19 +01:00
Robert Schulze
27f5aad49e
What happens if I remove 156 lines of code? 2023-01-03 18:51:16 +00:00
Robert Schulze
cfb6feffde
What happens if I remove these 139 lines of code? 2023-01-03 18:35:31 +00:00
Alexey Milovidov
47ae8c5c79 Remove more lines 2023-01-02 02:06:11 +01:00
Robert Schulze
db5ef7b3cb
Merge branch 'master' into generated-file-cleanup 2022-10-02 23:13:18 +02:00
Alexey Milovidov
8e4531d135 Preparation for Musl build, part 4 2022-10-01 16:29:41 +02:00
Robert Schulze
06507c40de
${ConfigIncludePath} --> ${CONFIG_INCLUDE_PATH} 2022-09-28 08:28:47 +00:00
Robert Schulze
60f9f6855d
feat: implement catboost in library-bridge
This commit moves the catboost model evaluation out of the server
process into the library-bridge binary. This serves two goals: On the
one hand, crashes / memory corruptions of the catboost library no longer
affect the server. On the other hand, we can forbid loading dynamic
libraries in the server (catboost was the last consumer of this
functionality), thus improving security.

SQL syntax:

  SELECT
    catboostEvaluate('/path/to/model.bin', FEAT_1, ..., FEAT_N) > 0 AS prediction,
    ACTION AS target
  FROM amazon_train
  LIMIT 10

Required configuration:

  <catboost_lib_path>/path/to/libcatboostmodel.so</catboost_lib_path>

*** Implementation Details ***

The internal protocol between the server and the library-bridge is
simple:

- HTTP GET on path "/extdict_ping":
  A ping, used during the handshake to check if the library-bridge runs.

- HTTP POST on path "extdict_request"
  (1) Send a "catboost_GetTreeCount" request from the server to the
      bridge, containing a library path (e.g /home/user/libcatboost.so) and
      a model path (e.g. /home/user/model.bin). Rirst, this unloads the
      catboost library handler associated to the model path (if it was
      loaded), then loads the catboost library handler associated to the
      model path, then executes GetTreeCount() on the library handler and
      finally sends the result back to the server. Step (1) is called once
      by the server from FunctionCatBoostEvaluate::getReturnTypeImpl(). The
      library path handler is unloaded in the beginning because it contains
      state which may no longer be valid if the user runs
      catboost("/path/to/model.bin", ...) more than once and if "model.bin"
      was updated in between.
  (2) Send "catboost_Evaluate" from the server to the bridge, containing
      the model path and the features to run the interference on. Step (2)
      is called multiple times (once per chunk) by the server from function
      FunctionCatBoostEvaluate::executeImpl(). The library handler for the
      given model path is expected to be already loaded by Step (1).

Fixes #27870
2022-09-08 09:01:32 +00:00
Robert Schulze
912663b719
Revert "Move CatBoost evaluation into clickhouse-library-bridge" 2022-08-31 20:54:43 +02:00
Robert Schulze
6b2b3c1eb3
feat: implement catboost in library-bridge
This commit moves the catboost model evaluation out of the server
process into the library-bridge binary. This serves two goals: On the
one hand, crashes / memory corruptions of the catboost library no longer
affect the server. On the other hand, we can forbid loading dynamic
libraries in the server (catboost was the last consumer of this
functionality), thus improving security.

SQL syntax:

  SELECT
    catboostEvaluate('/path/to/model.bin', FEAT_1, ..., FEAT_N) > 0 AS prediction,
    ACTION AS target
  FROM amazon_train
  LIMIT 10

Required configuration:

  <catboost_lib_path>/path/to/libcatboostmodel.so</catboost_lib_path>

*** Implementation Details ***

The internal protocol between the server and the library-bridge is
simple:

- HTTP GET on path "/extdict_ping":
  A ping, used during the handshake to check if the library-bridge runs.

- HTTP POST on path "extdict_request"
  (1) Send a "catboost_GetTreeCount" request from the server to the
      bridge, containing a library path (e.g /home/user/libcatboost.so) and
      a model path (e.g. /home/user/model.bin). Rirst, this unloads the
      catboost library handler associated to the model path (if it was
      loaded), then loads the catboost library handler associated to the
      model path, then executes GetTreeCount() on the library handler and
      finally sends the result back to the server. Step (1) is called once
      by the server from FunctionCatBoostEvaluate::getReturnTypeImpl(). The
      library path handler is unloaded in the beginning because it contains
      state which may no longer be valid if the user runs
      catboost("/path/to/model.bin", ...) more than once and if "model.bin"
      was updated in between.
  (2) Send "catboost_Evaluate" from the server to the bridge, containing
      the model path and the features to run the interference on. Step (2)
      is called multiple times (once per chunk) by the server from function
      FunctionCatBoostEvaluate::executeImpl(). The library handler for the
      given model path is expected to be already loaded by Step (1).

Fixes #27870
2022-08-29 20:26:45 +00:00
Yakov Olkhovskiy
31a7ed09a1 disable default ENABLE_CLICKHOUSE_SELF_EXTRACTING and add to env 2022-08-27 21:08:01 +00:00
Robert Schulze
ad0d060dc1
Merge pull request #39904 from ClickHouse/library-bridge-refactoring
Prepare library-bridge for catboost integration
2022-08-08 12:15:01 +02:00
Yakov Olkhovskiy
b1f45fa787 Don't create self-extracting clickhouse for split build 2022-08-05 21:48:40 -04:00
Robert Schulze
ea73b98fb9
Prepare library-bridge for catboost integration
- Rename generic file and identifier names in library-bridge to
  something more dictionary-specific. This is needed because later on,
  catboost will be integrated into library-bridge.

- Also: Some smaller fixes like typos and un-inlining non-performance
  critical code.

- The logic remains unchanged in this commit.
2022-08-04 19:26:51 +00:00
Robert Schulze
dcc8751685
Disable harmful env var check to workaround failure to start the server 2022-07-31 08:55:07 +00:00
Robert Schulze
7c23e48b5b
Revert exclusion of libharmful (did not work anyways) 2022-07-31 08:05:12 +00:00
Robert Schulze
7fe106a0fb
Try to fix libharmful fail 2022-07-31 07:44:25 +00:00
Robert Schulze
3d1797f75f
Merge remote-tracking branch 'origin/master' into no-split-binary 2022-07-29 12:17:43 +00:00
Azat Khuzhin
b90152b6ec Fix clickhouse-su building in splitted build
- Add status log message
- Add it to clickhouse-bundle in shared build
- Move clickhouse-su.cpp into su.cpp, since executable does not have
  include directories of linked libraries (dbms here), only
  clickhouse-lib-su does, hence it cannot find includes

CI: https://github.com/ClickHouse/ClickHouse/runs/7566319416?check_suite_focus=true
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-07-29 11:36:51 +03:00
Robert Schulze
199e254777
Merge remote-tracking branch 'origin/master' into no-split-binary 2022-07-28 15:54:22 +00:00
Alexey Milovidov
071374b152 Remove SPLIT_BINARY 2022-07-24 01:15:54 +02:00
Yakov Olkhovskiy
e5f165d909
Merge branch 'master' into cmake-self-extracting-executable 2022-07-13 16:09:18 -04:00
Robert Schulze
1a7727a254
Prefix overridden add_executable() command with "clickhouse_"
A simple HelloWorld program with zero includes except iostream triggers
a build of ca. 2000 source files. The reason is that ClickHouse's
top-level CMakeLists.txt overrides "add_executable()" to link all
binaries against "clickhouse_new_delete". This links against
"clickhouse_common_io", which in turn has lots of 3rd party library
dependencies ... Without linking "clickhouse_new_delete", the number of
compiled files for "HelloWorld" goes down to ca. 70.

As an example, the self-extracting-executable needs none of its current
dependencies but other programs may also benefit.

In order to restore access to the original "add_executable()", the
overriding version is now prefixed. There is precedence for a
"clickhouse_" prefix (as opposed to "ch_"), for example
"clickhouse_split_debug_symbols". In general prefixing makes sense also
because overriding CMake commands relies on undocumented behavior and is
considered not-so-great practice (*).

(*) https://crascit.com/2018/09/14/do-not-redefine-cmake-commands/
2022-07-11 19:36:18 +02:00
Yakov Olkhovskiy
8a3f124982 add self-extracting to clickhouse-bundle 2022-07-06 22:01:21 -04:00
Yakov Olkhovskiy
c6db15458a build utils/self-extracting-executable/compressor whenever we want to build compressed binary 2022-07-06 20:40:41 -04:00
Robert Schulze
59236d60c9
Merge pull request #38654 from ClickHouse/better-naming-for-split-debug-symbols
Better naming for stuff related to splitted debug symbols
2022-07-01 09:28:41 +02:00
Robert Schulze
bb358617e1
Better naming for stuff related to splitted debug symbols
The previous name was slightly misleading, e.g. it is not about
"intalling stripped binaries" but about splitting debug symbols from the
binary.
2022-06-30 23:41:27 +02:00
Yakov Olkhovskiy
5d36994c4d self-extracting requires utils (uses utils/self-extracting-executable/compressor) 2022-06-27 11:41:23 -04:00
mergify[bot]
4e5fd226c8
Merge branch 'master' into utility-self-extracting 2022-06-27 12:26:16 +00:00
Yakov Olkhovskiy
8ce6b8226d
Update CMakeLists.txt 2022-06-27 08:25:21 -04:00
Yakov Olkhovskiy
39ea5ffdcb compress clickhouse executable, new target 'self-extracted' is added 2022-06-27 01:36:27 -04:00
Robert Schulze
bc46cef63c
Minor follow-up
- change ELF section name to ".clickhouse.hash" (lowercase seems
  standard)

- more expressive/concise integrity check messages at startup
2022-06-14 08:52:13 +00:00
Robert Schulze
bc6f30fd40
Move binary hash to ELF section ".ClickHouse.hash" 2022-06-13 08:46:23 +00:00
Varinara
ed6e8176fe Add basic commands for disk tool (list-disks, list, move, remove, link, copy, read, write) + tests 2022-06-06 16:52:58 +03:00
Alexey Milovidov
fd7642b6aa Fix "splitted" build 2022-05-24 06:04:48 +02:00
Alexey Milovidov
2f93f11144 Maybe better 2022-05-23 02:03:13 +02:00
alesapin
f88f654798
Update programs/CMakeLists.txt
Co-authored-by: Mikhail f. Shiryaev <felixoid@clickhouse.com>
2022-04-22 11:30:35 +02:00
Mikhail f. Shiryaev
49a572e00c
Build only clickhouse-keeper with musl 2022-04-21 13:43:24 +02:00
alesapin
ba81816dc1 Better cmake 2022-04-20 12:11:55 +02:00
alesapin
2f496c7945 Merge branch 'master' into musl-check 2022-04-12 14:40:47 +02:00
alesapin
e790a73081 Simplify strip for new packages 2022-03-23 15:14:30 +01:00
Mikhail f. Shiryaev
1d362796df
Fix strip bug 2022-03-22 11:10:02 +01:00
Mikhail f. Shiryaev
fa2a9bb9aa
Separate BUILD_STRIPPED_BINARIES_PREFIX to option and parameter 2022-03-22 11:10:02 +01:00
alesapin
96c0e9fddf Better cmake 2022-03-11 15:47:07 +01:00
alesapin
e53578910b Add ability to strip binaries in cmake 2022-03-10 22:23:28 +01:00
Azat Khuzhin
4a0facd341 Remove MAKE_STATIC_LIBRARIES (in favor of USE_STATIC_LIBRARIES)
There is no more MAKE_*, so remove this alias.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-01-24 17:28:33 +03:00
Azat Khuzhin
7496ed7fde Remove unbundled gtest support
v2: Fix unit tests (do not rely on USE_GTEST)
2022-01-20 10:01:54 +03:00
Azat Khuzhin
5c32f6dd3e Remove unbundled nuraft support 2022-01-20 08:47:16 +03:00
Alexey Milovidov
70ad617062 Disable odbc-bridge and library-bridge 2022-01-09 11:34:27 +03:00
Alexey Boykov
c5cb4e071c Creating only one binary, check compatibility 2021-10-07 21:01:36 +03:00