Commit Graph

127 Commits

Author SHA1 Message Date
Yakov Olkhovskiy
f07a395bf1 Merge branch 'master' into ci-fuzzer-enable 2024-07-17 03:43:57 +00:00
Nikita Taranov
7fe35a83b6 impl 2024-07-15 14:26:30 +01:00
Yakov Olkhovskiy
a9aaa2ab78 Merge remote-tracking branch 'origin/master' into ci-fuzzer-enable 2024-07-06 00:48:09 +00:00
Antonio Andelic
07f51e02ed Reuse some checks 2024-07-03 16:54:09 +02:00
Antonio Andelic
6b47171f2c Keeper binary with different entrypoint 2024-07-01 10:52:08 +02:00
Robert Schulze
2909e6451b Move StringUtils.h/cpp back to Common/ 2024-05-19 09:39:36 +00:00
Yakov Olkhovskiy
8357bc7b1b fix build 2024-03-31 23:33:35 +00:00
Alexey Milovidov
1a61da1bae Replace getFuzzerData with query text fuzzer in clickhouse-local 2024-03-18 02:17:24 +01:00
Nikita Mikhaylov
2bc4d27ac6 Bye bye 2024-03-07 19:24:39 +00:00
Azat Khuzhin
13e3877254 Add chc/chl/ch into clickhouse-bundle target
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-02-12 14:32:15 +01:00
Azat Khuzhin
11fddc8d63 Unify binary aliases
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-02-12 12:06:33 +01:00
Azat Khuzhin
3145c5d5f5 Add missing install target for ch/chc/chl
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-02-12 12:02:42 +01:00
Azat Khuzhin
7fb31fe160 Remove ability to disable generic clickhouse components
Components like client/server/... are very generic, and there is no
point in disabling them, since it does not reduce amount of compiled
code a lot anyway (just a few modules for entrypoints, everything else
is already included in the clickhouse binary), and eventually they are
just symlinks to the clickhouse binary.

But there are a few that require extra libraries, like the ODBC bridge
or keeper components (and there is also a standalone keeper binary
compiled with musl); those have been kept.

Also add some descriptions for some utils and change exit code to 0 for
--help.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-02-12 11:10:00 +01:00
Azat Khuzhin
c5bf722ee2 Create ch/chc/chl symlinks by cmake as well (for develop mode)
Before, they had been created only by install target.

Follow-up for: #56634

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-01-08 21:27:53 +03:00
János Benjamin Antal
4bfbbfbd75 Fix proto file installation 2023-11-27 13:39:01 +00:00
János Benjamin Antal
6f652133a8 Install well-known protobuf types 2023-11-22 12:39:29 +00:00
Alexey Milovidov
241cc2abf4 Merge branch 'master' into remove-useless-install 2023-11-20 01:12:08 +01:00
Alexey Milovidov
3ef14f6098 Merge branch 'master' of github.com:ClickHouse/ClickHouse into coverage 2023-11-14 06:08:32 +01:00
Alexey Milovidov
c1bba6ea4a Merge branch 'master' into filter-large-translation-units 2023-11-13 17:54:02 +01:00
Alexey Milovidov
4b1fa685bb Publish stripped binary 2023-11-11 07:28:26 +01:00
Alexey Milovidov
df24ef42b1 Publish stripped binary 2023-11-11 07:27:10 +01:00
Alexey Milovidov
96f73139b6 Check for large translation units 2023-11-10 06:13:55 +01:00
Alexey Milovidov
70e3dd808c Granular code coverage with introspection 2023-10-29 02:07:24 +01:00
pufit
e01e0d53a9 Add embedded keeper-client to keeper standalone binary 2023-06-15 12:08:20 -04:00
pufit
c93202cca4 Keeper Client MVP 2023-03-31 12:41:22 +00:00
Robert Schulze
f8980c582e CMake: More removal of gold linker (follow-up to #47660)
+ fix a linker warning
2023-03-17 11:01:46 +00:00
Mikhail f. Shiryaev
08ffb8f93d Install only "programs" directory during build 2023-02-15 11:49:19 +01:00
Robert Schulze
27f5aad49e What happens if I remove 156 lines of code? 2023-01-03 18:51:16 +00:00
Robert Schulze
cfb6feffde What happens if I remove these 139 lines of code? 2023-01-03 18:35:31 +00:00
Alexey Milovidov
47ae8c5c79 Remove more lines 2023-01-02 02:06:11 +01:00
Robert Schulze
db5ef7b3cb Merge branch 'master' into generated-file-cleanup 2022-10-02 23:13:18 +02:00
Alexey Milovidov
8e4531d135 Preparation for Musl build, part 4 2022-10-01 16:29:41 +02:00
Robert Schulze
06507c40de ${ConfigIncludePath} --> ${CONFIG_INCLUDE_PATH} 2022-09-28 08:28:47 +00:00
Robert Schulze
60f9f6855d feat: implement catboost in library-bridge
This commit moves the catboost model evaluation out of the server
process into the library-bridge binary. This serves two goals: On the
one hand, crashes / memory corruptions of the catboost library no longer
affect the server. On the other hand, we can forbid loading dynamic
libraries in the server (catboost was the last consumer of this
functionality), thus improving security.

SQL syntax:

  SELECT
    catboostEvaluate('/path/to/model.bin', FEAT_1, ..., FEAT_N) > 0 AS prediction,
    ACTION AS target
  FROM amazon_train
  LIMIT 10

Required configuration:

  <catboost_lib_path>/path/to/libcatboostmodel.so</catboost_lib_path>

*** Implementation Details ***

The internal protocol between the server and the library-bridge is
simple:

- HTTP GET on path "/extdict_ping":
  A ping, used during the handshake to check if the library-bridge runs.

- HTTP POST on path "extdict_request"
  (1) Send a "catboost_GetTreeCount" request from the server to the
      bridge, containing a library path (e.g. /home/user/libcatboost.so) and
      a model path (e.g. /home/user/model.bin). First, this unloads the
      catboost library handler associated to the model path (if it was
      loaded), then loads the catboost library handler associated to the
      model path, then executes GetTreeCount() on the library handler and
      finally sends the result back to the server. Step (1) is called once
      by the server from FunctionCatBoostEvaluate::getReturnTypeImpl(). The
      library path handler is unloaded in the beginning because it contains
      state which may no longer be valid if the user runs
      catboost("/path/to/model.bin", ...) more than once and if "model.bin"
      was updated in between.
  (2) Send "catboost_Evaluate" from the server to the bridge, containing
      the model path and the features to run the inference on. Step (2)
      is called multiple times (once per chunk) by the server from function
      FunctionCatBoostEvaluate::executeImpl(). The library handler for the
      given model path is expected to be already loaded by Step (1).
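
The two-step exchange above can be sketched as a small in-memory model.
This is only an illustration under stated assumptions, not the bridge's
actual code: the names BridgeModel and FakeHandler, the placeholder tree
count, the sum-based "scores", and the ping return value are all
hypothetical, and no HTTP transport is modelled.

```python
from dataclasses import dataclass, field


@dataclass
class FakeHandler:
    """Hypothetical stand-in for a loaded catboost library handler."""
    library_path: str
    model_path: str
    tree_count: int = 3  # placeholder value, not read from a real model

    def evaluate(self, rows):
        # Placeholder "inference": one score per row of features.
        return [float(sum(row)) for row in rows]


@dataclass
class BridgeModel:
    """Toy model of the bridge-side state (all names are assumptions)."""
    handlers: dict = field(default_factory=dict)
    load_count: int = 0

    def ping(self):
        # Handshake check: report that the bridge is running.
        return True

    def get_tree_count(self, library_path, model_path):
        # Step (1): drop any stale handler first, then load a fresh one,
        # so that an updated model file is picked up on the next call.
        self.handlers.pop(model_path, None)
        handler = FakeHandler(library_path, model_path)
        self.handlers[model_path] = handler
        self.load_count += 1
        return handler.tree_count

    def evaluate(self, model_path, rows):
        # Step (2): the handler must already have been loaded by step (1).
        handler = self.handlers.get(model_path)
        if handler is None:
            raise KeyError(f"no handler loaded for {model_path}")
        return handler.evaluate(rows)


bridge = BridgeModel()
trees = bridge.get_tree_count("/home/user/libcatboost.so", "/home/user/model.bin")
# Step (2) runs once per chunk of rows.
chunks = ([[1, 2]], [[3, 4], [5, 6]])
scores = [bridge.evaluate("/home/user/model.bin", chunk) for chunk in chunks]
```

Reloading on every step (1) call trades some load time for never serving
results from a handler whose model file changed on disk.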

Fixes #27870
2022-09-08 09:01:32 +00:00
Robert Schulze
912663b719 Revert "Move CatBoost evaluation into clickhouse-library-bridge" 2022-08-31 20:54:43 +02:00
Robert Schulze
6b2b3c1eb3 feat: implement catboost in library-bridge
This commit moves the catboost model evaluation out of the server
process into the library-bridge binary. This serves two goals: On the
one hand, crashes / memory corruptions of the catboost library no longer
affect the server. On the other hand, we can forbid loading dynamic
libraries in the server (catboost was the last consumer of this
functionality), thus improving security.

SQL syntax:

  SELECT
    catboostEvaluate('/path/to/model.bin', FEAT_1, ..., FEAT_N) > 0 AS prediction,
    ACTION AS target
  FROM amazon_train
  LIMIT 10

Required configuration:

  <catboost_lib_path>/path/to/libcatboostmodel.so</catboost_lib_path>

*** Implementation Details ***

The internal protocol between the server and the library-bridge is
simple:

- HTTP GET on path "/extdict_ping":
  A ping, used during the handshake to check if the library-bridge runs.

- HTTP POST on path "extdict_request"
  (1) Send a "catboost_GetTreeCount" request from the server to the
      bridge, containing a library path (e.g. /home/user/libcatboost.so) and
      a model path (e.g. /home/user/model.bin). First, this unloads the
      catboost library handler associated to the model path (if it was
      loaded), then loads the catboost library handler associated to the
      model path, then executes GetTreeCount() on the library handler and
      finally sends the result back to the server. Step (1) is called once
      by the server from FunctionCatBoostEvaluate::getReturnTypeImpl(). The
      library path handler is unloaded in the beginning because it contains
      state which may no longer be valid if the user runs
      catboost("/path/to/model.bin", ...) more than once and if "model.bin"
      was updated in between.
  (2) Send "catboost_Evaluate" from the server to the bridge, containing
      the model path and the features to run the inference on. Step (2)
      is called multiple times (once per chunk) by the server from function
      FunctionCatBoostEvaluate::executeImpl(). The library handler for the
      given model path is expected to be already loaded by Step (1).

Fixes #27870
2022-08-29 20:26:45 +00:00
Yakov Olkhovskiy
31a7ed09a1 disable default ENABLE_CLICKHOUSE_SELF_EXTRACTING and add to env 2022-08-27 21:08:01 +00:00
Robert Schulze
ad0d060dc1 Merge pull request #39904 from ClickHouse/library-bridge-refactoring
Prepare library-bridge for catboost integration
2022-08-08 12:15:01 +02:00
Yakov Olkhovskiy
b1f45fa787 Don't create self-extracting clickhouse for split build 2022-08-05 21:48:40 -04:00
Robert Schulze
ea73b98fb9 Prepare library-bridge for catboost integration
- Rename generic file and identifier names in library-bridge to
  something more dictionary-specific. This is needed because later on,
  catboost will be integrated into library-bridge.

- Also: Some smaller fixes like typos and un-inlining non-performance
  critical code.

- The logic remains unchanged in this commit.
2022-08-04 19:26:51 +00:00
Robert Schulze
dcc8751685 Disable harmful env var check to workaround failure to start the server 2022-07-31 08:55:07 +00:00
Robert Schulze
7c23e48b5b Revert exclusion of libharmful (did not work anyways) 2022-07-31 08:05:12 +00:00
Robert Schulze
7fe106a0fb Try to fix libharmful fail 2022-07-31 07:44:25 +00:00
Robert Schulze
3d1797f75f Merge remote-tracking branch 'origin/master' into no-split-binary 2022-07-29 12:17:43 +00:00
Azat Khuzhin
b90152b6ec Fix clickhouse-su building in split build
- Add status log message
- Add it to clickhouse-bundle in shared build
- Move clickhouse-su.cpp into su.cpp, since executable does not have
  include directories of linked libraries (dbms here), only
  clickhouse-lib-su does, hence it cannot find includes

CI: https://github.com/ClickHouse/ClickHouse/runs/7566319416?check_suite_focus=true
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-07-29 11:36:51 +03:00
Robert Schulze
199e254777 Merge remote-tracking branch 'origin/master' into no-split-binary 2022-07-28 15:54:22 +00:00
Alexey Milovidov
071374b152 Remove SPLIT_BINARY 2022-07-24 01:15:54 +02:00
Yakov Olkhovskiy
e5f165d909 Merge branch 'master' into cmake-self-extracting-executable 2022-07-13 16:09:18 -04:00
Robert Schulze
1a7727a254 Prefix overridden add_executable() command with "clickhouse_"
A simple HelloWorld program with zero includes except iostream triggers
a build of ca. 2000 source files. The reason is that ClickHouse's
top-level CMakeLists.txt overrides "add_executable()" to link all
binaries against "clickhouse_new_delete". This links against
"clickhouse_common_io", which in turn has lots of 3rd party library
dependencies ... Without linking "clickhouse_new_delete", the number of
compiled files for "HelloWorld" goes down to ca. 70.

As an example, the self-extracting-executable needs none of its current
dependencies but other programs may also benefit.

In order to restore access to the original "add_executable()", the
overriding version is now prefixed. There is precedent for a
"clickhouse_" prefix (as opposed to "ch_"), for example
"clickhouse_split_debug_symbols". In general prefixing makes sense also
because overriding CMake commands relies on undocumented behavior and is
considered not-so-great practice (*).

(*) https://crascit.com/2018/09/14/do-not-redefine-cmake-commands/
2022-07-11 19:36:18 +02:00
Yakov Olkhovskiy
8a3f124982 add self-extracting to clickhouse-bundle 2022-07-06 22:01:21 -04:00