- The deleted function modelEvaluate() was superseded by
catboostEvaluate().
- Also delete the external model repository, as modelEvaluate() was its
last user. Additionally remove the system view SYSTEM.MODELS for
inspecting the repository.
- SYSTEM RELOAD MODELS is also obsolete. HOWEVER, it was retained and
made a no-op instead of deleted.
Why?
The reason is that RBAC in distributed setups works by storing
privileges (granted and revoked) as plain SQL statements in Keeper.
Nodes read these statements at startup and parse them. If a privilege
for SYSTEM RELOAD MODELS exists but the parser doesn't recognize it,
nodes would fail to come up.
Considered but rejected alternatives:
- Ignore SYSTEM RELOAD MODELS during parsing RBAC privileges and
return an error for regular SYSTEM RELOAD MODELS SQL. Special-case
of no-op behavior, too brittle.
- Remove SYSTEM RELOAD MODELS manually from Keeper via command-line
manipulation of Keeper nodes or via SQL by dropping the privileges.
Needs user intervention during upgrade.
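The retained no-op can be pictured roughly as follows. This is a minimal sketch and not the actual ClickHouse parser: SystemQueryType, parseSystemQuery() and execute() are hypothetical names. The only point is that the statement still parses (so privileges stored as SQL keep loading) while executing it does nothing.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical, heavily simplified stand-ins for the real parser/interpreter types.
enum class SystemQueryType { ReloadConfig, ReloadModels };

// Keeps recognizing the obsolete statement so that stored privileges still parse.
std::optional<SystemQueryType> parseSystemQuery(const std::string & sql)
{
    if (sql == "SYSTEM RELOAD CONFIG")
        return SystemQueryType::ReloadConfig;
    if (sql == "SYSTEM RELOAD MODELS")
        return SystemQueryType::ReloadModels; // obsolete, but must stay parseable
    return std::nullopt; // unknown statement: the caller treats this as a parse failure
}

// Returns true if the query did any work.
bool execute(SystemQueryType type)
{
    switch (type)
    {
        case SystemQueryType::ReloadConfig:
            return true;  // placeholder for the real reload
        case SystemQueryType::ReloadModels:
            return false; // intentionally a no-op
    }
    return false;
}
```

With the rejected first alternative, parseSystemQuery() would need to know whether it is called for RBAC or for a regular query, which is the special-casing the commit wants to avoid.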
This commit moves the catboost model evaluation out of the server
process into the library-bridge binary. This serves two goals: On the
one hand, crashes / memory corruptions of the catboost library no longer
affect the server. On the other hand, we can forbid loading dynamic
libraries in the server (catboost was the last consumer of this
functionality), thus improving security.
SQL syntax:
SELECT
catboostEvaluate('/path/to/model.bin', FEAT_1, ..., FEAT_N) > 0 AS prediction,
ACTION AS target
FROM amazon_train
LIMIT 10
Required configuration:
<catboost_lib_path>/path/to/libcatboostmodel.so</catboost_lib_path>
*** Implementation Details ***
The internal protocol between the server and the library-bridge is
simple:
- HTTP GET on path "/extdict_ping":
A ping, used during the handshake to check if the library-bridge runs.
- HTTP POST on path "extdict_request"
(1) Send a "catboost_GetTreeCount" request from the server to the
bridge, containing a library path (e.g. /home/user/libcatboost.so) and
a model path (e.g. /home/user/model.bin). First, this unloads the
catboost library handler associated to the model path (if it was
loaded), then loads the catboost library handler associated to the
model path, then executes GetTreeCount() on the library handler and
finally sends the result back to the server. Step (1) is called once
by the server from FunctionCatBoostEvaluate::getReturnTypeImpl(). The
library path handler is unloaded in the beginning because it contains
state which may no longer be valid if the user runs
catboost("/path/to/model.bin", ...) more than once and if "model.bin"
was updated in between.
(2) Send "catboost_Evaluate" from the server to the bridge, containing
the model path and the features to run the inference on. Step (2)
is called multiple times (once per chunk) by the server from function
FunctionCatBoostEvaluate::executeImpl(). The library handler for the
given model path is expected to be already loaded by Step (1).
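The bridge-side bookkeeping for steps (1) and (2) can be sketched as below. This is an illustration under assumptions, not the bridge's code: ModelHandler and HandlerRegistry are hypothetical names, and the handler returns placeholder values instead of calling into the catboost library.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a loaded catboost library handler.
struct ModelHandler
{
    std::string library_path;
    std::string model_path;

    size_t getTreeCount() const { return 42; } // placeholder value

    std::vector<double> evaluate(const std::vector<std::vector<double>> & rows) const
    {
        return std::vector<double>(rows.size(), 0.0); // placeholder predictions, one per row
    }
};

class HandlerRegistry
{
public:
    // Step (1): drop a possibly stale handler for this model path, load a fresh one,
    // and report its tree count.
    size_t getTreeCount(const std::string & library_path, const std::string & model_path)
    {
        handlers.erase(model_path);
        auto handler = std::make_shared<ModelHandler>(ModelHandler{library_path, model_path});
        handlers[model_path] = handler;
        return handler->getTreeCount();
    }

    // Step (2): the handler is expected to exist already; otherwise the request is rejected.
    std::vector<double> evaluate(const std::string & model_path, const std::vector<std::vector<double>> & rows) const
    {
        auto it = handlers.find(model_path);
        if (it == handlers.end())
            throw std::runtime_error("No handler loaded for model " + model_path);
        return it->second->evaluate(rows);
    }

private:
    std::map<std::string, std::shared_ptr<ModelHandler>> handlers;
};
```

Keying the registry by model path mirrors the description above: a second getTreeCount() for the same path replaces the old handler, so an updated "model.bin" is picked up.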
Fixes #27870
- In general, it is expected that clickhouse-*-bridges and
clickhouse-server were built from the same source version (e.g. are
upgraded "atomically"). If that is not the case, we should at least
be able to detect the mismatch and abort.
- This commit adds a URL parameter "version", defined in a header shared
by the server and bridges. The bridge returns an error in case of
mismatch.
- The version is *not* sent and checked for "ping" requests (used for
handshake), only for regular requests sent after handshake. This is
because the internally thrown server-side exception due to HTTP
failure does not propagate the exact HTTP error (it only stores the
error as text), and as a result, the server-side handshake code
simply retries in case of error with exponential backoff and finally
fails with a "timeout error". This is reasonable as pings typically
fail due to timeouts. However, without a rework of HTTP exceptions,
version mismatch during ping would also appear as "timeout" which is
too misleading. The behavior may be changed later if needed.
- Note that introducing a version parameter does not represent a
protocol upgrade itself. Bridges older than the server will simply
ignore the field. Only servers older than the bridges receive an error
but such a situation should never occur in practice.
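The bridge-side check amounts to something like the following sketch. Only the parameter name "version" is taken from this commit; the constant's name and value and the helper checkVersion() are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical constant; in the commit the value lives in a header shared by server and bridges.
constexpr int BRIDGE_PROTOCOL_VERSION = 1;

enum class VersionCheck { Ok, Mismatch };

// Bridge side, regular (non-ping) requests only: the "version" URL parameter must match.
// A missing parameter means the server is older than the bridge and is treated as a mismatch.
VersionCheck checkVersion(const std::map<std::string, std::string> & url_params)
{
    auto it = url_params.find("version");
    if (it == url_params.end())
        return VersionCheck::Mismatch;
    return it->second == std::to_string(BRIDGE_PROTOCOL_VERSION) ? VersionCheck::Ok : VersionCheck::Mismatch;
}
```

A bridge older than the server has no such check at all and simply never looks at the parameter.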
Wall of text, sorry, but I also had to document some stuff for myself:
There are three ways to communicate data using HTTP:
- the HTTP verb: for our purposes, PUT and GET,
- the HTTP path: '/ping', '/request' etc.,
- the HTTP URL parameter(s), e.g. 'method=libNew&dictionary_id=1234'
The bridge will use different handlers for communication with the
external dictionary library and for communication with the catboost
library. Handlers are created based on a combination of the HTTP verb
and the HTTP path. More specifically, there will be combinations
- GET + '/extdict_ping'
- PUT + '/extdict_request'
- GET + '/catboost_ping'
- PUT + '/catboost_request'.
For each combination, the bridge expects a certain set of URL
parameters, e.g. for the first combination parameter "dictionary_id" is
expected.
Starting with this commit, the library-bridge creates handlers based on
the first two combinations (the latter two combinations will be added
later). This makes the handler creation mechanism consistent with its
counterpart in xdbc-bridge.
For that, it was necessary to make both IBridgeHelper methods
"getMainURI()" and "getPingURI()" pure virtual so that derived classes
(LibraryBridgeHelper and XDBCBridgeHelper) must provide custom URLs with
custom paths.
Side note 1: Previously, LibraryBridgeHelper sent HTTP URL parameter
"method=ping" during handshake (PING) but the library-bridge ignored
that parameter. We now omit this parameter, i.e.
LibraryBridgeHelper::PING was removed. Again, this makes things
consistent with xdbc-bridge.
Side note 2: xdbc-bridge is unchanged in this commit. Therefore,
XDBCBridgeHelper now uses the HTTP paths previously in the base class.
For some odd reason, XDBCBridgeHelper did not use
IBridgeHelper::getMainURI() - it generates the URLs by itself. I kept it
that way for now but provided an implementation of getMainURI() anyway.
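For illustration, the two pieces described above (pure virtual URI getters on the server side, verb + path dispatch on the bridge side) could look like this. It is a simplified sketch: the *Sketch classes, HandlerKind and selectHandler() are hypothetical, URIs are plain strings for brevity, and the base URL is an assumption.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Server side: the base class no longer provides paths, derived helpers must.
class BridgeHelperSketch
{
public:
    virtual ~BridgeHelperSketch() = default;
    virtual std::string getPingURI() const = 0;
    virtual std::string getMainURI() const = 0;
};

class LibraryBridgeHelperSketch : public BridgeHelperSketch
{
public:
    explicit LibraryBridgeHelperSketch(std::string base_) : base(std::move(base_)) {}
    std::string getPingURI() const override { return base + "/extdict_ping"; }
    std::string getMainURI() const override { return base + "/extdict_request"; }

private:
    std::string base; // e.g. "http://127.0.0.1:9012" (host and port are assumptions)
};

// Bridge side: a handler is chosen from the combination of HTTP verb and HTTP path.
enum class HandlerKind { ExtDictPing, ExtDictRequest, None };

HandlerKind selectHandler(const std::string & verb, const std::string & path)
{
    if (verb == "GET" && path == "/extdict_ping")
        return HandlerKind::ExtDictPing;
    if (verb == "PUT" && path == "/extdict_request")
        return HandlerKind::ExtDictRequest;
    return HandlerKind::None; // the catboost combinations would be added here later
}
```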
- Rename generic file and identifier names in library-bridge to
something more dictionary-specific. This is needed because later on,
catboost will be integrated into library-bridge.
- Also: Some smaller fixes like typos and un-inlining non-performance
critical code.
- The logic remains unchanged in this commit.
- Introduced with the C++20 <bit> header
- The problem with __builtin_c(l|t)z() is that 0 as input has an
undefined result (*) and the code did not always check. The std::
versions do not have this issue.
- In some cases, we continue to use __builtin_c(l|t)z(), (e.g. in
src/Common/BitHelpers.h) because the std:: versions only accept
unsigned inputs (and they also check that) and the casting would be
ugly.
(*) https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
- Add status log message
- Add it to clickhouse-bundle in shared build
- Move clickhouse-su.cpp into su.cpp, since executable does not have
include directories of linked libraries (dbms here), only
clickhouse-lib-su does, hence it cannot find includes
CI: https://github.com/ClickHouse/ClickHouse/runs/7566319416?check_suite_focus=true
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
CI found [1]:
Direct leak of 256 byte(s) in 1 object(s) allocated from:
0 0xd8cb88d in operator new(unsigned long) (/usr/bin/clickhouse+0xd8cb88d) (BuildId: 7a3fd7b485701220)
1 0xde8943e in DB::DisksApp::main() build_docker/../programs/disks/DisksApp.cpp:157:41
2 0x38dca887 in Poco::Util::Application::run() build_docker/../contrib/poco/Util/src/Application.cpp:334:8
3 0xde8d72c in mainEntryClickHouseDisks(int, char**) build_docker/../programs/disks/DisksApp.cpp:219:20
4 0xd8cf47f in main build_docker/../programs/main.cpp:445:12
5 0x7f060ddce082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082) (BuildId: 1878e6b475720c7c51969e69ab2d276fae6d1dee)
CI: https://s3.amazonaws.com/clickhouse-test-reports/39299/37b4b52c12698e711aa931f10aec3909bca287b6/integration_tests__asan__actions__[2/3].html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
A simple HelloWorld program with zero includes except iostream triggers
a build of ca. 2000 source files. The reason is that ClickHouse's
top-level CMakeLists.txt overrides "add_executable()" to link all
binaries against "clickhouse_new_delete". This links against
"clickhouse_common_io", which in turn has lots of 3rd party library
dependencies ... Without linking "clickhouse_new_delete", the number of
compiled files for "HelloWorld" goes down to ca. 70.
As an example, the self-extracting-executable needs none of its current
dependencies but other programs may also benefit.
In order to restore access to the original "add_executable()", the
overriding version is now prefixed. There is precedent for a
"clickhouse_" prefix (as opposed to "ch_"), for example
"clickhouse_split_debug_symbols". In general prefixing makes sense also
because overriding CMake commands relies on undocumented behavior and is
considered not-so-great practice (*).
(*) https://crascit.com/2018/09/14/do-not-redefine-cmake-commands/
- TSA is a static analyzer built by Google which finds race conditions
and deadlocks at compile time.
- It works by associating a shared member variable with a
synchronization primitive that protects it. The compiler can then
check at each access if proper locking happened before. Good
introductions are [0] and [1].
- TSA requires some help by the programmer via annotations. Luckily,
LLVM's libcxx already has annotations for std::mutex, std::lock_guard,
std::shared_mutex and std::scoped_lock. This commit enables them
(--> contrib/libcxx-cmake/CMakeLists.txt).
- Further, this commit adds convenience macros for the low-level
annotations for use in ClickHouse (--> base/defines.h). For
demonstration, they are leveraged in a few places.
- As we compile with "-Wall -Wextra -Weverything", the required compiler
flag "-Wthread-safety-analysis" was already enabled. Negative checks
are an experimental feature of TSA and disabled
(--> cmake/warnings.cmake). Compile times did not increase noticeably.
- TSA is used in a few places with simple locking. I tried TSA also
where locking is more complex. The problem was usually that it is
unclear which data is protected by which lock :-(. But there was
definitely some weird code where locking looked broken. So there is
some potential to find bugs.
*** Limitations of TSA besides the ones listed in [1]:
- The programmer needs to know which lock protects which piece of shared
data. This is not always easy for large classes.
- Two synchronization primitives used in ClickHouse are not annotated in
libcxx:
(1) std::unique_lock: A releasable lock handle, often used together with
std::condition_variable, e.g. to solve producer-consumer problems.
(2) std::recursive_mutex: A re-entrant mutex variant. Its usage can be
considered a design flaw + typically it is slower than a standard
mutex. In this commit, one std::recursive_mutex was converted to
std::mutex and annotated with TSA.
- For free-standing functions (e.g. helper functions) which are passed
shared data members, it can be tricky to specify the associated lock.
This is because the annotations use the normal C++ rules for symbol
resolution.
[0] https://clang.llvm.org/docs/ThreadSafetyAnalysis.html
[1] https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42958.pdf
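A minimal sketch of what an annotated class looks like. The SKETCH_* macro names are invented for this example (they are not the macros added in base/defines.h), and they expand to nothing on compilers without the Clang attributes. Note the analysis only reports anything when std::mutex itself is annotated, e.g. with libcxx annotations enabled as done in this commit.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical macro names for this sketch only.
#if defined(__clang__)
#    define SKETCH_GUARDED_BY(...) __attribute__((guarded_by(__VA_ARGS__)))
#    define SKETCH_REQUIRES(...) __attribute__((requires_capability(__VA_ARGS__)))
#else
#    define SKETCH_GUARDED_BY(...)
#    define SKETCH_REQUIRES(...)
#endif

class Counter
{
public:
    void increment()
    {
        std::lock_guard lock(mutex);
        incrementLocked(); // fine: the lock is held here
    }

    int get() const
    {
        std::lock_guard lock(mutex);
        return value;
    }

private:
    // Callers must already hold `mutex`; TSA checks this at each call site.
    void incrementLocked() SKETCH_REQUIRES(mutex) { ++value; }

    mutable std::mutex mutex;
    int value SKETCH_GUARDED_BY(mutex) = 0; // every access must happen under `mutex`
};
```

Reading `value` in get() without taking the lock is the kind of access TSA is meant to flag at compile time.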
cmake/target.cmake defines macros for the supported platforms, this
commit changes predefined system macros to our own macros.
__linux__ --> OS_LINUX
__APPLE__ --> OS_DARWIN
__FreeBSD__ --> OS_FREEBSD
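In source code, the change looks roughly like the sketch below. The fallback block exists only so that the example compiles outside the ClickHouse build, where the OS_* macros are not defined; osName() is a made-up helper.

```cpp
#include <cassert>
#include <string_view>

// Standalone fallback for this sketch; inside ClickHouse the build system defines these.
#if !defined(OS_LINUX) && !defined(OS_DARWIN) && !defined(OS_FREEBSD)
#    if defined(__linux__)
#        define OS_LINUX 1
#    elif defined(__APPLE__)
#        define OS_DARWIN 1
#    elif defined(__FreeBSD__)
#        define OS_FREEBSD 1
#    endif
#endif

// Code then checks only the project's own macros.
constexpr std::string_view osName()
{
#if defined(OS_LINUX)
    return "linux";
#elif defined(OS_DARWIN)
    return "darwin";
#elif defined(OS_FREEBSD)
    return "freebsd";
#else
    return "unknown";
#endif
}
```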
- changed config.xml/yaml files used by CH's own internal tests which
are (hopefully) not sensitive to mark_cache_size being set or not
- further occurrences exist but changing them seems a bad idea (e.g.
because they are in customer-provided data)
I played around with my local config.xml file. The minimal working example is this:
<?xml version="1.0"?>
<clickhouse>
<mark_cache_size>5368709120</mark_cache_size>
<listen_host>localhost</listen_host>
<tcp_port>9000</tcp_port>
<users_config>users.xml</users_config>
<logger><console>true</console></logger>
</clickhouse>
Not specifying mark_cache_size made the server not start up:
2022.05.18 12:15:06.549078 [ 8728320 ] {} <Error> Application: Not found: mark_cache_size
Looking at ClickHouse's ca. 100 server configuration options +
sub-options, it seems that mark_cache_size is NOT special enough to
require explicit configuration but instead that the behavior was
unintended because no default value was provided.
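The difference between the two lookups can be sketched as follows. Config and both getUInt64() overloads are simplified stand-ins written for this example, and the default of 5368709120 is merely the value from the example config above, used here as an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

using Config = std::map<std::string, std::string>; // hypothetical stand-in for the server config

// Lookup without default: a missing key is an error (the old behavior for mark_cache_size).
uint64_t getUInt64(const Config & config, const std::string & key)
{
    auto it = config.find(key);
    if (it == config.end())
        throw std::runtime_error("Not found: " + key);
    return std::stoull(it->second);
}

// Lookup with default: a missing key silently falls back to the given value.
uint64_t getUInt64(const Config & config, const std::string & key, uint64_t default_value)
{
    auto it = config.find(key);
    return it == config.end() ? default_value : std::stoull(it->second);
}
```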
If you execute 'SYSTEM RELOAD CONFIG' via, e.g., the TCP protocol, then
the reload on port change will wait endlessly for the connection from
which this query was issued, and you will see the following messages in
the logs:
2022.04.28 03:34:57.552513 [ 37101 ] {b41d855c-4dbf-470a-a144-c6ae5a1abda8} <Debug> executeQuery: (from 127.0.0.1:11774) system reload config
...
2022.04.28 03:34:57.710640 [ 37101 ] {b41d855c-4dbf-470a-a144-c6ae5a1abda8} <Information> Application: Stopped listening for http://127.0.0.1:18123
2022.04.28 03:34:57.798774 [ 37101 ] {b41d855c-4dbf-470a-a144-c6ae5a1abda8} <Information> Application: Stopped listening for native protocol (tcp): 127.0.0.1:19000
...
2022.04.28 03:34:57.901375 [ 37101 ] {b41d855c-4dbf-470a-a144-c6ae5a1abda8} <Debug> Application: Server finished: http://127.0.0.1:18123
2022.04.28 03:34:57.901455 [ 37101 ] {b41d855c-4dbf-470a-a144-c6ae5a1abda8} <Trace> Application: Waiting server to finish: native protocol (tcp): 127.0.0.1:19000
2022.04.28 03:34:58.001717 [ 37101 ] {b41d855c-4dbf-470a-a144-c6ae5a1abda8} <Trace> Application: Waiting server to finish: native protocol (tcp): 127.0.0.1:19000
2022.04.28 03:34:58.101881 [ 37101 ] {b41d855c-4dbf-470a-a144-c6ae5a1abda8} <Trace> Application: Waiting server to finish: native protocol (tcp): 127.0.0.1:19000
...
2022.04.28 03:35:01.707951 [ 37101 ] {b41d855c-4dbf-470a-a144-c6ae5a1abda8} <Trace> Application: Waiting server to finish: native protocol (tcp): 127.0.0.1:19000
But waiting for the current connection will never end.
So instead of waiting directly from the query context (SYSTEM RELOAD
CONFIG) do this in background (actually not even in background, but
check on server reload and on exit).
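The idea can be sketched like this (hypothetical types, not the server's classes): servers that stopped listening are parked in a list, and a non-blocking reap() is called on the next reload and on exit instead of waiting inside the query.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a protocol server.
struct ServerSketch
{
    bool listening = true;
    int current_connections = 0;
};

class ServersToStop
{
public:
    // Stop accepting new connections, but do not wait for existing ones.
    void add(std::shared_ptr<ServerSketch> server)
    {
        server->listening = false;
        servers.push_back(std::move(server));
    }

    // Called on the next config reload and on exit: drops servers whose connections are gone.
    // Returns how many servers are still draining. Never blocks.
    size_t reap()
    {
        servers.erase(
            std::remove_if(
                servers.begin(),
                servers.end(),
                [](const std::shared_ptr<ServerSketch> & server) { return server->current_connections == 0; }),
            servers.end());
        return servers.size();
    }

private:
    std::vector<std::shared_ptr<ServerSketch>> servers;
};
```

The connection that issued SYSTEM RELOAD CONFIG keeps its server in the list until the client disconnects, but the query itself returns immediately.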
v0: just don't wait for the servers
v2: fix use-after-free by removing dependency from server in handlers
v3: wait servers in background to avoid use-after-free of the context
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
In case you have different roles for the same user on multiple clusters,
ON CLUSTER query can help to overcome some limitations.
Consider the following example:
- cluster_with_data, dev_user (readonly=2)
- stage_cluster, dev_user (readonly=0)
So when you execute the following query from stage_cluster, it will
be executed successfully, since ON CLUSTER queries have a different
system profile:
DROP DATABASE default ON CLUSTER cluster_with_data
This is not 100% safe, but at least something.
Note that right now only the ON CLUSTER query itself is supported, but
separate clusters are not (i.e. GRANT CLUSTER some_cluster_name TO
default), since right now grants are tied to database+.
v2: on_cluster_queries_require_cluster_grant
v3: fix test and process flags as bit mask
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Some environments may really require LD_LIBRARY_PATH (and some other
variables), so rejecting running clickhouse binaries in such envs is a
backward incompatible change.
So instead of rejecting, let's ignore those env variables, i.e. reexec
binaries without them.
Also note that there are no longer any messages on stderr when some of
these variables are set, since such a message may break some scripts.
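The filtering step before the re-exec can be sketched as below. filterEnvironment() is a hypothetical helper, and which variables besides LD_LIBRARY_PATH get dropped is left to the caller here, since the list is not spelled out above. The filtered environment would then be handed to the exec call.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns a copy of `env` ("NAME=value" entries) without the variables named in `names_to_drop`.
std::vector<std::string> filterEnvironment(
    const std::vector<std::string> & env, const std::vector<std::string> & names_to_drop)
{
    std::vector<std::string> result;
    for (const auto & entry : env)
    {
        // Everything before the first '=' is the variable name (the whole entry if there is no '=').
        const std::string name = entry.substr(0, entry.find('='));

        bool drop = false;
        for (const auto & candidate : names_to_drop)
            if (name == candidate)
                drop = true;

        if (!drop)
            result.push_back(entry);
    }
    return result;
}
```

Nothing is printed when a variable is dropped, matching the note about not writing to stderr.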
Refs: #36340
Follow-up for: #36342