ClickHouse

mirror of https://github.com/ClickHouse/ClickHouse.git synced 2024-11-22 23:52:03 +00:00

Author	SHA1	Message	Date
robot-clickhouse	64fa077148	style: fix style	2022-08-29 20:27:06 +00:00
Robert Schulze	4d511332c4	chore: delete obsolete modelEvaluate() function - superseded by catboostEvaluate() which no longer uses the internal repository for external models - also removed was statement SYSTEM RELOAD MODELS and the monitoring view SYSTEM.SYSTEMMODELS	2022-08-29 20:27:06 +00:00
Robert Schulze	6b2b3c1eb3	feat: implement catboost in library-bridge This commit moves the catboost model evaluation out of the server process into the library-bridge binary. This serves two goals: On the one hand, crashes / memory corruptions of the catboost library no longer affect the server. On the other hand, we can forbid loading dynamic libraries in the server (catboost was the last consumer of this functionality), thus improving security. SQL syntax: SELECT catboostEvaluate('/path/to/model.bin', FEAT_1, ..., FEAT_N) > 0 AS prediction, ACTION AS target FROM amazon_train LIMIT 10 Required configuration: <catboost_lib_path>/path/to/libcatboostmodel.so</catboost_lib_path> * Implementation Details * The internal protocol between the server and the library-bridge is simple: - HTTP GET on path "/extdict_ping": A ping, used during the handshake to check if the library-bridge runs. - HTTP POST on path "extdict_request" (1) Send a "catboost_GetTreeCount" request from the server to the bridge, containing a library path (e.g. /home/user/libcatboost.so) and a model path (e.g. /home/user/model.bin). First, this unloads the catboost library handler associated to the model path (if it was loaded), then loads the catboost library handler associated to the model path, then executes GetTreeCount() on the library handler and finally sends the result back to the server. Step (1) is called once by the server from FunctionCatBoostEvaluate::getReturnTypeImpl(). The library path handler is unloaded in the beginning because it contains state which may no longer be valid if the user runs catboost("/path/to/model.bin", ...) more than once and if "model.bin" was updated in between. (2) Send "catboost_Evaluate" from the server to the bridge, containing the model path and the features to run the inference on. Step (2) is called multiple times (once per chunk) by the server from function FunctionCatBoostEvaluate::executeImpl(). The library handler for the given model path is expected to be already loaded by Step (1). Fixes #27870	2022-08-29 20:26:45 +00:00
Vitaly Baranov	33f72fb011	Merge pull request #40060 from ClickHouse/vitlibar-increase-timeout-for-test_concurrent_backups Increase timeout for test_concurrent_backups	2022-08-29 22:25:56 +02:00
Dan Roscigno	76a45aa750	Merge branch 'master' into add-backup	2022-08-29 16:23:53 -04:00
Dan Roscigno	8e5e1c5e8c	Merge pull request #40774 from DanRoscigno/replace-zh-symlinks Replace zh symlinks	2022-08-29 15:36:08 -04:00
DanRoscigno	d37029dd82	updates for filename changes	2022-08-29 15:20:28 -04:00
Arthur Passos	dd49b44abb	Fix host_regexp hosts file test	2022-08-29 15:58:18 -03:00
DanRoscigno	576b7ea604	updates for filename changes	2022-08-29 14:39:15 -04:00
Dmitry Novik	e25ed9547e	Update src/Interpreters/ProcessList.h	2022-08-29 20:26:37 +02:00
Denny Crane	29e7414697	Update merge-tree-settings.md	2022-08-29 15:25:46 -03:00
Dmitry Novik	865ee5d0d6	Refactor code	2022-08-29 20:24:35 +02:00
Denny Crane	19c3a9c6bf	Update external-dicts-dict-layout.md	2022-08-29 15:20:46 -03:00
Denny Crane	fe0f18f21d	Update external-dicts-dict-layout.md	2022-08-29 15:19:15 -03:00
Arthur Passos	961365c7a4	Fix CaresPTRResolver not reading hosts file	2022-08-29 15:11:39 -03:00
DanRoscigno	687ac1805a	updates for filename changes	2022-08-29 13:59:51 -04:00
Dmitry Novik	1169315580	Add OvercommitTracker blocking	2022-08-29 19:44:05 +02:00
Maksim Kita	88141cae98	Merge pull request #40732 from azat/thread-status-fix-leak Fix memory leak while pushing to MVs w/o query context (from Kafka/...)	2022-08-29 19:36:25 +02:00
Kseniia Sumarokova	0fd961acd4	Merge branch 'master' into fix-race	2022-08-29 19:33:38 +02:00
Kseniia Sumarokova	c5c48e44ea	Merge branch 'master' into fix-mysql-timeouts	2022-08-29 19:33:29 +02:00
kssenii	db2bc31e17	Remove incorrect assertion	2022-08-29 19:32:47 +02:00
Konstantin Morozov	d185b7a332	refactoring: public ctors	2022-08-29 20:19:20 +03:00
FArthur-cmd	862b53b06f	Merge branch 'annoy-2' of https://github.com/Vector-Similarity-Search-for-ClickHouse/ClickHouse into annoy-2	2022-08-29 16:43:39 +00:00
FArthur-cmd	3305af8db2	fix case when query is already matched	2022-08-29 16:43:24 +00:00
DanRoscigno	76a3212fc8	replace symlinks	2022-08-29 12:26:17 -04:00
DanRoscigno	c4b8137d31	replace symlinks	2022-08-29 12:19:50 -04:00
Alexander Tokmakov	eb87e3df16	Merge pull request #40749 from ClickHouse/tavplubix-patch-3 Enable `show_addresses_in_stack_traces` by default	2022-08-29 19:16:36 +03:00
kssenii	545c6c8be4	Fix	2022-08-29 17:50:27 +02:00
Dmitry Novik	cfe509c3de	Block overcommit tracker in ProcessList near allocations	2022-08-29 17:49:01 +02:00
robot-clickhouse	92c14e80f1	Update version_date.tsv and changelogs after v22.3.12.19-lts	2022-08-29 14:52:19 +00:00
Alexander Tokmakov	ff2db8e2a7	update submodule	2022-08-29 16:46:21 +02:00
robot-clickhouse	57980161c9	Update version_date.tsv and changelogs after v22.6.7.7-stable	2022-08-29 14:44:03 +00:00
Filatenkov Artur	d73f661732	Merge branch 'master' into annoy-2	2022-08-29 17:33:13 +03:00
robot-clickhouse	4a229ad08c	Update version_date.tsv and changelogs after v22.7.5.13-stable	2022-08-29 14:29:06 +00:00
kssenii	b1dab84d97	Review fixes	2022-08-29 16:23:14 +02:00
kssenii	0a6c4b9265	Fix	2022-08-29 16:20:53 +02:00
robot-clickhouse	764e2e5ac8	Update version_date.tsv and changelogs after v22.8.3.13-lts	2022-08-29 14:05:36 +00:00
kssenii	877ade9a50	Merge remote-tracking branch 'upstream/master' into fix-race	2022-08-29 16:05:27 +02:00
Alexander Tokmakov	1c6dea52e0	Update config.xml	2022-08-29 15:50:05 +03:00
Vladimir C	5cbe7e0846	Merge pull request #40548 from ClickHouse/vdimir/warn-suppress-40330 Add config option warning_supress_regexp	2022-08-29 14:02:00 +02:00
Alexander Tokmakov	a16d4dd605	Merge pull request #40747 from ClickHouse/revert-40710-DWARF-5 Revert "Support for DWARF-5 in in house DWARF parser"	2022-08-29 14:26:24 +03:00
Alexander Tokmakov	69387acffa	Revert "Support for DWARF-5 in in house DWARF parser"	2022-08-29 14:25:53 +03:00
Alexander Tokmakov	8d90d30d37	Merge pull request #40589 from ClickHouse/remove_wrong_code_from_mutations Remove wrong code for skipping mutations in MergeTree	2022-08-29 14:18:59 +03:00
Alexander Tokmakov	eda0582ec0	Merge pull request #40641 from ClickHouse/fix_startup_of_dropped_replica Do not try to startup dropped replica	2022-08-29 14:15:15 +03:00
Vitaly Baranov	2bec3d3a7c	Increase timeout for test_concurrent_backups	2022-08-29 13:13:43 +02:00
Azat Khuzhin	f9812d9917	Fix memory leak while pushing to MVs w/o query context (from Kafka/...) While pushing to MVs, there is low-level code that creates ThreadGroupStatus/ThreadStatus; it is required to gather some metrics for system.query_views_log. But one should not use ThreadGroupStatus of the MainThreadStatus, since this structure can hold some state that may not be cleaned, plus this may be racy; instead it is better to create a new ThreadGroupStatus and attach it. Also this place misses detachQuery(), and because of this it leaks ThreadGroupStatus::finished_threads_counters_memory. But it is only a problem when pushing to MVs is done w/o query context (i.e. from Kafka/...), since when it has query context detachQuery() will be called eventually. Before this patch series, when I tried the reproducer with 500 MVs attached to Kafka engine (that @den-crane suggested), the jemalloc report looked like this: $ ../jeprof --text ~/ch/tmp/upstream/clickhouse-binary --base jeprof.44384.0.i0.heap jeprof.44384.167.i167.heap Using local file /home/azat/ch/tmp/upstream/clickhouse-binary. Using local file jeprof.44384.167.i167.heap. Total: 915.6 MB 910.7 99.5% 99.5% 910.7 99.5% Snapshot (inline) 9.5 1.0% 100.5% 9.5 1.0% std::__1::__libcpp_operator_new (inline) 0.5 0.1% 100.6% 0.5 0.1% DB::TasksStatsCounters::create And with focus on this place: $ ../jeprof --focus Snapshot --text ~/ch/tmp/upstream/clickhouse-binary --base jeprof.44384.0.i0.heap jeprof.44384.167.i167.heap Using local file /home/azat/ch/tmp/upstream/clickhouse-binary. Using local file jeprof.44384.167.i167.heap. Total: 915.6 MB 910.7 100.0% 100.0% 910.7 100.0% Snapshot (inline) 0.0 0.0% 100.0% 910.7 100.0% DB::QueryPipeline::reset 0.0 0.0% 100.0% 910.7 100.0% DB::StorageKafka::streamToViews 0.0 0.0% 100.0% 910.7 100.0% DB::StorageKafka::threadFunc 0.0 0.0% 100.0% 910.7 100.0% ProfileEvents::Counters::getPartiallyAtomicSnapshot 0.0 0.0% 100.0% 910.7 100.0% ~ThreadStatus 0.0 0.0% 100.0% 910.7 100.0% ~ViewRuntimeData 0.0 0.0% 100.0% 910.7 100.0% ~ViewRuntimeStats (inline) Actually this report does not look great (you understand it because I stripped it), because --text is not that smart, but if you use --pdf for the report you will see the stacktrace (will attach pdf to the pull request). But after this patch series the process RSS does not go beyond ~700MiB. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-08-29 11:36:33 +02:00
Azat Khuzhin	6da5707f8f	Fix possible missing detachQuery() in case of exception in readers This can create leaks, since detachQuery() is responsible for cleaning, e.g. ThreadGroupStatus::finished_threads_counters_memory Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-08-29 11:30:17 +02:00
Azat Khuzhin	b16891da8d	Avoid using of ThreadGroupStatus of the MainThreadStatus One should not use MainThreadStatus, since ThreadGroupStatus can hold some states, and it is better not to play with this, since this may create leaks. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-08-29 11:30:17 +02:00
Azat Khuzhin	9fff08eac7	WriteBufferFromS3: remove unused ThreadGroupStatus Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-08-29 11:30:17 +02:00
FArthur-cmd	c6e45fe690	remove build with UBSan	2022-08-29 09:18:15 +00:00

96489 Commits