ClickHouse

mirror of https://github.com/ClickHouse/ClickHouse.git synced 2024-11-15 12:14:18 +00:00

Author	SHA1	Message	Date
Yakov Olkhovskiy	c6b20cd5ed	Merge pull request #37187 from Algunenano/floating_seconds Allow decimal values in settings using seconds	2022-05-30 20:33:47 -04:00
Anton Popov	30f8eb800a	optimize function coalesce with two arguments	2022-05-30 22:29:35 +00:00
Dmitry Novik	9d04305a5a	Update Settings.h	2022-05-30 23:00:28 +02:00
mergify[bot]	55913cf8e1	Merge branch 'master' into turn_on_s3_tests	2022-05-30 20:52:40 +00:00
Kseniia Sumarokova	18bda56e4c	Merge pull request #37655 from ClickHouse/kssenii-patch-3-1 Fix hung check	2022-05-30 21:22:12 +02:00
mergify[bot]	b43cfd056f	Merge branch 'master' into floating_seconds	2022-05-30 19:18:35 +00:00
Nikolai Kochetov	77b07dd0a8	Merge pull request #37163 from ClickHouse/grouping-function Add GROUPING function	2022-05-30 20:45:04 +02:00
Robert Schulze	ad12adc31c	Measure and rework internal re2 caching This commit is based on local benchmarks of ClickHouse's re2 caching. Question 1: ----------------------------------------------------------- Is pattern caching useful for queries with const LIKE/REGEX patterns? E.g. SELECT LIKE(col_haystack, '%HelloWorld') FROM T; The short answer is: no. Runtime is (unsurprisingly) dominated by pattern evaluation + other stuff going on in queries, but definitely not pattern compilation. For space reasons, I omit details of the local experiments. (Side note: the current caching scheme is unbounded in size which poses a DoS risk (think of multi-tenancy). This risk is more pronounced when unbounded caching is used with non-const patterns ..., see next question) Question 2: ----------------------------------------------------------- Is pattern caching useful for queries with non-const LIKE/REGEX patterns? E.g. SELECT LIKE(col_haystack, col_needle) FROM T; I benchmarked five caching strategies: 1. no caching as a baseline (= recompile for each row) 2. unbounded cache (= threadsafe global hash-map) 3. LRU cache (= threadsafe global hash-map + LRU queue) 4. lightweight local cache 1 (= not threadsafe local hashmap with collision list which grows to a certain size (here: 10 elements) and afterwards never changes) 5. lightweight local cache 2 (not threadsafe local hashmap without collision list in which a collision replaces the stored element, idea by Alexey) ... using a haystack of 2 mio strings and A). 2 mio distinct simple patterns B). 10 simple patterns C) 2 mio distinct complex patterns D) 10 complex patterns For A) and C), caching does not help but these queries still allow judging the static overhead of caching on query runtimes. B) and D) are extreme but common cases in practice. They include queries like "SELECT ... WHERE LIKE (col_haystack, flag ? '%pattern1%' : '%pattern2%')". Caching should help significantly. 
Because LIKE patterns are internally translated to re2 expressions, I show only measurements for MATCH queries. Results in sec, averaged over multiple measurements: 1.A): 2.12 B): 1.68 C): 9.75 D): 9.45 2.A): 2.17 B): 1.73 C): 9.78 D): 9.47 3.A): 9.8 B): 0.63 C): 31.8 D): 0.98 4.A): 2.14 B): 0.29 C): 9.82 D): 0.41 5.A) 2.12 / 2.15 / 2.26 B) 1.51 / 0.43 / 0.30 C) 9.97 / 9.88 / 10.13 D) 5.70 / 0.42 / 0.43 (10/100/1000 buckets, resp. 10/1/0.1% collision rate) Evaluation: 1. This is the baseline. I was surprised that complex patterns (C, D) slow down the queries so badly compared to simple patterns (A, B). The runtime includes evaluation costs, but as caching only helps with compilation, and looking at 4.D and 5.D, compilation makes up over 90% of the runtime! 2. No speedup compared to 1, probably due to locking overhead. The cache is unbounded, and in experiments with data sets > 2 mio rows, 2. is the only scheme to throw OOM exceptions which is not acceptable. 3. Unique patterns (A and C) lead to thrashing of the LRU cache and very bad runtimes due to LRU queue maintenance and locking. Works pretty well however with few distinct patterns (B and D). 4. This scheme is tailored to queries B and D where it performs pretty well. More importantly, the caching is lightweight enough to not deteriorate performance on datasets A and C. 5. After some tuning of the hash map size, 100 buckets seem optimal to be in the same ballpark with 10 distinct patterns as 4. Performance also does not deteriorate on A and C compared to the baseline. Unlike 4., this scheme behaves LRU-like and can adjust to changing pattern distributions. In conclusion, this commit implements two things: 1. Based on Q1, pattern search with const needle no longer uses caching. This applies to LIKE and MATCH + a few (exotic) other SQL functions. The code for the unbounded caching was removed. 2. Based on Q2, pattern search with non-const needles now uses method 5.	2022-05-30 20:00:35 +02:00
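The commit above describes caching strategy 5 as a non-threadsafe local hash map without a collision list, in which a colliding pattern replaces the stored element. A minimal Python sketch of that idea follows. It is not ClickHouse's C++ implementation: the class name `ReplacingPatternCache`, the `compilations` counter, and the helper `match_rows` are hypothetical, and Python's `re` module stands in for re2. The default of 100 buckets mirrors the size the commit message reports as a good trade-off.

```python
import re
from typing import Iterable, List, Optional, Tuple


class ReplacingPatternCache:
    """Fixed-size, single-threaded cache with one slot per bucket.

    Assumption for illustration: there is no collision list, so a pattern
    that hashes to an occupied bucket simply evicts the previous occupant.
    """

    def __init__(self, num_buckets: int = 100) -> None:
        self._slots: List[Optional[Tuple[str, re.Pattern]]] = [None] * num_buckets
        self.compilations = 0  # number of cache misses, for inspection only

    def get(self, pattern: str) -> re.Pattern:
        index = hash(pattern) % len(self._slots)
        slot = self._slots[index]
        if slot is not None and slot[0] == pattern:
            return slot[1]  # hit: reuse the compiled pattern
        compiled = re.compile(pattern)  # miss: compile and overwrite the slot
        self.compilations += 1
        self._slots[index] = (pattern, compiled)
        return compiled


def match_rows(haystacks: Iterable[str], needles: Iterable[str],
               cache: ReplacingPatternCache) -> List[bool]:
    """Evaluate one (possibly different) pattern per row, as with a non-const needle."""
    return [cache.get(n).search(h) is not None for h, n in zip(haystacks, needles)]
```

Because every bucket holds at most one entry, memory stays bounded no matter how many distinct patterns appear, and a frequently used pattern re-enters the cache after being evicted, which is roughly the LRU-like behaviour the commit message attributes to this scheme.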
Alexander Gololobov	e2dd6f6249	Removed prewhere_info.alias_actions	2022-05-30 19:58:23 +02:00
Han Fei	e15cdec39c	address comments	2022-05-31 01:46:31 +08:00
Anton Popov	52d3791eb9	Merge pull request #37600 from CurtizJ/fix-with-fill-interval Fix `WITH FILL` with negative intervals in `STEP` clause	2022-05-30 19:43:12 +02:00
alesapin	60b910a4de	Fix	2022-05-30 19:04:25 +02:00
alesapin	6db44f633f	Merge pull request #37641 from azat/keeper-list-watches keeper: store only unique session IDs for watches (should fix SIGKILL in stress tests)	2022-05-30 18:55:52 +02:00
mergify[bot]	d4e722bbfa	Merge branch 'master' into http-named-collection	2022-05-30 16:40:18 +00:00
Vitaly Baranov	486a11a5e2	Fix flaky test test_row_policy.	2022-05-30 18:34:28 +02:00
Han Fei	af86900c52	Merge branch 'hanfei/zk-write' of github.com:hanfei1991/ClickHouse into hanfei/zk-write	2022-05-31 00:17:38 +08:00
Han Fei	42fca8d5f0	address comments	2022-05-31 00:17:32 +08:00
Han Fei	a464b10afe	Update src/Storages/System/StorageSystemZooKeeper.cpp Co-authored-by: Alexander Tokmakov <tavplubix@gmail.com>	2022-05-30 23:52:37 +08:00
Han Fei	194445646a	Update src/Storages/System/StorageSystemZooKeeper.cpp Co-authored-by: Alexander Tokmakov <tavplubix@gmail.com>	2022-05-30 23:47:43 +08:00
vdimir	8a3f4bda62	Fix columns number mismatch in cross join	2022-05-30 15:40:15 +00:00
Kseniia Sumarokova	b1ba7b7027	Fix build	2022-05-30 17:30:59 +02:00
Kseniia Sumarokova	1869adfd7d	Update FileCache.cpp	2022-05-30 16:06:58 +02:00
Amos Bird	217f492264	Fix test	2022-05-30 21:38:24 +08:00
Amos Bird	b68e8efaf0	Fix joinGet with cannot be null type	2022-05-30 21:01:27 +08:00
Alexander Tokmakov	351956d108	Merge pull request #37640 from azat/transaction-fix Fix excessive LIST requests to coordinator for transactions	2022-05-30 15:46:52 +03:00
mergify[bot]	87e896ae05	Merge branch 'master' into try_fix_tests	2022-05-30 12:44:35 +00:00
alesapin	84ed5aa6b0	No recursive in CI	2022-05-30 14:41:27 +02:00
Kruglov Pavel	0615866aea	Merge pull request #37450 from Avogar/check-format-on-storage-creation Check format name on storage creation	2022-05-30 14:23:20 +02:00
avogar	139a7e19a9	Fix comments	2022-05-30 11:43:29 +00:00
alesapin	87baabb1a8	Followup fix	2022-05-30 13:34:42 +02:00
alesapin	362fa745e6	Ignore broken metadata	2022-05-30 13:32:12 +02:00
Amos Bird	6525bfc4cd	Avoid context copy for InterpreterSelects	2022-05-30 17:08:12 +08:00
Maksim Kita	55f1faf77d	Merge pull request #37637 from ucasfl/typo fix typo	2022-05-30 11:03:35 +02:00
Maksim Kita	53d848f187	Fixed tests	2022-05-30 10:53:04 +02:00
Azat Khuzhin	d86181d3cd	keeper: store only unique session IDs for watches This should speed up keeper, especially in case of incorrect usage (like the case that was fixed in #37640), and especially in non-release builds. And also this should fix SIGKILL in stress tests. You will find some details for one of such SIGKILL in `<details>` tag [1]: <details> $ pigz -cd clickhouse-server.stress.log.gz \| tail 2022.05.27 16:17:24.882971 [ 637 ] {} <Trace> BackgroundSchedulePool/BgSchPool: Waiting for threads to finish. 2022.05.27 16:17:24.896749 [ 637 ] {} <Debug> MemoryTracker: Peak memory usage (for query): 4.09 MiB. 2022.05.27 16:17:24.907163 [ 637 ] {} <Debug> Application: Shut down storages. 2022.05.27 16:17:24.907233 [ 637 ] {} <Debug> Application: Waiting for current connections to servers for tables to finish. 2022.05.27 16:17:24.934335 [ 637 ] {} <Information> Application: Closed all listening sockets. Waiting for 1 outstanding connections. 2022.05.27 16:17:29.843491 [ 637 ] {} <Information> Application: Closed connections to servers for tables. But 1 remain. Probably some tables of other users cannot finish their connections after context shutdown. 2022.05.27 16:17:29.843632 [ 637 ] {} <Debug> KeeperDispatcher: Shutting down storage dispatcher 2022.05.27 16:17:34.612616 [ 688 ] {} <Test> virtual Coordination::ZooKeeperRequest::~ZooKeeperRequest(): Processing of request xid=2147483647 took 10000 ms 2022.05.27 16:17:54.612109 [ 3176 ] {} <Debug> KeeperTCPHandler: Session #12 expired 2022.05.27 16:19:59.823038 [ 635 ] {} <Fatal> Application: Child process was terminated by signal 9 (KILL). If it is not done by 'forcestop' command or manually, the possible cause is OOM Killer (see 'dmesg' and look at the '/var/log/kern.log' for the details). 
Thread 26 (Thread 0x7f1c7703f700 (LWP 708)): 0 0x000000000b074b2a in __tsan::MemoryAccessImpl(__tsan::ThreadState, unsigned long, int, bool, bool, unsigned long long, __tsan::Shadow) () 1 0x000000000b08630c in __tsan::MemoryAccessRange(__tsan::ThreadState, unsigned long, unsigned long, unsigned long, bool) () 2 0x000000000b01ff03 in memmove () 3 0x000000001bbc8996 in std::__1::__move<long, long> (__first=0xb8600000d83304, __last=<optimized out>, __result=0x7f1c021cd000) at ../contrib/libcxx/include/__algorithm/move.h:57 4 std::__1::move<long, long> (__first=0xb8600000d83304, __last=<optimized out>, __result=0x7f1c021cd000) at ../contrib/libcxx/include/__algorithm/move.h:70 5 std::__1::vector<long, std::__1::allocator<long> >::erase (this=0x7b1400584c48, __position=...) at ../contrib/libcxx/include/vector:1608 6 DB::KeeperStorage::clearDeadWatches (this=0x7b5800001ad8, this@entry=0x7b5800001800, session_id=session_id@entry=12) at ../src/Coordination/KeeperStorage.cpp:1228 7 0x000000001bbc5c55 in DB::KeeperStorage::processRequest (this=0x7b5800001800, zk_request=..., session_id=12, time=1, new_last_zxid=..., check_acl=true) at ../src/Coordination/KeeperStorage.cpp:1122 8 0x000000001bba06a3 in DB::KeeperStateMachine::commit (this=<optimized out>, log_idx=3549, data=...) at ../src/Coordination/KeeperStateMachine.cpp:143 9 0x000000001bba6193 in nuraft::state_machine::commit_ext (this=0x7b4c00001f98, params=...) 
at ../contrib/NuRaft/include/libnuraft/state_machine.hxx:75 10 0x00000000202c5a55 in nuraft::raft_server::commit_app_log (this=this@entry=0x7b6c00002a18, idx_to_commit=idx_to_commit@entry=3549, le=..., need_to_handle_commit_elem=true, initial_commit_exec=false) at ../contrib/NuRaft/src/handle_commit.cxx:311 11 0x00000000202c4f98 in nuraft::raft_server::commit_in_bg_exec (this=<optimized out>, this@entry=0x7b6c00002a18, timeout_ms=timeout_ms@entry=0, initial_commit_exec=false) at ../contrib/NuRaft/src/handle_commit.cxx:241 12 0x00000000202c4613 in nuraft::raft_server::commit_in_bg (this=this@entry=0x7b6c00002a18) at ../contrib/NuRaft/src/handle_commit.cxx:149 ... Thread 28 (Thread 0x7f1c7603d700 (LWP 710)): 0 0x00007f1d22a6d110 in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0 1 0x00007f1d22a650a3 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0 2 0x000000000b0337b0 in pthread_mutex_lock () 3 0x00000000221884da in std::__1::__libcpp_mutex_lock (__m=0x7b4c00002088) at ../contrib/libcxx/include/__threading_support:303 4 std::__1::mutex::lock (this=0x7b4c00002088) at ../contrib/libcxx/src/mutex.cpp:33 5 0x000000001bba4188 in std::__1::lock_guard<std::__1::mutex>::lock_guard (__m=..., this=<optimized out>) at ../contrib/libcxx/include/__mutex_base:91 6 DB::KeeperStateMachine::getDeadSessions (this=0x7b4c00001f98) at ../src/Coordination/KeeperStateMachine.cpp:360 7 0x000000001bb79b4b in DB::KeeperServer::getDeadSessions (this=0x7b4400012700) at ../src/Coordination/KeeperServer.cpp:572 8 0x000000001bb64d1a in DB::KeeperDispatcher::sessionCleanerTask (this=<optimized out>, this@entry=0x7b640001c218) at ../src/Coordination/KeeperDispatcher.cpp:399 ... 
Thread 1 (Thread 0x7f1d227148c0 (LWP 637)): 0 0x00007f1d22a69376 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0 1 0x000000000b0895e0 in __tsan::call_pthread_cancel_with_cleanup(int ()(void), void ()(void), void) () 2 0x000000000b017091 in pthread_cond_wait () 3 0x0000000020569d98 in Poco::EventImpl::waitImpl (this=0x7b2000008798) at ../contrib/poco/Foundation/src/Event_POSIX.cpp:106 4 0x000000001bb636cf in Poco::Event::wait (this=0x7b2000008798) at ../contrib/poco/Foundation/include/Poco/Event.h:97 5 ThreadFromGlobalPool::join (this=<optimized out>) at ../src/Common/ThreadPool.h:217 6 DB::KeeperDispatcher::shutdown (this=0x7b640001c218) at ../src/Coordination/KeeperDispatcher.cpp:322 7 0x0000000019ca8bfc in DB::Context::shutdownKeeperDispatcher (this=<optimized out>) at ../src/Interpreters/Context.cpp:2111 8 0x000000000b0a979b in DB::Server::main(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&)::$_9::operator()() const (this=0x7ffcde44f0a0) at ../programs/server/Server.cpp:1407 </details> [1]: https://s3.amazonaws.com/clickhouse-test-reports/37593/2613149f6bf4f242bbbf2c3c8539b5176fd77286/stress_test__thread__actions_.html Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-05-30 11:50:48 +03:00
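The stack trace in the commit above shows time being spent in `std::vector<long>::erase` inside `DB::KeeperStorage::clearDeadWatches`. The following Python sketch illustrates why storing only unique session IDs per watched path helps. It is a simplified model, not the actual KeeperStorage code: the class `WatchRegistry` and its methods are hypothetical names chosen for the example.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class WatchRegistry:
    """Tracks which sessions watch which paths, storing each session ID once per path."""

    def __init__(self) -> None:
        # path -> set of session IDs (a set deduplicates repeated registrations)
        self.watches: DefaultDict[str, Set[int]] = defaultdict(set)
        # session ID -> set of watched paths, used to clean up an expired session
        self.sessions: DefaultDict[int, Set[str]] = defaultdict(set)

    def add_watch(self, path: str, session_id: int) -> None:
        self.watches[path].add(session_id)
        self.sessions[session_id].add(path)

    def clear_dead_watches(self, session_id: int) -> None:
        """Remove every watch owned by an expired session."""
        for path in self.sessions.pop(session_id, set()):
            ids = self.watches.get(path)
            if ids is None:
                continue
            ids.discard(session_id)  # set removal: no element shifting as in a vector
            if not ids:
                del self.watches[path]
```

With a plain list, a client that registers the same watch millions of times leaves millions of duplicate entries, and removing them one by one shifts the remaining elements on every erase. With a set, the repeated registrations collapse into a single entry and cleanup touches each watched path once.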
Azat Khuzhin	78bd47d8df	Fix excessive LIST requests to coordinator for transactions In [1] there were only a few transactions, but lots of List requests for /test/clickhouse/txn/log: $ clickhouse-local --format TSVWithNamesAndTypes --file zookeeper_log.tsv.gz -q "select * except('path\|session_id\|event_time\|thread_id\|event_date\|xid') apply(x->groupUniqArray(x)), path, session_id, min(event_time), max(event_time), count() from table where has_watch and type = 'Request' group by path, session_id order by count() desc limit 1 format Vertical" Row 1: ────── groupUniqArray(type): ['Request'] groupUniqArray(query_id): ['','62d75128-9031-48a5-87ba-aec3f0b591c6'] groupUniqArray(address): ['::1'] groupUniqArray(port): [9181] groupUniqArray(has_watch): [1] groupUniqArray(op_num): ['List'] groupUniqArray(data): [''] groupUniqArray(is_ephemeral): [0] groupUniqArray(is_sequential): [0] groupUniqArray(version): [] groupUniqArray(requests_size): [0] groupUniqArray(request_idx): [0] groupUniqArray(error): [] groupUniqArray(watch_type): [] groupUniqArray(watch_state): [] groupUniqArray(stat_version): [0] groupUniqArray(stat_cversion): [0] groupUniqArray(stat_dataLength): [0] groupUniqArray(stat_numChildren): [0] groupUniqArray(children): [[]] path: /test/clickhouse/txn/log session_id: 1 min(event_time): 2022-05-27 12:54:09.025897 max(event_time): 2022-05-27 13:37:12.846314 <!-- last transaction was at 12:54, see server log count(): 3673675 <-- huge [1]: https://s3.amazonaws.com/clickhouse-test-reports/37593/2613149f6bf4f242bbbf2c3c8539b5176fd77286/stateless_tests__debug__actions__[1/3].html Server log: $ pigz -cd clickhouse-server.log.gz \| fgrep TransactionLog: \| head -1 2022.05.27 12:54:09.026852 [ 5018 ] {62d75128-9031-48a5-87ba-aec3f0b591c6} <Trace> TransactionLog: Loading 33 entries from /test/clickhouse/txn/log: csn-0000000000..csn-0000000032 $ pigz -cd clickhouse-server.log.gz \| fgrep TransactionLog: \| tail -1 2022.05.27 12:54:58.909222 [ 509 ] {} <Test> TransactionLog: Closing 
readonly transaction (177, 38, 41b51ff1-bcba-43bf-bcea-e97ad05f6040) $ pigz -cd clickhouse-server.log.gz \| fgrep 62d75128-9031-48a5-87ba-aec3f0b591c6 \| tail -1 2022.05.27 12:54:09.064857 [ 5018 ] {62d75128-9031-48a5-87ba-aec3f0b591c6} <Debug> MemoryTracker: Peak memory usage (for query): 0.00 B. Fixes: #37398 (cc @tavplubix) Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-05-30 11:39:34 +03:00
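The diagnosis in the commit above groups watch requests by `path` and `session_id` and sorts by count to find the noisiest pair. The same aggregation can be sketched in Python. The column names (`has_watch`, `type`, `path`, `session_id`) are taken from the query in the commit message; the function name `count_watch_requests` and the idea of passing rows as dictionaries are assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, List, Mapping, Tuple


def count_watch_requests(rows: Iterable[Mapping[str, object]]) -> List[Tuple[Tuple[str, int], int]]:
    """Count watch requests per (path, session_id), most frequent first.

    Mirrors: WHERE has_watch AND type = 'Request'
             GROUP BY path, session_id ORDER BY count() DESC
    """
    counts: Counter = Counter()
    for row in rows:
        if row["has_watch"] and row["type"] == "Request":
            counts[(row["path"], row["session_id"])] += 1
    return counts.most_common()
```

Taking the first element of the returned list corresponds to the `limit 1` in the original query.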
Sergei Trifonov	2589f78135	Fix misleading arrayFunction description	2022-05-30 10:32:22 +02:00
mergify[bot]	7a2460233b	Merge branch 'master' into try_fix_tests	2022-05-30 08:16:29 +00:00
flynn	9ffa5f5e0d	fix typo	2022-05-30 08:10:16 +00:00
Alexey Milovidov	33b2a50740	Merge pull request #37630 from wangqinghuan/add-clickcat add clickcat intro	2022-05-30 04:25:22 +03:00
qinghuan wang	ded9f5675d	Update gui.md	2022-05-30 09:22:06 +08:00
alesapin	6f35c28592	Fix style	2022-05-30 00:29:30 +02:00
Alexey Milovidov	746ff42239	Update gui.md	2022-05-30 01:13:03 +03:00
alesapin	c32b6076fb	Remove stranges from code	2022-05-30 00:12:33 +02:00
mergify[bot]	8ceafbe1fa	Merge branch 'master' into try_fix_tests	2022-05-29 18:04:44 +00:00
alesapin	753bcee954	Fix lazy mark load	2022-05-29 18:38:09 +02:00
alesapin	aecab57e17	Merge pull request #37629 from ClickHouse/fix_failure Fix refactoring issue	2022-05-29 18:35:54 +02:00
mergify[bot]	df9f7c59a2	Merge branch 'master' into try_fix_tests	2022-05-29 15:04:48 +00:00
alesapin	59a070778d	More quiet logging for S3 tests	2022-05-29 14:48:04 +02:00
alesapin	0e8ab36913	Merge branch 'master' into turn_on_s3_tests	2022-05-29 14:37:10 +02:00


90122 Commits