ClickHouse

mirror of https://github.com/ClickHouse/ClickHouse.git synced 2024-11-14 11:33:46 +00:00

Author	SHA1	Message	Date
Sergei Trifonov	f86b352375	Merge pull request #37676 from ClickHouse/verbose-sanity-checks more verbose sanity checks	2022-05-31 17:29:27 +02:00
Alexey Milovidov	4bb04f913f	Fix clang-tidy-14	2022-05-31 17:20:07 +02:00
Antonio Andelic	49f815060a	Use tar for logs	2022-05-31 15:18:44 +00:00
Maksim Kita	bacee7f19c	Merge pull request #37195 from kitaisreal/merging-sorted-algorithm-single-column-specialization MergingSortedAlgorithm single column specialization	2022-05-31 16:46:18 +02:00
Antonio Andelic	3e71a716f5	Enable only jepsen tests	2022-05-31 13:55:01 +00:00
Antonio Andelic	737d6ab1e3	Merge branch 'master' into tiny_fixes_for_jepsen	2022-05-31 13:53:55 +00:00
Antonio Andelic	792adb0576	Update jepsen and scp	2022-05-31 13:53:45 +00:00
Yakov Olkhovskiy	4b427336e3	tests with overridden and appended parameters	2022-05-31 09:37:34 -04:00
Dmitry Novik	b41fe00f31	Merge pull request #37542 from azat/grouping-sets-fix-optimize_aggregation_in_order Prohibit optimize_aggregation_in_order with GROUPING SETS (fixes LOGICAL_ERROR)	2022-05-31 15:31:45 +02:00
Dmitry Novik	f58623a375	Merge pull request #37593 from azat/union-type-cast-resubmit Fix converting types for UNION queries (may produce LOGICAL_ERROR)	2022-05-31 15:27:50 +02:00
mergify[bot]	ba49c6bb46	Merge branch 'master' into memory-overcommit-improvement	2022-05-31 13:17:06 +00:00
alesapin	473b0bd0db	Merge pull request #37604 from ClickHouse/turn_on_s3_tests Turn on s3 tests to red mode	2022-05-31 15:01:24 +02:00
kssenii	c2087b3145	Fix	2022-05-31 14:38:11 +02:00
Kruglov Pavel	7cc87d9a65	Merge pull request #37537 from Avogar/skip-first-lines Allow to skip some of the first lines in CSV/TSV formats	2022-05-31 14:26:21 +02:00
mergify[bot]	d85c3ec69e	Merge branch 'master' into turn_on_s3_tests	2022-05-31 11:58:16 +00:00
Antonio Andelic	582be42329	Wait for leader election	2022-05-31 11:53:46 +00:00
Sergei Trifonov	026e073b0b	minor improvement	2022-05-31 13:50:09 +02:00
Alexander Tokmakov	30a7b07d97	Merge pull request #37658 from vitlibar/fix-flaky-test_row_policy Fix flaky test test_row_policy	2022-05-31 12:59:50 +03:00
alesapin	65057bf8c4	Merge pull request #37616 from ClickHouse/remove-resursive-submodules Remove recursive submodules	2022-05-31 11:58:04 +02:00
Kseniia Sumarokova	73ed9c3977	Merge pull request #37619 from Vxider/wv-fix-table-identifier Fix bugs in WindowView when using table identifier	2022-05-31 11:07:11 +02:00
Robert Schulze	32c810fd35	Merge pull request #37644 from ClickHouse/fix-amqp-cpp-cassandra-dependencies Disable amqp-cpp and cassandra build if libuv is disabled	2022-05-31 10:47:38 +02:00
Robert Schulze	557bb2d235	Disable amqp-cpp and cassandra build if libuv is disabled On macOS/GCC, the libuv build is disabled due to a compiler bug. This is now propagated to the dependent libraries amqp-cpp and cassandra. Oddly enough, the Mac/GCC build had been broken since at least Jan 2022 without anyone noticing.	2022-05-31 10:34:03 +02:00
Sergei Trifonov	7e95bf31b2	more verbose sanity checks	2022-05-31 09:26:26 +02:00
Yakov Olkhovskiy	873ac9f8ff	Merge pull request #37540 from ClickHouse/feature-server-certificate showCertificate function implementation	2022-05-31 02:50:03 -04:00
Alexey Milovidov	bcbd6b802f	Fix clang-tidy-14	2022-05-31 04:19:08 +02:00
Yakov Olkhovskiy	c6b20cd5ed	Merge pull request #37187 from Algunenano/floating_seconds Allow decimal values in settings using seconds	2022-05-30 20:33:47 -04:00
Anton Popov	30f8eb800a	optimize function coalesce with two arguments	2022-05-30 22:29:35 +00:00
Dmitry Novik	9d04305a5a	Update Settings.h	2022-05-30 23:00:28 +02:00
mergify[bot]	55913cf8e1	Merge branch 'master' into turn_on_s3_tests	2022-05-30 20:52:40 +00:00
Kseniia Sumarokova	18bda56e4c	Merge pull request #37655 from ClickHouse/kssenii-patch-3-1 Fix hung check	2022-05-30 21:22:12 +02:00
mergify[bot]	b43cfd056f	Merge branch 'master' into floating_seconds	2022-05-30 19:18:35 +00:00
Nikolai Kochetov	77b07dd0a8	Merge pull request #37163 from ClickHouse/grouping-function Add GROUPING function	2022-05-30 20:45:04 +02:00
Robert Schulze	ad12adc31c	Measure and rework internal re2 caching
This commit is based on local benchmarks of ClickHouse's re2 caching.
Question 1: Is pattern caching useful for queries with const LIKE/REGEX patterns? E.g. SELECT LIKE(col_haystack, '%HelloWorld') FROM T;
The short answer is: no. Runtime is (unsurprisingly) dominated by pattern evaluation + other stuff going on in queries, but definitely not by pattern compilation. For space reasons, I omit details of the local experiments.
(Side note: the current caching scheme is unbounded in size, which poses a DoS risk (think of multi-tenancy). This risk is more pronounced when unbounded caching is used with non-const patterns, see the next question.)
Question 2: Is pattern caching useful for queries with non-const LIKE/REGEX patterns? E.g. SELECT LIKE(col_haystack, col_needle) FROM T;
I benchmarked five caching strategies:
1. no caching as a baseline (= recompile for each row)
2. unbounded cache (= threadsafe global hash-map)
3. LRU cache (= threadsafe global hash-map + LRU queue)
4. lightweight local cache 1 (= not threadsafe local hashmap with collision list which grows to a certain size (here: 10 elements) and afterwards never changes)
5. lightweight local cache 2 (= not threadsafe local hashmap without collision list in which a collision replaces the stored element, idea by Alexey)
... using a haystack of 2 million strings and
A) 2 million distinct simple patterns
B) 10 simple patterns
C) 2 million distinct complex patterns
D) 10 complex patterns
For A) and C), caching does not help, but these queries still allow judging the static overhead of caching on query runtimes. B) and D) are extreme but common cases in practice. They include queries like "SELECT ... WHERE LIKE (col_haystack, flag ? '%pattern1%' : '%pattern2%')". Caching should help significantly.
Because LIKE patterns are internally translated to re2 expressions, I show only measurements for MATCH queries.
Results in sec, averaged over multiple measurements:
1. A): 2.12 B): 1.68 C): 9.75 D): 9.45
2. A): 2.17 B): 1.73 C): 9.78 D): 9.47
3. A): 9.8 B): 0.63 C): 31.8 D): 0.98
4. A): 2.14 B): 0.29 C): 9.82 D): 0.41
5. A): 2.12 / 2.15 / 2.26 B): 1.51 / 0.43 / 0.30 C): 9.97 / 9.88 / 10.13 D): 5.70 / 0.42 / 0.43 (10/100/1000 buckets, resp. 10/1/0.1% collision rate)
Evaluation:
1. This is the baseline. I was surprised that complex patterns (C, D) slow down the queries so badly compared to simple patterns (A, B). The runtime includes evaluation costs, but as caching only helps with compilation, and looking at 4.D and 5.D, compilation makes up over 90% of the runtime!
2. No speedup compared to 1, probably due to locking overhead. The cache is unbounded, and in experiments with data sets > 2 million rows, 2. is the only scheme to throw OOM exceptions, which is not acceptable.
3. Unique patterns (A and C) lead to thrashing of the LRU cache and very bad runtimes due to LRU queue maintenance and locking. It works pretty well, however, with few distinct patterns (B and D).
4. This scheme is tailored to queries B and D, where it performs pretty well. More importantly, the caching is lightweight enough to not deteriorate performance on datasets A and C.
5. After some tuning of the hash map size, 100 buckets seem optimal to be in the same ballpark as 4. with 10 distinct patterns. Performance also does not deteriorate on A and C compared to the baseline. Unlike 4., this scheme behaves LRU-like and can adjust to changing pattern distributions.
As a conclusion, this commit implements two things:
1. Based on Q1, pattern search with a const needle no longer uses caching. This applies to LIKE and MATCH + a few (exotic) other SQL functions. The code for the unbounded caching was removed.
2. Based on Q2, pattern search with non-const needles now uses method 5.	2022-05-30 20:00:35 +02:00
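The caching scheme that commit ad12adc31c settles on (strategy 5: a local, non-threadsafe hash map with no collision list, where a colliding pattern simply replaces the stored one) can be illustrated with a minimal sketch. This is not the code from the commit: the names `LocalPatternCache`, `getOrCompile`, `compilationCount` and `countMatches` are hypothetical, and `std::regex` stands in for re2 so the example needs no external library. The default of 100 buckets mirrors the bucket count the commit message reports as working well.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration of "lightweight local cache 2".
// Not threadsafe by design: each query/thread would own its own instance.
class LocalPatternCache
{
public:
    explicit LocalPatternCache(std::size_t bucket_count = 100) : buckets(bucket_count)
    {
        assert(bucket_count > 0);
    }

    // Returns the compiled pattern, compiling only on a miss.
    // A shared_ptr is returned so that a later collision cannot
    // invalidate a regex the caller is still using.
    std::shared_ptr<const std::regex> getOrCompile(const std::string & pattern)
    {
        Bucket & bucket = buckets[std::hash<std::string>{}(pattern) % buckets.size()];
        if (!bucket.compiled || bucket.pattern != pattern)
        {
            // Empty bucket or collision: compile first, then overwrite,
            // so a throwing compilation leaves the bucket unchanged.
            auto compiled = std::make_shared<const std::regex>(pattern);
            bucket.pattern = pattern;
            bucket.compiled = std::move(compiled);
            ++compilations;
        }
        return bucket.compiled;
    }

    std::size_t compilationCount() const { return compilations; }

private:
    struct Bucket
    {
        std::string pattern;
        std::shared_ptr<const std::regex> compiled;
    };

    std::vector<Bucket> buckets;
    std::size_t compilations = 0;
};

// Row-wise match with a non-const needle, as in
// SELECT match(col_haystack, col_needle) FROM T.
std::size_t countMatches(
    const std::vector<std::string> & haystacks,
    const std::vector<std::string> & needles,
    LocalPatternCache & cache)
{
    std::size_t matches = 0;
    for (std::size_t row = 0; row < haystacks.size() && row < needles.size(); ++row)
        if (std::regex_search(haystacks[row], *cache.getOrCompile(needles[row])))
            ++matches;
    return matches;
}
```

With few distinct needles (cases B and D in the commit message), almost every row hits its bucket and compilation is skipped. With all-distinct needles (cases A and C), every row overwrites a bucket, so the only added cost over the no-cache baseline is one hash and one string comparison per row. Because a collision evicts the previous entry, the cache follows changes in the pattern distribution instead of freezing after it fills up.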
Anton Popov	52d3791eb9	Merge pull request #37600 from CurtizJ/fix-with-fill-interval Fix `WITH FILL` with negative intervals in `STEP` clause	2022-05-30 19:43:12 +02:00
alesapin	60b910a4de	Fix	2022-05-30 19:04:25 +02:00
alesapin	6db44f633f	Merge pull request #37641 from azat/keeper-list-watches keeper: store only unique session IDs for watches (should fix SIGKILL in stress tests)	2022-05-30 18:55:52 +02:00
mergify[bot]	d4e722bbfa	Merge branch 'master' into http-named-collection	2022-05-30 16:40:18 +00:00
Vitaly Baranov	486a11a5e2	Fix flaky test test_row_policy.	2022-05-30 18:34:28 +02:00
vdimir	8a3f4bda62	Fix columns number mismatch in cross join	2022-05-30 15:40:15 +00:00
Kseniia Sumarokova	b1ba7b7027	Fix build	2022-05-30 17:30:59 +02:00
Kseniia Sumarokova	1869adfd7d	Update FileCache.cpp	2022-05-30 16:06:58 +02:00
Amos Bird	217f492264	Fix test	2022-05-30 21:38:24 +08:00
Amos Bird	b68e8efaf0	Fix joinGet with cannot be null type	2022-05-30 21:01:27 +08:00
Alexander Tokmakov	351956d108	Merge pull request #37640 from azat/transaction-fix Fix excessive LIST requests to coordinator for transactions	2022-05-30 15:46:52 +03:00
mergify[bot]	87e896ae05	Merge branch 'master' into try_fix_tests	2022-05-30 12:44:35 +00:00
alesapin	84ed5aa6b0	No recursive in CI	2022-05-30 14:41:27 +02:00
Kruglov Pavel	0615866aea	Merge pull request #37450 from Avogar/check-format-on-storage-creation Check format name on storage creation	2022-05-30 14:23:20 +02:00
avogar	139a7e19a9	Fix comments	2022-05-30 11:43:29 +00:00
alesapin	87baabb1a8	Followup fix	2022-05-30 13:34:42 +02:00
alesapin	362fa745e6	Ignore broken metadata	2022-05-30 13:32:12 +02:00

90065 Commits