ClickHouse

mirror of https://github.com/ClickHouse/ClickHouse.git synced 2024-11-24 00:22:29 +00:00

Author	SHA1	Message	Date
Alexey Milovidov	b5f48a7d3f	Merge branch 'master' of github.com:ClickHouse/ClickHouse into llvm-14	2022-06-01 22:09:58 +02:00
Robert Schulze	366f368d06	Disallow LIKE patterns with trailing escape Trailing escape ('ab\') is disallowed in SQL, in standardese: "If an escape character is specified, then [...] If there is not a partitioning of the string PVC into substrings such that each substring has length 1 (one) or 2, no substring of length 1 (one) is the escape character ECV, and each substring of length 2 is the escape character ECV followed by either the escape character ECV, an <underscore> character, or the <percent> character, then an exception condition is raised: data exception - invalid escape sequence." I first thought this is checked already higher up in the stack, at least for const needles, as single trailing backslashes ('ab\') are rejected, but then I realized that ClickHouse quotes by default. I.e., double trailing backslashes ('ab\\') are not rejected but when interpreted as LIKE needle ('ab\') they should.	2022-06-01 21:38:46 +02:00
alesapin	a9af4a80b0	Add first simple implementation	2022-06-01 21:29:10 +02:00
lthaooo	6632616733	Fix TTL merge scheduling bug (#36387 )	2022-06-01 21:09:53 +02:00
Azat Khuzhin	545a56ce45	Fix sinks with onException() handler It is possible to call onException() even after onFinish(), in case of onFinish() throws, and in this case onException() should be no-op for such sinks. Also there can be caveats with PartitionedSync. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-06-01 21:50:30 +03:00
Azat Khuzhin	02af58f41d	Fix possible "Cannot write to finalized buffer" It is still possible to get this error since onException does not finalize format correctly. Here is an example of such error, that was found by CI [1]: <details> [ 2686 ] {fa01bf02-73f6-4f7f-b14f-e725de6d7f9b} <Fatal> : Logical error: 'Cannot write to finalized buffer'. [ 34577 ] {} <Fatal> BaseDaemon: ######################################## [ 34577 ] {} <Fatal> BaseDaemon: (version 22.6.1.1, build id: AB8040A6769E01A0) (from thread 2686) (query_id: fa01bf02-73f6-4f7f-b14f-e725de6d7f9b) (query: insert into test_02302 select number from numbers(10) settings s3_truncate_on_insert=1;) Received signal Aborted (6) [ 34577 ] {} <Fatal> BaseDaemon: [ 34577 ] {} <Fatal> BaseDaemon: Stack trace: 0x7fcbaa5a703b 0x7fcbaa586859 0xfad9bab 0xfad9e05 0xfaf6a3b 0x24a48c7f 0x258fb9b9 0x258f2004 0x258b88f4 0x258b863b 0x2581773d 0x258177ce 0x24bb5e98 0xfad01d6 0xfad0105 0x2419b11d 0xfad01d6 0xfad0105 0x2215afbb 0x2215aa48 0xfad01d6 0xfad0105 0xfcc265d 0x225cc546 0x249a1c40 0x249bc1b6 0x2685902c 0x26859505 0x269d7767 0x269d504c 0x7fcbaa75e609 0x7fcbaa683163 [ 34577 ] {} <Fatal> BaseDaemon: 3. raise @ 0x7fcbaa5a703b in ? [ 34577 ] {} <Fatal> BaseDaemon: 4. abort @ 0x7fcbaa586859 in ? [ 34577 ] {} <Fatal> BaseDaemon: 5. ./build_docker/../src/Common/Exception.cpp:47: DB::abortOnFailedAssertion(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) @ 0xfad9bab in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 6. ./build_docker/../src/Common/Exception.cpp:70: DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0xfad9e05 in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 7. ./build_docker/../src/IO/WriteBuffer.h:0: DB::WriteBuffer::write(char const, unsigned long) @ 0xfaf6a3b in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 8. ./build_docker/../src/Processors/Formats/Impl/ArrowBufferedStreams.cpp:47: DB::ArrowBufferedOutputStream::Write(void const, long) @ 0x24a48c7f in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 9. long parquet::ThriftSerializer::Serialize<parquet::format::FileMetaData>(parquet::format::FileMetaData const, arrow::io::OutputStream, std::__1::shared_ptr<parquet::Encryptor> const&) @ 0x258fb9b9 in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 10. parquet::FileMetaData::FileMetaDataImpl::WriteTo(arrow::io::OutputStream, std::__1::shared_ptr<parquet::Encryptor> const&) const @ 0x258f2004 in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 11. parquet::WriteFileMetaData(parquet::FileMetaData const&, arrow::io::OutputStream) @ 0x258b88f4 in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 12. parquet::ParquetFileWriter::~ParquetFileWriter() @ 0x258b863b in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 13. parquet::arrow::FileWriterImpl::~FileWriterImpl() @ 0x2581773d in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 14. parquet::arrow::FileWriterImpl::~FileWriterImpl() @ 0x258177ce in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 15. ./build_docker/../src/Processors/Formats/Impl/ParquetBlockOutputFormat.h:27: DB::ParquetBlockOutputFormat::~ParquetBlockOutputFormat() @ 0x24bb5e98 in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 16. ./build_docker/../contrib/libcxx/include/__memory/shared_ptr.h:173: std::__1::__shared_count::__release_shared() @ 0xfad01d6 in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 17. ./build_docker/../contrib/libcxx/include/__memory/shared_ptr.h:216: std::__1::__shared_weak_count::__release_shared() @ 0xfad0105 in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 18.1. inlined from ./build_docker/../contrib/libcxx/include/__memory/unique_ptr.h:312: std::__1::unique_ptr<DB::WriteBuffer, std::__1::default_delete<DB::WriteBuffer> >::reset(DB::WriteBuffer) [ 34577 ] {} <Fatal> BaseDaemon: 18.2. inlined from ../contrib/libcxx/include/__memory/unique_ptr.h:269: ~unique_ptr [ 34577 ] {} <Fatal> BaseDaemon: 18. ../src/Storages/StorageS3.cpp:566: DB::StorageS3Sink::~StorageS3Sink() @ 0x2419b11d in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 19. ./build_docker/../contrib/libcxx/include/__memory/shared_ptr.h:173: std::__1::__shared_count::__release_shared() @ 0xfad01d6 in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 20. ./build_docker/../contrib/libcxx/include/__memory/shared_ptr.h:216: std::__1::__shared_weak_count::__release_shared() @ 0xfad0105 in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 21. ./build_docker/../contrib/abseil-cpp/absl/container/internal/raw_hash_set.h:1662: absl::lts_20211102::container_internal::raw_hash_set<absl::lts_20211102::container_internal::FlatHashMapPolicy<StringRef, std::__1::shared_ptr<DB::SinkToStorage> >, absl::lts_20211102::hash_internal::Hash<StringRef>, std::__1::equal_to<StringRef>, std::__1::allocator<std::__1::pair<StringRef const, std::__1::shared_ptr<DB::SinkToStorage> > > >::destroy_slots() @ 0x2215afbb in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 22.1. inlined from ./build_docker/../contrib/libcxx/include/string:1445: std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::__is_long() const [ 34577 ] {} <Fatal> BaseDaemon: 22.2. inlined from ../contrib/libcxx/include/string:2231: ~basic_string [ 34577 ] {} <Fatal> BaseDaemon: 22. ../src/Storages/PartitionedSink.h:14: DB::PartitionedSink::~PartitionedSink() @ 0x2215aa48 in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 23. ./build_docker/../contrib/libcxx/include/__memory/shared_ptr.h:173: std::__1::__shared_count::__release_shared() @ 0xfad01d6 in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 24. ./build_docker/../contrib/libcxx/include/__memory/shared_ptr.h:216: std::__1::__shared_weak_count::__release_shared() @ 0xfad0105 in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 25. ./build_docker/../contrib/libcxx/include/vector:802: std::__1::vector<std::__1::shared_ptr<DB::IProcessor>, std::__1::allocator<std::__1::shared_ptr<DB::IProcessor> > >::__base_destruct_at_end(std::__1::shared_ptr<DB::IProcessor>) @ 0xfcc265d in /usr/bin/clickhouse [ 34577 ] {} <Fatal> BaseDaemon: 26.1. inlined from ./build_docker/../contrib/libcxx/include/vector:402: ~vector [ 34577 ] {} <Fatal> BaseDaemon: 26.2. inlined from ../src/QueryPipeline/QueryPipeline.cpp:29: ~QueryPipeline [ 34577 ] {} <Fatal> BaseDaemon: 26. ../src/QueryPipeline/QueryPipeline.cpp:535: DB::QueryPipeline::reset() @ 0x225cc546 in /usr/bin/clickhouse [ 614 ] {} <Fatal> Application: Child process was terminated by signal 6. </details> [1]: https://s3.amazonaws.com/clickhouse-test-reports/37542/8a224239c1d922158b4dc9f5d6609dca836dfd06/stress_test__undefined__actions_.html Follow-up for: #36979 Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-06-01 21:50:30 +03:00
Azat Khuzhin	62d78d8f20	Fix WriteBufferFromS3 is_finalized check in case of exception WriteBufferFromS3::is_finalized is not set if finalizeImpl() throws, while WriteBuffer::finalized correctly set even in case of exception, so it should be used instead. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-06-01 21:50:30 +03:00
Alexander Tokmakov	06f80770b8	fix stuck REPALCE_RANGE	2022-06-01 20:11:53 +02:00
Robert Schulze	b3b0716b32	Merge pull request #37544 from ClickHouse/cached_patterns Cache compiled regexps when evaluating non-const needles	2022-06-01 19:55:25 +02:00
Nikolai Kochetov	cc0d5a0daa	Fix test again.	2022-06-01 17:39:12 +00:00
Nikolai Kochetov	9b131f2d2d	Fix tewst again.	2022-06-01 16:56:26 +00:00
avogar	4abfd54dd6	Fix possible segfault in schema inference	2022-06-01 16:53:37 +00:00
Alexey Milovidov	89638de521	Merge pull request #37738 from ClickHouse/fix-intersect-with-const Fix `Intersect` with constant strings	2022-06-01 19:31:55 +03:00
Yakov Olkhovskiy	e23cec01d5	Merge pull request #37581 from ClickHouse/http-named-collection Support for HTTP source for Data Dictionaries in Named Collections	2022-06-01 11:55:04 -04:00
Anton Popov	1ef48c3a4a	turn on setting output_format_json_named_tuples_as_objects by default	2022-06-01 15:42:12 +00:00
avogar	7ef02a2e44	Fix possible logical error in values table function	2022-06-01 15:32:33 +00:00
Nikolai Kochetov	6e924cdc77	Fix some more tests.	2022-06-01 15:21:47 +00:00
Dmitry Novik	7fbe91ca81	Merge pull request #37460 from ClickHouse/memory-overcommit-improvement Memory Overcommit: update defaults, exception message and add ProfileEvent	2022-06-01 17:06:33 +02:00
Sema Checherinda	16dc3ed97d	FR: Expose what triggered the merge in system.part_log #26255	2022-06-01 16:58:07 +02:00
Sema Checherinda	2626a49616	FR: Expose what triggered the merge in system.part_log #26255	2022-06-01 16:58:06 +02:00
Alexander Tokmakov	3d346c766a	better code	2022-06-01 16:49:26 +02:00
HeenaBansal2009	10990402ac	Removed move ctor from EventBase hierarchy	2022-06-01 06:01:22 -07:00
Kseniia Sumarokova	7afcfcbaaf	Merge pull request #37691 from kssenii/fix-rabbitmq-restart-with-no-settings Fix rabbitmq restart with empty settings	2022-06-01 14:59:34 +02:00
flynn	b62e4cec65	Fix crash of FunctionHashID	2022-06-01 12:39:16 +00:00
Alexander Tokmakov	75f49a48e1	Merge branch 'master' into fix_trash	2022-06-01 14:20:46 +02:00
Anton Popov	011131198c	Merge remote-tracking branch 'upstream/master' into fix-mutations-again	2022-06-01 12:04:18 +00:00
Nikolai Kochetov	e401ab8169	Fix more tests.	2022-06-01 11:51:56 +00:00
Antonio Andelic	08c20be4d0	Cleaner exception handling in ParallelReadBuffer	2022-06-01 11:51:01 +00:00
Robert Schulze	ee302f2d9f	Merge pull request #37643 from amosbird/avoid-useless-context-copy Avoid useless context copy when building query interpreters	2022-06-01 13:49:56 +02:00
Antonio Andelic	f49dd19e7a	Revert "Initialize ParallelReadBuffer after construction" This reverts commit `31e1e67836`.	2022-06-01 11:43:58 +00:00
Kruglov Pavel	251be860e7	Merge pull request #37428 from loyd/fix/37420-rowbinary-bom Stop removing UTF-8 BOM in RowBinary format	2022-06-01 13:36:55 +02:00
Vladimir C	8c0dba7302	Merge pull request #37650 from amosbird/joinget-fix Fix joinGet with join_use_nulls = 1 and Array type	2022-06-01 13:30:29 +02:00
Vladimir C	c466cdebf4	Merge pull request #37530 from vdimir/join_cond_dict_issue_37386	2022-06-01 13:29:01 +02:00
Antonio Andelic	ded1398565	Fix intersect with const string	2022-06-01 11:13:33 +00:00
Anton Popov	cae2478b3f	Revert "Merge pull request #37355 from ClickHouse/revert-37266-fix-mutations-with-object" This reverts commit `a53cfa9fca`, reversing changes made to `9acb42fcdb`.	2022-06-01 10:57:20 +00:00
Robert Schulze	600512cc08	Replace exceptions thrown for programming errors by asserts	2022-06-01 11:53:37 +02:00
Alexey Milovidov	31b3350749	Merge pull request #37710 from ClickHouse/fix-grouping-function Make GROUPING function skip constant folding	2022-06-01 12:00:14 +03:00
Alexey Milovidov	a0020cb55c	Merge pull request #37724 from CurtizJ/fix-ast-optimizations-remote Fix `optimize_monotonous_functions_in_order_by` in distributed queries	2022-06-01 11:54:45 +03:00
Han Fei	ea693dd0c2	add config and change test logic	2022-06-01 14:57:07 +08:00
Antonio Andelic	31e1e67836	Initialize ParallelReadBuffer after construction	2022-06-01 06:25:32 +00:00
Paul Loyd	32d267ec6c	Stop removing UTF-8 BOM in RowBinary* formats Fixes #37420	2022-06-01 13:12:55 +08:00
Anton Popov	6cf9405f09	fix optimize_monotonous_functions_in_order_by in distributed queries	2022-06-01 00:50:28 +00:00
Anton Popov	20e319d67a	Merge pull request #37666 from CurtizJ/optimize-coalesce Optimize function `COALESCE` with two arguments	2022-05-31 23:48:13 +02:00
Nikolai Kochetov	04c14e9c5d	Fix tests and add comment.	2022-05-31 20:59:50 +00:00
Alexander Gololobov	26609a1875	Style fixes	2022-05-31 21:41:10 +02:00
Nikolai Kochetov	9954c59dc1	Update test.	2022-05-31 19:40:50 +00:00
Nikolai Kochetov	86fbb74703	Merge branch 'master' into refactor-read-metrics-and-callbacks	2022-05-31 18:07:47 +00:00
Nikolai Kochetov	32010f0ba8	Add a test.	2022-05-31 17:56:48 +00:00
vdimir	284c9bc68b	Rollback some changes from appendFromBlock	2022-05-31 17:40:35 +00:00
mergify[bot]	d57d987a02	Merge branch 'master' into sql-user-defined-functions-readonly-fix	2022-05-31 16:50:00 +00:00
Maksim Kita	66f43b9ad3	Fix executable user default functions execution with Nullable arguments	2022-05-31 18:46:33 +02:00
Dmitry Novik	b11749ca2c	Make GROUPING function skip constant folding	2022-05-31 16:45:29 +00:00
Anton Kozlov	3576625647	CLICKHOUSE-2131 Add an option to disable connection pooling in ODBC bridge	2022-05-31 16:26:08 +00:00
vdimir	e7be677fca	Assert structure match up to locard in appendFromBlock	2022-05-31 16:02:58 +00:00
vdimir	7f4ddb1667	Fix assert for 02244_lowcardinality_hash_join	2022-05-31 16:02:57 +00:00
vdimir	2476c6a988	Fix error on joining with dictionary on some conditions	2022-05-31 16:02:57 +00:00
Vladimir C	2a38fdb796	Merge pull request #37653 from vdimir/cross_join_dup_col_names	2022-05-31 17:50:19 +02:00
Maksim Kita	d1a4550b4f	Fix create or drop of sql user defined functions in readonly mode	2022-05-31 17:23:41 +02:00
Alexey Milovidov	4bb04f913f	Fix clang-tidy-14	2022-05-31 17:20:07 +02:00
avogar	858570d335	Refactor docs related to format settings	2022-05-31 15:18:49 +00:00
Han Fei	5693e6212d	add config and fix style check	2022-05-31 23:18:05 +08:00
HeenaBansal2009	2584bbb7f1	Fix Style check	2022-05-31 07:49:54 -07:00
Maksim Kita	bacee7f19c	Merge pull request #37195 from kitaisreal/merging-sorted-algorithm-single-column-specialization MergingSortedAlgorithm single column specialization	2022-05-31 16:46:18 +02:00
Anton Popov	00f87b0f57	replace multiIf to if in case of one condition	2022-05-31 14:45:12 +00:00
Nikolai Kochetov	147a819221	Refactor a little bit more.	2022-05-31 14:43:38 +00:00
Dmitry Novik	b41fe00f31	Merge pull request #37542 from azat/grouping-sets-fix-optimize_aggregation_in_order Prohibit optimize_aggregation_in_order with GROUPING SETS (fixes LOGICAL_ERROR)	2022-05-31 15:31:45 +02:00
Dmitry Novik	f58623a375	Merge pull request #37593 from azat/union-type-cast-resubmit Fix converting types for UNION queries (may produce LOGICAL_ERROR)	2022-05-31 15:27:50 +02:00
mergify[bot]	ba49c6bb46	Merge branch 'master' into memory-overcommit-improvement	2022-05-31 13:17:06 +00:00
alesapin	473b0bd0db	Merge pull request #37604 from ClickHouse/turn_on_s3_tests Turn on s3 tests to red mode	2022-05-31 15:01:24 +02:00
kssenii	c2087b3145	Fix	2022-05-31 14:38:11 +02:00
Kruglov Pavel	7cc87d9a65	Merge pull request #37537 from Avogar/skip-first-lines Allow to skip some of the first lines in CSV/TSV formats	2022-05-31 14:26:21 +02:00
kssenii	69cd3a2b10	Fix	2022-05-31 14:20:31 +02:00
mergify[bot]	d85c3ec69e	Merge branch 'master' into turn_on_s3_tests	2022-05-31 11:58:16 +00:00
mergify[bot]	1e08046c47	Merge branch 'master' into cleanup_unused	2022-05-31 10:28:19 +00:00
taiyang-li	047387bf1c	fix 2 bugs: 1. select count(1) from hive_table; 2. select _file, _path from hive_table	2022-05-31 17:39:02 +08:00
mergify[bot]	f90dddccba	Merge branch 'master' into fix-temp-table-drop	2022-05-31 09:10:00 +00:00
Kseniia Sumarokova	73ed9c3977	Merge pull request #37619 from Vxider/wv-fix-table-identifier Fix bugs in WindowView when using table identifier	2022-05-31 11:07:11 +02:00
zvonand	869486cc3b	fix segfault(2)	2022-05-31 11:40:49 +03:00
Yakov Olkhovskiy	873ac9f8ff	Merge pull request #37540 from ClickHouse/feature-server-certificate showCertificate function implementation	2022-05-31 02:50:03 -04:00
xlwh	ba4cdd43bd	Cleanup unused file	2022-05-31 14:37:30 +08:00
zhanglistar	53020b096d	Merge branch 'ClickHouse:master' into typo	2022-05-31 11:28:12 +08:00
HeenaBansal2009	3976afa56a	Fix build failures	2022-05-30 20:06:27 -07:00
yaqi-zhao	a2857491c4	add avx512 support for mergetreereader	2022-05-30 20:53:00 -04:00
Yakov Olkhovskiy	c6b20cd5ed	Merge pull request #37187 from Algunenano/floating_seconds Allow decimal values in settings using seconds	2022-05-30 20:33:47 -04:00
zvonand	8aebaa7194	fix segfault	2022-05-31 01:33:44 +03:00
Anton Popov	30f8eb800a	optimize function coalesce with two arguments	2022-05-30 22:29:35 +00:00
Dmitry Novik	9d04305a5a	Update Settings.h	2022-05-30 23:00:28 +02:00
mergify[bot]	55913cf8e1	Merge branch 'master' into turn_on_s3_tests	2022-05-30 20:52:40 +00:00
Nikolai Kochetov	df0d580a8c	Fix another one test.	2022-05-30 19:29:57 +00:00
Kseniia Sumarokova	18bda56e4c	Merge pull request #37655 from ClickHouse/kssenii-patch-3-1 Fix hung check	2022-05-30 21:22:12 +02:00
mergify[bot]	b43cfd056f	Merge branch 'master' into floating_seconds	2022-05-30 19:18:35 +00:00
Nikolai Kochetov	77b07dd0a8	Merge pull request #37163 from ClickHouse/grouping-function Add GROUPING function	2022-05-30 20:45:04 +02:00
Nikolai Kochetov	913e7a91ae	Fix limits from subquery.	2022-05-30 18:25:17 +00:00
HeenaBansal2009	b7eb6bbd38	Fixed clang-tidy-CheckTriviallyCopyableMove-errors	2022-05-30 11:09:03 -07:00
Robert Schulze	ad12adc31c	Measure and rework internal re2 caching This commit is based on local benchmarks of ClickHouse's re2 caching. Question 1: ----------------------------------------------------------- Is pattern caching useful for queries with const LIKE/REGEX patterns? E.g. SELECT LIKE(col_haystack, '%HelloWorld') FROM T; The short answer is: no. Runtime is (unsurprisingly) dominated by pattern evaluation + other stuff going on in queries, but definitely not pattern compilation. For space reasons, I omit details of the local experiments. (Side note: the current caching scheme is unbounded in size which poses a DoS risk (think of multi-tenancy). This risk is more pronounced when unbounded caching is used with non-const patterns ..., see next question) Question 2: ----------------------------------------------------------- Is pattern caching useful for queries with non-const LIKE/REGEX patterns? E.g. SELECT LIKE(col_haystack, col_needle) FROM T; I benchmarked five caching strategies: 1. no caching as a baseline (= recompile for each row) 2. unbounded cache (= threadsafe global hash-map) 3. LRU cache (= threadsafe global hash-map + LRU queue) 4. lightweight local cache 1 (= not threadsafe local hashmap with collision list which grows to a certain size (here: 10 elements) and afterwards never changes) 5. lightweight local cache 2 (not threadsafe local hashmap without collision list in which a collision replaces the stored element, idea by Alexey) ... using a haystack of 2 mio strings and A). 2 mio distinct simple patterns B). 10 simple patterns C) 2 mio distinct complex patterns D) 10 complex patterns Fo A) and C), caching does not help but these queries still allow to judge the static overhead of caching on query runtimes. B) and D) are extreme but common cases in practice. They include queries like "SELECT ... WHERE LIKE (col_haystack, flag ? '%pattern1%' : '%pattern2%'). Caching should help significantly. Because LIKE patterns are internally translated to re2 expressions, I show only measurements for MATCH queries. Results in sec, averaged over on multiple measurements; 1.A): 2.12 B): 1.68 C): 9.75 D): 9.45 2.A): 2.17 B): 1.73 C): 9.78 D): 9.47 3.A): 9.8 B): 0.63 C): 31.8 D): 0.98 4.A): 2.14 B): 0.29 C): 9.82 D): 0.41 5.A) 2.12 / 2.15 / 2.26 B) 1.51 / 0.43 / 0.30 C) 9.97 / 9.88 / 10.13 D) 5.70 / 0.42 / 0.43 (10/100/1000 buckets, resp. 10/1/0.1% collision rate) Evaluation: 1. This is the baseline. It was surprised that complex patterns (C, D) slow down the queries so badly compared to simple patterns (A, B). The runtime includes evaluation costs, but as caching only helps with compilation, and looking at 4.D and 5.D, compilation makes up over 90% of the runtime! 2. No speedup compared to 1, probably due to locking overhead. The cache is unbounded, and in experiments with data sets > 2 mio rows, 2. is the only scheme to throw OOM exceptions which is not acceptable. 3. Unique patterns (A and C) lead to thrashing of the LRU cache and very bad runtimes due to LRU queue maintenance and locking. Works pretty well however with few distinct patterns (B and D). 4. This scheme is tailored to queries B and D where it performs pretty good. More importantly, the caching is lightweight enough to not deteriorate performance on datasets A and C. 5. After some tuning of the hash map size, 100 buckets seem optimal to be in the same ballpark with 10 distinct patterns as 4. Performance also does not deteriorate on A and C compared to the baseline. Unlike 4., this scheme behaves LRU-like and can adjust to changing pattern distributions. As a conclusion, this commit implementes two things: 1. Based on Q1, pattern search with const needle no longer uses caching. This applies to LIKE and MATCH + a few (exotic) other SQL functions. The code for the unbounded caching was removed. 2. Based on Q2, pattern search with non-const needles now use method 5.	2022-05-30 20:00:35 +02:00
Alexander Gololobov	e2dd6f6249	Removed prewhere_info.alias_actions	2022-05-30 19:58:23 +02:00
Han Fei	e15cdec39c	address comments	2022-05-31 01:46:31 +08:00
Anton Popov	52d3791eb9	Merge pull request #37600 from CurtizJ/fix-with-fill-interval Fix `WITH FILL` with negative intervals in `STEP` clause	2022-05-30 19:43:12 +02:00
alesapin	6db44f633f	Merge pull request #37641 from azat/keeper-list-watches keeper: store only unique session IDs for watches (should fix SIGKILL in stress tests)	2022-05-30 18:55:52 +02:00
Fred Wulff	537050ef8e	Add support for DeleteObject when chunk_size == 0	2022-05-30 16:54:44 +00:00
mergify[bot]	d4e722bbfa	Merge branch 'master' into http-named-collection	2022-05-30 16:40:18 +00:00
Han Fei	af86900c52	Merge branch 'hanfei/zk-write' of github.com:hanfei1991/ClickHouse into hanfei/zk-write	2022-05-31 00:17:38 +08:00
Han Fei	42fca8d5f0	address comments	2022-05-31 00:17:32 +08:00
Han Fei	a464b10afe	Update src/Storages/System/StorageSystemZooKeeper.cpp Co-authored-by: Alexander Tokmakov <tavplubix@gmail.com>	2022-05-30 23:52:37 +08:00
Han Fei	194445646a	Update src/Storages/System/StorageSystemZooKeeper.cpp Co-authored-by: Alexander Tokmakov <tavplubix@gmail.com>	2022-05-30 23:47:43 +08:00
vdimir	8a3f4bda62	Fix columns number mismatch in cross join	2022-05-30 15:40:15 +00:00
Kseniia Sumarokova	b1ba7b7027	Fix build	2022-05-30 17:30:59 +02:00
Kseniia Sumarokova	1869adfd7d	Update FileCache.cpp	2022-05-30 16:06:58 +02:00
Nikolai Kochetov	c71256ea38	Remove some commented code.	2022-05-30 13:18:20 +00:00
Nikolai Kochetov	5ef51ed27b	Fix more tests.	2022-05-30 13:10:30 +00:00
Amos Bird	b68e8efaf0	Fix joinGet with cannot be null type	2022-05-30 21:01:27 +08:00
Alexander Tokmakov	351956d108	Merge pull request #37640 from azat/transaction-fix Fix excessive LIST requests to coordinator for transactions	2022-05-30 15:46:52 +03:00
Kruglov Pavel	0615866aea	Merge pull request #37450 from Avogar/check-format-on-storage-creation Check format name on storage creation	2022-05-30 14:23:20 +02:00
avogar	139a7e19a9	Fix comments	2022-05-30 11:43:29 +00:00
alesapin	87baabb1a8	Followup fix	2022-05-30 13:34:42 +02:00
alesapin	362fa745e6	Ignore broken metadata	2022-05-30 13:32:12 +02:00
Nikolai Kochetov	5b4658aa5e	Merge branch 'master' into refactor-read-metrics-and-callbacks	2022-05-30 09:47:35 +00:00
Amos Bird	6525bfc4cd	Avoid context copy for InterpreterSelects	2022-05-30 17:08:12 +08:00
Azat Khuzhin	d86181d3cd	keeper: store only unique session IDs for watches This should speed up keeper, especially in case of incorrect usage (like the case that had been fixed in #37640), especially in case on non release build. And also this should fix SIGKILL in stress tests. You will find some details for one of such SIGKILL in `<details>` tag [1]: <details> $ pigz -cd clickhouse-server.stress.log.gz \| tail 2022.05.27 16:17:24.882971 [ 637 ] {} <Trace> BackgroundSchedulePool/BgSchPool: Waiting for threads to finish. 2022.05.27 16:17:24.896749 [ 637 ] {} <Debug> MemoryTracker: Peak memory usage (for query): 4.09 MiB. 2022.05.27 16:17:24.907163 [ 637 ] {} <Debug> Application: Shut down storages. 2022.05.27 16:17:24.907233 [ 637 ] {} <Debug> Application: Waiting for current connections to servers for tables to finish. 2022.05.27 16:17:24.934335 [ 637 ] {} <Information> Application: Closed all listening sockets. Waiting for 1 outstanding connections. 2022.05.27 16:17:29.843491 [ 637 ] {} <Information> Application: Closed connections to servers for tables. But 1 remain. Probably some tables of other users cannot finish their connections after context shutdown. 2022.05.27 16:17:29.843632 [ 637 ] {} <Debug> KeeperDispatcher: Shutting down storage dispatcher 2022.05.27 16:17:34.612616 [ 688 ] {} <Test> virtual Coordination::ZooKeeperRequest::~ZooKeeperRequest(): Processing of request xid=2147483647 took 10000 ms 2022.05.27 16:17:54.612109 [ 3176 ] {} <Debug> KeeperTCPHandler: Session #12 expired 2022.05.27 16:19:59.823038 [ 635 ] {} <Fatal> Application: Child process was terminated by signal 9 (KILL). If it is not done by 'forcestop' command or manually, the possible cause is OOM Killer (see 'dmesg' and look at the '/var/log/kern.log' for the details). Thread 26 (Thread 0x7f1c7703f700 (LWP 708)): 0 0x000000000b074b2a in __tsan::MemoryAccessImpl(__tsan::ThreadState, unsigned long, int, bool, bool, unsigned long long, __tsan::Shadow) () 1 0x000000000b08630c in __tsan::MemoryAccessRange(__tsan::ThreadState, unsigned long, unsigned long, unsigned long, bool) () 2 0x000000000b01ff03 in memmove () 3 0x000000001bbc8996 in std::__1::__move<long, long> (__first=0xb8600000d83304, __last=<optimized out>, __result=0x7f1c021cd000) at ../contrib/libcxx/include/__algorithm/move.h:57 4 std::__1::move<long, long> (__first=0xb8600000d83304, __last=<optimized out>, __result=0x7f1c021cd000) at ../contrib/libcxx/include/__algorithm/move.h:70 5 std::__1::vector<long, std::__1::allocator<long> >::erase (this=0x7b1400584c48, __position=...) at ../contrib/libcxx/include/vector:1608 6 DB::KeeperStorage::clearDeadWatches (this=0x7b5800001ad8, this@entry=0x7b5800001800, session_id=session_id@entry=12) at ../src/Coordination/KeeperStorage.cpp:1228 7 0x000000001bbc5c55 in DB::KeeperStorage::processRequest (this=0x7b5800001800, zk_request=..., session_id=12, time=1, new_last_zxid=..., check_acl=true) at ../src/Coordination/KeeperStorage.cpp:1122 8 0x000000001bba06a3 in DB::KeeperStateMachine::commit (this=<optimized out>, log_idx=3549, data=...) at ../src/Coordination/KeeperStateMachine.cpp:143 9 0x000000001bba6193 in nuraft::state_machine::commit_ext (this=0x7b4c00001f98, params=...) at ../contrib/NuRaft/include/libnuraft/state_machine.hxx:75 10 0x00000000202c5a55 in nuraft::raft_server::commit_app_log (this=this@entry=0x7b6c00002a18, idx_to_commit=idx_to_commit@entry=3549, le=..., need_to_handle_commit_elem=true, initial_commit_exec=false) at ../contrib/NuRaft/src/handle_commit.cxx:311 11 0x00000000202c4f98 in nuraft::raft_server::commit_in_bg_exec (this=<optimized out>, this@entry=0x7b6c00002a18, timeout_ms=timeout_ms@entry=0, initial_commit_exec=false) at ../contrib/NuRaft/src/handle_commit.cxx:241 12 0x00000000202c4613 in nuraft::raft_server::commit_in_bg (this=this@entry=0x7b6c00002a18) at ../contrib/NuRaft/src/handle_commit.cxx:149 ... Thread 28 (Thread 0x7f1c7603d700 (LWP 710)): 0 0x00007f1d22a6d110 in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0 1 0x00007f1d22a650a3 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0 2 0x000000000b0337b0 in pthread_mutex_lock () 3 0x00000000221884da in std::__1::__libcpp_mutex_lock (__m=0x7b4c00002088) at ../contrib/libcxx/include/__threading_support:303 4 std::__1::mutex::lock (this=0x7b4c00002088) at ../contrib/libcxx/src/mutex.cpp:33 5 0x000000001bba4188 in std::__1::lock_guard<std::__1::mutex>::lock_guard (__m=..., this=<optimized out>) at ../contrib/libcxx/include/__mutex_base:91 6 DB::KeeperStateMachine::getDeadSessions (this=0x7b4c00001f98) at ../src/Coordination/KeeperStateMachine.cpp:360 7 0x000000001bb79b4b in DB::KeeperServer::getDeadSessions (this=0x7b4400012700) at ../src/Coordination/KeeperServer.cpp:572 8 0x000000001bb64d1a in DB::KeeperDispatcher::sessionCleanerTask (this=<optimized out>, this@entry=0x7b640001c218) at ../src/Coordination/KeeperDispatcher.cpp:399 ... Thread 1 (Thread 0x7f1d227148c0 (LWP 637)): 0 0x00007f1d22a69376 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0 1 0x000000000b0895e0 in __tsan::call_pthread_cancel_with_cleanup(int ()(void), void ()(void), void) () 2 0x000000000b017091 in pthread_cond_wait () 3 0x0000000020569d98 in Poco::EventImpl::waitImpl (this=0x7b2000008798) at ../contrib/poco/Foundation/src/Event_POSIX.cpp:106 4 0x000000001bb636cf in Poco::Event::wait (this=0x7b2000008798) at ../contrib/poco/Foundation/include/Poco/Event.h:97 5 ThreadFromGlobalPool::join (this=<optimized out>) at ../src/Common/ThreadPool.h:217 6 DB::KeeperDispatcher::shutdown (this=0x7b640001c218) at ../src/Coordination/KeeperDispatcher.cpp:322 7 0x0000000019ca8bfc in DB::Context::shutdownKeeperDispatcher (this=<optimized out>) at ../src/Interpreters/Context.cpp:2111 8 0x000000000b0a979b in DB::Server::main(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&)::$_9::operator()() const (this=0x7ffcde44f0a0) at ../programs/server/Server.cpp:1407 </details> [1]: https://s3.amazonaws.com/clickhouse-test-reports/37593/2613149f6bf4f242bbbf2c3c8539b5176fd77286/stress_test__thread__actions_.html Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-05-30 11:50:48 +03:00
Azat Khuzhin	78bd47d8df	Fix excessive LIST requests to coordinator for transactions In [1] there was only few transactions, but lots of List for /test/clickhouse/txn/log: $ clickhouse-local --format TSVWithNamesAndTypes --file zookeeper_log.tsv.gz -q "select * except('path\|session_id\|event_time\|thread_id\|event_date\|xid') apply(x->groupUniqArray(x)), path, session_id, min(event_time), max(event_time), count() from table where has_watch and type = 'Request' group by path, session_id order by count() desc limit 1 format Vertical" Row 1: ────── groupUniqArray(type): ['Request'] groupUniqArray(query_id): ['','62d75128-9031-48a5-87ba-aec3f0b591c6'] groupUniqArray(address): ['::1'] groupUniqArray(port): [9181] groupUniqArray(has_watch): [1] groupUniqArray(op_num): ['List'] groupUniqArray(data): [''] groupUniqArray(is_ephemeral): [0] groupUniqArray(is_sequential): [0] groupUniqArray(version): [] groupUniqArray(requests_size): [0] groupUniqArray(request_idx): [0] groupUniqArray(error): [] groupUniqArray(watch_type): [] groupUniqArray(watch_state): [] groupUniqArray(stat_version): [0] groupUniqArray(stat_cversion): [0] groupUniqArray(stat_dataLength): [0] groupUniqArray(stat_numChildren): [0] groupUniqArray(children): [[]] path: /test/clickhouse/txn/log session_id: 1 min(event_time): 2022-05-27 12:54:09.025897 max(event_time): 2022-05-27 13:37:12.846314 <!-- last transaction was at 12:54, see server log count(): 3673675 <-- huge [1]: https://s3.amazonaws.com/clickhouse-test-reports/37593/2613149f6bf4f242bbbf2c3c8539b5176fd77286/stateless_tests__debug__actions__[1/3].html Server log: $ pigz -cd clickhouse-server.log.gz \| fgrep TransactionLog: \| head -1 2022.05.27 12:54:09.026852 [ 5018 ] {62d75128-9031-48a5-87ba-aec3f0b591c6} <Trace> TransactionLog: Loading 33 entries from /test/clickhouse/txn/log: csn-0000000000..csn-0000000032 $ pigz -cd clickhouse-server.log.gz \| fgrep TransactionLog: \| tail -1 2022.05.27 12:54:58.909222 [ 509 ] {} <Test> TransactionLog: Closing readonly transaction (177, 38, 41b51ff1-bcba-43bf-bcea-e97ad05f6040) $ pigz -cd clickhouse-server.log.gz \| fgrep 62d75128-9031-48a5-87ba-aec3f0b591c6 \| tail -1 2022.05.27 12:54:09.064857 [ 5018 ] {62d75128-9031-48a5-87ba-aec3f0b591c6} <Debug> MemoryTracker: Peak memory usage (for query): 0.00 B. Fixes: #37398 (cc @tavplubix) Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-05-30 11:39:34 +03:00
flynn	9ffa5f5e0d	fix typo	2022-05-30 08:10:16 +00:00
Alexey Milovidov	1d6f9de001	Merge branch 'llvm-14' of github.com:ClickHouse/ClickHouse into llvm-14	2022-05-30 05:36:43 +02:00
Alexey Milovidov	f1fb57c6ce	Fix clang-tidy-14	2022-05-30 05:36:26 +02:00
Alexey Milovidov	9a5dd75a68	Merge branch 'master' into llvm-14	2022-05-30 05:34:38 +02:00
Alexey Milovidov	c0e6ff4216	More precise result of "dumpColumnStructure" and "byteSize" miscellaneous functions	2022-05-30 04:56:54 +02:00
taiyang-li	dbb8a09825	merge master and solve conflict	2022-05-30 10:47:04 +08:00
taiyang-li	0b0d38b18c	Merge branch 'master' into fix_settings	2022-05-30 10:07:09 +08:00
taiyang-li	51a893c8be	add some metrics	2022-05-30 10:05:20 +08:00
Andrey Zvonov	55a9b99cb4	style fix(2)	2022-05-30 02:48:28 +03:00
Andrey Zvonov	c79d87e629	fix style	2022-05-30 02:26:02 +03:00
alesapin	6f35c28592	Fix style	2022-05-30 00:29:30 +02:00
alesapin	c32b6076fb	Remove stranges from code	2022-05-30 00:12:33 +02:00
Alexey Milovidov	5fda199dcf	Update src/Common/ErrorCodes.cpp Co-authored-by: Alexander Tokmakov <tavplubix@clickhouse.com>	2022-05-30 00:20:49 +03:00
alesapin	0e8ab36913	Merge branch 'master' into turn_on_s3_tests	2022-05-29 14:37:10 +02:00
alesapin	d2cdbf3956	Fix refactoring issue	2022-05-29 14:09:49 +02:00
alesapin	9dc81e5cc8	Merge pull request #37598 from ClickHouse/revert-37545-revert-37424-fix_fetching_part_deadlock Revert "Revert "(only with zero-copy replication, non-production experimental feature not recommended to use) fix possible deadlock during fetching part""	2022-05-29 13:54:34 +02:00
Andrey Zvonov	2dbbf14de5	Merge branch 'master' into non-neg-deriv	2022-05-29 10:09:51 +03:00
Alexander Tokmakov	579b0e3323	Merge pull request #37627 from ClickHouse/revert-37416-fix_ReplicatedMergeTree_comments Revert "Implemented changing comment to a ReplicatedMergeTree table"	2022-05-29 09:57:12 +03:00
Alexey Milovidov	9e3242f186	Merge pull request #37617 from CurtizJ/aggregation-sparse-columns Better performance with sparse columns in aggregate functions	2022-05-29 09:36:07 +03:00
Alexander Tokmakov	562eec591e	Revert "Implemented changing comment to a ReplicatedMergeTree table"	2022-05-29 09:28:47 +03:00
Alexey Milovidov	97606c324c	Merge pull request #37574 from azat/mt-tiny-refactor Remove unused MergeTreeDataMergerMutator::chooseMergeAlgorithm()	2022-05-29 07:59:57 +03:00
zvonand	18e74eb540	remove old files	2022-05-29 03:48:59 +03:00
zvonand	295a0f9ec2	added tests	2022-05-29 03:38:42 +03:00
Alexey Milovidov	c1169019d2	Merge branch 'master' into llvm-14	2022-05-29 02:29:02 +02:00
Alexey Milovidov	11788c8129	Fix clang-tidy-14	2022-05-29 02:28:46 +02:00
zvonand	032e54abbf	works now	2022-05-29 03:21:07 +03:00
Alexey Milovidov	73e2e63414	Merge pull request #37612 from ClickHouse/clang-tidy-14 Fix clang-tidy-14, part 1	2022-05-29 03:16:32 +03:00
Alexey Milovidov	4e60c88a27	Merge pull request #37609 from DevTeamBK/Medium-Clang-Tidy-Fix Fix Clang-Tidy: remove std::move() from trivially-copyable object	2022-05-28 21:27:08 +03:00
Alexander Tokmakov	4e52f45695	Merge branch 'master' into fix_trash	2022-05-28 19:43:19 +02:00
Anton Popov	c39d95e2e6	add perf test	2022-05-28 12:56:38 +00:00
Anton Popov	1d9b3be7da	Merge pull request #37536 from CurtizJ/profile-events-for-part-types Add profile events for introspection of part types	2022-05-28 14:25:21 +02:00
Kseniia Sumarokova	8be351717f	Merge pull request #37606 from ClickHouse/kssenii-patch-3 Update FileCache.cpp	2022-05-28 12:48:45 +02:00
Han Fei	340a264a62	fix style	2022-05-28 18:26:14 +08:00
Han Fei	0a0d77bdef	fix build	2022-05-28 17:57:59 +08:00
Han Fei	0f71231574	try fix flaky tests and refine code style	2022-05-28 17:25:33 +08:00
Vxider	b24346328d	fix parser when using table identifer	2022-05-28 08:22:34 +00:00
Anton Popov	b2cff26ecf	better performace with sparse columns in aggregate functions	2022-05-28 02:22:20 +00:00
Alexey Milovidov	eff285e24a	Update ThreadFuzzer.cpp	2022-05-28 05:13:16 +03:00
Alexander Gololobov	6a57e1a970	Merge pull request #37601 from ClickHouse/array_norm_dist_fixes Added LpNorm and LpDistance functions for arrays	2022-05-27 23:40:38 +02:00
Alexey Milovidov	d6597efc08	Fix clang-tidy-14, part 1	2022-05-27 23:03:30 +02:00
Alexey Milovidov	be07c4c4b0	Fix clang-tidy-14, part 1	2022-05-27 23:03:16 +02:00
Alexey Milovidov	d62c57be3f	Fix clang-tidy-14, part 1	2022-05-27 23:02:25 +02:00
Alexey Milovidov	8e9d771237	Fix clang-tidy-14, part 1	2022-05-27 23:02:05 +02:00
Alexey Milovidov	6c2699a991	Fix clang-tidy-14, part 1	2022-05-27 23:00:45 +02:00
Alexey Milovidov	3086c19341	Fix clang-tidy-14, part 1	2022-05-27 23:00:23 +02:00
Alexey Milovidov	c50791dd3b	Fix clang-tidy-14, part 1	2022-05-27 22:52:14 +02:00
Alexey Milovidov	d2c6fd90cb	Fix clang-tidy-14, part 1	2022-05-27 22:51:37 +02:00
Kseniia Sumarokova	10c9716467	Fix clang-tidy	2022-05-27 22:48:07 +02:00
Nikolai Kochetov	b80b1940ce	Fix some tests.	2022-05-27 20:47:35 +00:00
Alexander Tokmakov	7e659036f8	fix	2022-05-27 20:30:06 +02:00
HeenaBansal2009	a061acadbe	Remove std::move from trivially-copyable object	2022-05-27 11:04:29 -07:00
Yakov Olkhovskiy	41ef0044f0	endpoint is added	2022-05-27 13:43:34 -04:00
mergify[bot]	f5ee337bab	Merge branch 'master' into revert-37545-revert-37424-fix_fetching_part_deadlock	2022-05-27 16:52:00 +00:00
mergify[bot]	923ad2e905	Merge branch 'master' into turn_on_s3_tests	2022-05-27 16:31:43 +00:00
Dmitry Novik	60b9d81773	Remove global_memory_usage_overcommit_max_wait_microseconds	2022-05-27 16:30:29 +00:00
alesapin	f63fa9bcc6	Merge pull request #37416 from Enmk/fix_ReplicatedMergeTree_comments Implemented changing comment to a ReplicatedMergeTree table	2022-05-27 18:29:34 +02:00
Alexander Gololobov	9b1b30855c	Fixed check for HUGE_VAL	2022-05-27 18:25:11 +02:00
Alexander Gololobov	6361c5f38c	Fix for failed style check	2022-05-27 18:22:16 +02:00
Kseniia Sumarokova	8099361cbc	Update FileCache.cpp	2022-05-27 17:48:14 +02:00
Alexander Gololobov	540353566c	Added LpNorm and LpDistance functions for arrays	2022-05-27 17:17:08 +02:00
Azat Khuzhin	8a224239c1	Prohibit optimize_aggregation_in_order with GROUPING SETS AggregatingStep ignores it anyway, and it leads to the following error in getSortDescriptionFromGroupBy(), like in [1]: 2022.05.24 04:29:29.279431 [ 3395 ] {26543564-8bc8-4a3a-b984-70a2adf0245d} <Fatal> : Logical error: 'Trying to get name of not a column: ExpressionList'. [1]: https://s3.amazonaws.com/clickhouse-test-reports/36914/67d3ac72d26ab74d69f03c03422349d4faae9e19/stateless_tests__ubsan__actions_.html v2: revert change to getSortDescriptionFromGroupBy() after GroupingSetsRewriterVisitor had been introduced Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-05-27 17:44:57 +03:00
Azat Khuzhin	1f29b0a901	Rewrite queries GROUPING SETS (foo, bar) to GROUP BY foo, bar This is better then introducing separate SelectQueryExpressionAnalyzer::useGroupingSetKey(), since for optimize_aggregation_in_order that method will not be enough, because size of ManyExpressionActions will not match size of SortDescription, in ReadInOrderOptimizer::ReadInOrderOptimizer() And plus it is cleaner. v2: fix clang-tidy Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-05-27 17:44:51 +03:00
alesapin	5a296aec01	Fix build	2022-05-27 16:34:16 +02:00
alesapin	be1c3c132b	Fix some trash	2022-05-27 16:08:49 +02:00
Anton Popov	abc90fad8d	fix WITH FILL with negative itervals	2022-05-27 12:42:51 +00:00
zvonand	5c558d0be9	old work upload	2022-05-27 15:07:22 +03:00
alesapin	6d6779f17a	Merge pull request #37139 from ClickHouse/i_object_storage Separate object storage operations from disks	2022-05-27 13:59:50 +02:00
alesapin	c79600c4c8	Fix build	2022-05-27 13:44:29 +02:00
taiyang-li	73d2c889c6	fix log level	2022-05-27 19:23:58 +08:00
alesapin	841858ec30	Revert "Revert "(only with zero-copy replication, non-production experimental feature not recommended to use) fix possible deadlock during fetching part""	2022-05-27 13:13:36 +02:00
Azat Khuzhin	2613149f6b	Fix converting types for UNION queries (may produce LOGICAL_ERROR) CI founds [1]: 2022.02.20 15:14:23.969247 [ 492 ] {} <Fatal> BaseDaemon: (version 22.3.1.1, build id: 6082C357CFA6FF99) (from thread 472) (query_id: a5187ff9-962a-4e7c-86f6-8d48850a47d6) (query: SELECT 0., round(avgWeighted(x, y)) FROM (SELECT toDate(toDate('214748364.8', '-922337203.6854775808', '-0.1', NULL) - NULL, 10.000100135803223, '-2147483647'), 255 AS x, -2147483647 AS y UNION ALL SELECT y, NULL AS x, 2147483646 AS y)) Received signal Aborted (6) [1]: https://s3.amazonaws.com/clickhouse-test-reports/0/26d0e5438c86e52a145aaaf4cb523c399989a878/fuzzer_astfuzzerdebug,actions//report.html The problem is that subqueries returns different headers: - first query -- x, y - second query -- y, x v2: Make order of columns strict only for UNION https://s3.amazonaws.com/clickhouse-test-reports/34775/9cc8c01a463d18c471853568b2f0af659a4e643f/stateless_tests__address__actions__[2/2].html Fixes: 00597_push_down_predicate_long v3: add no-backward-compatibility-check for the test Fixes: #37569 Resubmit: #34775 Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> (cherry picked from commit `a813f5996e`)	2022-05-27 14:11:57 +03:00
Han Fei	2ea027ffcb	Support insert into system.zookeeper	2022-05-27 18:53:12 +08:00
taiyang-li	ea450b86cb	add some prefetch metric codes	2022-05-27 18:06:40 +08:00
Kseniia Sumarokova	141334448e	Merge pull request #37566 from kssenii/fix-assertion Fix failed assertion in cache	2022-05-27 11:59:03 +02:00
Kseniia Sumarokova	2943d44bf1	Merge pull request #37554 from msaf1980/cleanup_hdfs Cleanup StorageHDFS (unused variables prevent build with clang 12)	2022-05-27 11:57:18 +02:00
Kseniia Sumarokova	f5d69506b4	Merge pull request #37516 from KinderRiven/improve_local_cache Control cache downloads to avoid negative optimization of local caches	2022-05-27 11:53:17 +02:00
zhanglistar	ca67e67a74	Fix a typo	2022-05-27 15:52:04 +08:00
Robert Schulze	80061aa3e2	Merge remote-tracking branch 'origin/master' into cached_patterns	2022-05-27 09:21:01 +02:00
Vxider	54d6f98122	flush and shutdown temporary table before drop	2022-05-27 04:50:36 +00:00
Dmitry Novik	3a9239b79f	Revert "RFC: Fix converting types for UNION queries (may produce LOGICAL_ERROR)"	2022-05-27 04:05:32 +02:00
Yakov Olkhovskiy	25884c68f1	http named collection source implemented for dictionary	2022-05-26 20:46:26 -04:00
Alexey Milovidov	8ba865bb60	Merge pull request #37344 from excitoon-favorites/fixs3colonandequalssign Fixed error with symbols in key name in S3	2022-05-27 00:58:35 +03:00
Alexey Milovidov	86afa3a245	Merge pull request #37502 from ClickHouse/array_norm_dist_fixes Renamed arrayXXNorm/arrayXXDistance functions to XXNorm/XXDistance and fixed some overflow cases	2022-05-27 00:56:29 +03:00
Alexander Gololobov	e655863d53	Merge pull request #37528 from kitaisreal/normalize-utf8-performance-tests-fix Functions normalizeUTF8 unstable performance tests fix	2022-05-26 20:49:10 +02:00
Azat Khuzhin	c6c60364ae	Remove unused MergeTreeDataMergerMutator::chooseMergeAlgorithm() In favor of MergeTask::ExecuteAndFinalizeHorizontalPart::chooseMergeAlgorithm() Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-05-26 21:20:50 +03:00
Alexander Tokmakov	eb71dd4c78	Merge pull request #37547 from ClickHouse/followup_37398 Follow-up to #37398	2022-05-26 20:29:41 +03:00
Azat Khuzhin	73a99d4eee	Improve error message for skipped/expired columns Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-05-26 20:14:10 +03:00
Azat Khuzhin	f0cac5417a	More optional skipping of fully expired columns (by TTL) Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-05-26 20:14:10 +03:00
Azat Khuzhin	46a94f395d	Move the check into IMergedBlockOutputStream::removeEmptyColumnsFromPart() Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-05-26 20:14:10 +03:00
Azat Khuzhin	4288d09a85	Do not write expired columns by TTL after merge w/o TTL Usually second merge do not perform TTL, since everything is up to date, however in this case TTLTransform is not used, and hence expired_columns will not be filled for new part, and so those columns will be written with default values. Avoid this, by manually filling expired_columns. Here is a simpler reproducer: Simple reproducer: ```sql create table ttl_02262 (date Date, key Int, value String TTL date + interval 1 month) engine=MergeTree order by key settings min_bytes_for_wide_part=0, min_rows_for_wide_part=0; insert into ttl_02262 values ('2010-01-01', 2010, 'foo'); ``` ```sh # ls -l .server/data/default/ttl_02262/all_* .server/data/default/ttl_02262/all_1_1_0: total 48 -rw-r----- 1 root root 335 May 26 14:19 checksums.txt -rw-r----- 1 root root 76 May 26 14:19 columns.txt -rw-r----- 1 root root 1 May 26 14:19 count.txt -rw-r----- 1 root root 28 May 26 14:19 date.bin -rw-r----- 1 root root 48 May 26 14:19 date.mrk2 -rw-r----- 1 root root 10 May 26 14:19 default_compression_codec.txt -rw-r----- 1 root root 30 May 26 14:19 key.bin -rw-r----- 1 root root 48 May 26 14:19 key.mrk2 -rw-r----- 1 root root 8 May 26 14:19 primary.idx -rw-r----- 1 root root 99 May 26 14:19 ttl.txt -rw-r----- 1 root root 30 May 26 14:19 value.bin -rw-r----- 1 root root 48 May 26 14:19 value.mrk2 ``` ```sql optimize table ttl_02262 final; ``` ```sh .server/data/default/ttl_02262/all_1_1_1: total 40 -rw-r----- 1 root root 279 May 26 14:19 checksums.txt -rw-r----- 1 root root 61 May 26 14:19 columns.txt -rw-r----- 1 root root 1 May 26 14:19 count.txt -rw-r----- 1 root root 28 May 26 14:19 date.bin -rw-r----- 1 root root 48 May 26 14:19 date.mrk2 -rw-r----- 1 root root 10 May 26 14:19 default_compression_codec.txt -rw-r----- 1 root root 30 May 26 14:19 key.bin -rw-r----- 1 root root 48 May 26 14:19 key.mrk2 -rw-r----- 1 root root 8 May 26 14:19 primary.idx -rw-r----- 1 root root 81 May 26 14:19 ttl.txt ``` ```sql optimize table ttl_02262 final; ``` ```sh .server/data/default/ttl_02262/all_1_1_2: total 48 -rw-r----- 1 root root 349 May 26 14:20 checksums.txt -rw-r----- 1 root root 76 May 26 14:20 columns.txt -rw-r----- 1 root root 1 May 26 14:20 count.txt -rw-r----- 1 root root 28 May 26 14:20 date.bin -rw-r----- 1 root root 48 May 26 14:20 date.mrk2 -rw-r----- 1 root root 10 May 26 14:20 default_compression_codec.txt -rw-r----- 1 root root 30 May 26 14:20 key.bin -rw-r----- 1 root root 48 May 26 14:20 key.mrk2 -rw-r----- 1 root root 8 May 26 14:20 primary.idx -rw-r----- 1 root root 81 May 26 14:20 ttl.txt -rw-r----- 1 root root 27 May 26 14:20 value.bin -rw-r----- 1 root root 48 May 26 14:20 value.mrk2 ``` And now we have `value.*` for all_1_1_2, this should not happen. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-05-26 20:14:10 +03:00
Azat Khuzhin	8328d7068b	Fix updating of MergeTreeDataPartTTLInfo::finished Previously you cannot distinguish non-initialized finished with initialized to false, so update() cannot do the correct thing. Rename the field to avoid hidden usage. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-05-26 20:14:10 +03:00
Azat Khuzhin	0de1a64436	Log empty parts in IMergedBlockOutputStream::removeEmptyColumnsFromPart() Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-05-26 20:14:09 +03:00
Nikolai Kochetov	ec4e8d71b2	Fixing build	2022-05-26 15:33:21 +00:00
Dmitry Novik	673a521d0b	Merge pull request #34775 from azat/union-type-cast RFC: Fix converting types for UNION queries (may produce LOGICAL_ERROR)	2022-05-26 17:28:23 +02:00
kssenii	36af6b1fa8	Fix assertion	2022-05-26 16:15:02 +02:00
alesapin	c862f89b8d	Fix tidy	2022-05-26 15:43:21 +02:00
Alexander Tokmakov	e8f33fb0d9	fix flaky tests	2022-05-26 14:17:05 +02:00
Azat Khuzhin	dc9ca3d70c	Fix LOGICAL_ERROR in getMaxSourcePartsSizeForMerge during merges (#37413 )	2022-05-26 14:14:58 +02:00
Nikolai Kochetov	bf95541531	Fixing style.	2022-05-26 11:09:36 +00:00
Nikolai Kochetov	84f97b53de	Merge branch 'master' into refactor-read-metrics-and-callbacks	2022-05-26 11:07:45 +00:00
Nikolai Kochetov	fea2401f1f	Merge pull request #37532 from ClickHouse/add-separate-mutex-for-factories-info Use a separate mutex for query_factories_info in Context.	2022-05-26 13:03:28 +02:00
mergify[bot]	a7629f900f	Merge branch 'master' into normalize-utf8-performance-tests-fix	2022-05-26 10:29:55 +00:00
Maksim Kita	3a92e61827	Merge pull request #37148 from kitaisreal/dictionary-get-descendants-performance-improvement Dictionary getDescendants performance improvement	2022-05-26 12:29:17 +02:00
KinderRiven	824628c0da	fix style	2022-05-26 16:51:16 +08:00
KinderRiven	822ecd982f	better & support clean stash	2022-05-26 16:36:05 +08:00
Vasily Nemkov	abe6b5d013	Reverted unnecessary modification	2022-05-26 10:09:27 +03:00
Antonio Andelic	fe236c98d5	Merge pull request #37534 from ClickHouse/revert-37036-keeper-preprocess-operations Revert "Add support for preprocessing ZooKeeper operations in `clickhouse-keeper`"	2022-05-26 08:14:46 +02:00
taiyang-li	561c87222d	add prefetch for hive text	2022-05-26 11:04:35 +08:00
Yakov Olkhovskiy	2dc160a4c3	style fix	2022-05-25 20:56:36 -04:00
Dmitry Novik	5c3c994d2a	Merge pull request #37493 from ClickHouse/grouping-sets-optimization-fix Fix ORDER BY optimization in case of GROUPING SETS	2022-05-26 02:25:02 +02:00
Anton Popov	f488efd27e	fix tests	2022-05-26 00:03:31 +00:00
Dmitry Novik	16c6b60703	Introduce AggregationKeysInfo	2022-05-25 23:22:29 +00:00
alesapin	8f1aac0ce4	Fix merge with master	2022-05-26 00:44:45 +02:00
Alexey Milovidov	f321925032	Merge pull request #36341 from ClickHouse/allow-setuid-inside-clickhouse Allow to drop privileges at startup	2022-05-26 01:07:04 +03:00
Dmitry Novik	7cd7782e4f	Process columns more efficiently in GROUPING()	2022-05-25 21:55:41 +00:00
Dmitry Novik	3c1b6609ae	Add comments and make tests more verbose	2022-05-25 21:23:35 +00:00
mergify[bot]	49cce189e3	Merge branch 'master' into floating_seconds	2022-05-25 21:08:55 +00:00
alesapin	1db9cf480b	Merge remote-tracking branch 'origin/master' into i_object_storage	2022-05-25 22:50:22 +02:00
Maksim Kita	58cd1bd3ec	Merge pull request #36843 from bharatnc/ncb/h3-unidirectionaledges-funcs add h3 unidirectional edge functions	2022-05-25 22:46:40 +02:00
Maksim Kita	bee3c30f66	Merge pull request #37524 from kitaisreal/geo-distance-functions-improve-performance Geo distance functions improve performance	2022-05-25 22:40:40 +02:00
Maksim Kita	b12b363158	Fixed build of hierarchical index for HashedArrayDictionary	2022-05-25 22:40:19 +02:00
Alexander Gololobov	168b47d0ad	Use same norm and distance function names for tuples and arrays	2022-05-25 22:39:59 +02:00
Alexander Gololobov	b065839f44	always return Float64	2022-05-25 22:27:00 +02:00
Alexander Gololobov	5df14cd956	Cast arguments to result type to avoid int overflow	2022-05-25 22:27:00 +02:00
Alexander Tokmakov	3d259fa0de	Update Context.cpp	2022-05-25 23:18:07 +03:00
Alexander Tokmakov	47820c216d	Revert "(only with zero-copy replication, non-production experimental feature not recommended to use) fix possible deadlock during fetching part"	2022-05-25 23:10:33 +03:00
Robert Schulze	49934a3dc8	Cache compiled regexps when evaluating non-const needles Needles in a (non-const) needle column may repeat and this commit allows to skip compilation for known needles. Out of the different design alternatives (see below, if someone is interested), we now maintain - one global pattern cache, - with a fixed size of 42k elements currently, - and use LRU as eviction strategy. ------------------------------------------------------------------------ (sorry for the wall of text, dumping it here not for reading but just for reference) Write-up about considered design alternatives: 1. Keep the current global cache of const needles. For non-const needles, probe the cache but don't store values in it. Pros: need to maintain just a single cache, no problem with cache pollution assuming there are few distinct constant needles Cons: only useful if a non-const needle occurred as already as a const needle --> overall too simplistic 2. Keep the current global cache for const needles. For non-const needles, create a local (e.g. per-query) cache Pros: unlike (1.), non-const needles can be skipped even if they did not occur yet, no pollution of the const pattern cache when there are very many non-const needles (e.g. large / highly distinct needle columns). Cons: caches may explode "horizontally", i.e. we'll end up with the const cache + caches for Q1, Q2, ... QN, this makes it harder to control the overall space consumption, also patterns residing in different caches cannot be reused between queries, another difficulty is that the concept of "query" does not really exist at matching level - there are only column chunks and we'd potentially end up with 1 cache / chunk 3. Queries with const and non-const needles insert into the same global cache. Pros: the advantages of (2.) + allows to reuse compiled patterns accross parallel queries Cons: needs an eviction strategy to control cache size and pollution (and btw. (2.) also needs eviction strategies for the individual caches) 4. Queries with const needle use global cache, queries with non-const needle use a different global cache --> Overall similar to (3) but ignores the (likely) edge case that const and non-const needles overlap. In sum, (3.) seems the simplest and most beneficial approach. Eviction strategies: 0. Don't ever evict --> cache may grow infinitely and eventually make the system unusable (may even pose a DoS risk) 1. Flush the cache after a certain threshold is exceeded --> very simple but may lead to peridic performance drops 2. Use LRU --> more graceful performance degradation at threshold but comes with a (constant) performance overhead to maintain the LRU queue In sum, given that the pattern compilation in RE2 should be quite costly (pattern-to-DFA/NFA), LRU may be acceptable.	2022-05-25 22:04:06 +02:00
Robert Schulze	ea60a614d2	Decrease namespace indent	2022-05-25 21:56:35 +02:00
alesapin	c7b16065e1	Merge with master	2022-05-25 21:47:05 +02:00
Nikolai Kochetov	6d4a26afac	Update ReadProgressCallback.	2022-05-25 19:45:48 +00:00
Alexey Milovidov	abf2558fba	Merge pull request #37491 from ClickHouse/match_refactoring Refactorings of LIKE/MATCH code	2022-05-25 22:05:38 +03:00
Alexey Milovidov	4482da9eb6	Update greatCircleDistance.cpp	2022-05-25 21:59:31 +03:00
alesapin	6f5c86e55e	Merge branch 'master' into i_object_storage	2022-05-25 20:49:01 +02:00
Alexander Tokmakov	779e6ea0b9	make it better, fix on cluster queries	2022-05-25 20:17:49 +02:00
Nikolai Kochetov	54d7e4139f	Fix build.	2022-05-25 18:16:48 +00:00
alesapin	51868a9a4f	Merge pull request #37424 from metahys/fix_fetching_part_deadlock (only with zero-copy replication, non-production experimental feature not recommended to use) fix possible deadlock during fetching part	2022-05-25 20:15:41 +02:00
Azat Khuzhin	a813f5996e	Fix converting types for UNION queries (may produce LOGICAL_ERROR) CI founds [1]: 2022.02.20 15:14:23.969247 [ 492 ] {} <Fatal> BaseDaemon: (version 22.3.1.1, build id: 6082C357CFA6FF99) (from thread 472) (query_id: a5187ff9-962a-4e7c-86f6-8d48850a47d6) (query: SELECT 0., round(avgWeighted(x, y)) FROM (SELECT toDate(toDate('214748364.8', '-922337203.6854775808', '-0.1', NULL) - NULL, 10.000100135803223, '-2147483647'), 255 AS x, -2147483647 AS y UNION ALL SELECT y, NULL AS x, 2147483646 AS y)) Received signal Aborted (6) [1]: https://s3.amazonaws.com/clickhouse-test-reports/0/26d0e5438c86e52a145aaaf4cb523c399989a878/fuzzer_astfuzzerdebug,actions//report.html The problem is that subqueries returns different headers: - first query -- x, y - second query -- y, x v2: Make order of columns strict only for UNION https://s3.amazonaws.com/clickhouse-test-reports/34775/9cc8c01a463d18c471853568b2f0af659a4e643f/stateless_tests__address__actions__[2/2].html Fixes: 00597_push_down_predicate_long Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-05-25 20:31:47 +03:00
Nikolai Kochetov	ff98c24d44	Merge pull request #37048 from Avogar/fix-array-map-nothing Add default implementation for Nothing in functions	2022-05-25 19:10:40 +02:00
Nikolai Kochetov	1f9b1cf726	Fixing build.	2022-05-25 18:59:46 +02:00
alesapin	0a3597da72	Merge pull request #34915 from ianton-ru/MDB-16962 Fix collision of S3 operation log revision	2022-05-25 18:15:31 +02:00
Yakov Olkhovskiy	6692b9c2ed	showCertificate function implementation	2022-05-25 12:11:44 -04:00
Alexey Milovidov	cb92482ca5	Merge pull request #37484 from kitaisreal/function-has-all-avx2-dynamic-dispatch Function hasAll added dynamic dispatch	2022-05-25 19:05:32 +03:00
Nikolai Kochetov	7b681fa8ac	Fixing build.	2022-05-25 17:15:23 +02:00
avogar	ede6e2f433	Add docs for settings	2022-05-25 15:10:20 +00:00
avogar	4c9812d4c1	Allow to skip some of the first rows in CSV/TSV formats	2022-05-25 15:00:11 +00:00
KinderRiven	a33c7ce648	fix	2022-05-25 22:58:47 +08:00
Anton Popov	16e839ac71	add profile events for introspection of part types	2022-05-25 14:54:49 +00:00
Antonio Andelic	6a962549d5	Revert "Add support for preprocessing ZooKeeper operations in `clickhouse-keeper`"	2022-05-25 16:45:32 +02:00
Nikolai Kochetov	1b85f2c1d6	Merge branch 'master' into refactor-read-metrics-and-callbacks	2022-05-25 16:27:40 +02:00
msaf1980	fda6ddeffa	cleanup StorageHDFS (unused variables)	2022-05-25 19:23:05 +05:00
mergify[bot]	73662b4436	Merge branch 'master' into fix_fetching_part_deadlock	2022-05-25 14:22:35 +00:00
Maksim Kita	28355114c0	Fixed tests	2022-05-25 16:19:29 +02:00
Nikolai Kochetov	6370c29049	Use a separate mutex for query_factories_info in Context.	2022-05-25 14:16:59 +00:00
Maksim Kita	e67b3537f7	Functions normalizeUTF8 unstable performance tests fix	2022-05-25 15:54:52 +02:00
KinderRiven	875557abc2	fix	2022-05-25 21:53:28 +08:00
KinderRiven	adbb821176	fix	2022-05-25 21:05:15 +08:00
mergify[bot]	f49552d48e	Merge branch 'master' into grouping-sets-optimization-fix	2022-05-25 13:03:54 +00:00
Maksim Kita	45da28ecae	Improve performance of geo distance functions	2022-05-25 14:22:22 +02:00
KinderRiven	2211c1ddb8	fix	2022-05-25 20:15:43 +08:00
taiyang-li	a7a816dcb6	fix build error	2022-05-25 19:55:11 +08:00
Maksim Kita	83554d1f2d	Fixed style	2022-05-25 13:05:39 +02:00
Maksim Kita	fbec38ddb9	Fixed performance tests	2022-05-25 12:51:21 +02:00
KinderRiven	d0fcffec66	fix style	2022-05-25 17:51:03 +08:00
Maksim Kita	c372c3d6aa	Fix performance tests	2022-05-25 11:49:59 +02:00
Maksim Kita	9a9df26eec	Fixed tests	2022-05-25 11:44:37 +02:00
Maksim Kita	6c033f340b	Fixed tests	2022-05-25 11:44:37 +02:00
Maksim Kita	0e5f13e53e	MergingSortedAlgorithm single column specialization	2022-05-25 11:44:37 +02:00
KinderRiven	1ce219bae2	fix	2022-05-25 17:24:38 +08:00
taiyang-li	1d9f65a7d4	Merge branch 'master' into async_hdfs_read_buffer	2022-05-25 17:10:22 +08:00
Kseniia Sumarokova	b50d4549c9	Merge pull request #37356 from amosbird/partition-prune-for-s3 "Partition pruning" for s3	2022-05-25 11:03:07 +02:00
KinderRiven	e3f76cab55	impl improve remote fs cache	2022-05-25 16:54:28 +08:00
avogar	f782fa31c6	Merge branch 'master' of github.com:ClickHouse/ClickHouse into check-format-on-storage-creation	2022-05-25 08:42:54 +00:00
Robert Schulze	05e4fa7df1	Fix special case of trivial regexp Previously, we would alsays set 1 in case of a trivial regex (which is correct). If someone in future builds a negated operator, then this will produce wrong results. Right now, negation of regexp (SQL: NOT MATCH) is implemented at a higher level, so we are safe and this is more a preventive fix.	2022-05-25 10:05:55 +02:00
Robert Schulze	01ab7b9bad	Pass strings in some places as string_view The original goal was to get change const auto & needle = String( reinterpret_cast<const char >(cur_needle_data), cur_needle_length); in Functions/MatchImpl.h into a std::string_view to save an allocation + copy. The needle is eventually passed as search pattern into the re2 library. Re2 has an alternative constructor taking a const char i.e. a NULL-terminated string. Here, the needle is NULL-terminated but 1. this is only because it is passed inside a ColumnString yet this is not always the case (e.g. fixed string columns has a dense layout w/o NULL terminator). 2. assuming NULL termination for users != MatchImpl of the regex code is too dangerous. So, for now we'll stay with copying to be on the safe side. One fine day when re2 has a ptr/size ctor, we can use std::string_view. Just changing a few other places from std::string to std::string_view but this will not help with performance.	2022-05-25 10:05:51 +02:00
Robert Schulze	e8c96777f6	Make OptimizedRegularExpression::analyze() private	2022-05-25 10:05:45 +02:00
Robert Schulze	040fbf3686	Tighter sanity checks in matching code	2022-05-25 10:05:06 +02:00
Robert Schulze	35bef17302	Introduce variables to hold the match result --> nicer when debugging	2022-05-25 10:04:47 +02:00
Robert Schulze	b044d44fef	Refactoring: Make template instantiation easier to read - introduced class MatchTraits with enums that replace bool template parameters - (minor: made negation the last template parameters because negation executes last during evaluation)	2022-05-25 10:03:58 +02:00
Bharat Nallan Chakravarthy	57cfc0bd04	check for validity of h3 index	2022-05-25 06:17:15 +05:30
Alexey Milovidov	516fba27dc	Merge branch 'master' into allow-setuid-inside-clickhouse	2022-05-24 23:31:14 +02:00

... 4 5 6 7 8 ...

27256 Commits