ClickHouse

mirror of https://github.com/ClickHouse/ClickHouse.git synced 2024-12-02 12:32:04 +00:00

Author	SHA1	Message	Date
Nikolai Kochetov	b877c484d2	Merge pull request #45481 from ClickHouse/fix-deadlock-with-allow_asynchronous_read_from_io_pool_for_merge_tree Fix possible deadlock with allow_asynchronous_read_from_io_pool_for_merge_tree in case of exception from ThreadPool::schedule	2023-01-21 12:05:34 +01:00
Nikolai Kochetov	ec1e2436cc	Merge pull request #45450 from ClickHouse/fix-disabled-two-level-agg Fix disabled two-level aggregation from HTTP	2023-01-21 12:01:59 +01:00
Sema Checherinda	962894afc8	Merge pull request #44909 from CheSema/intersect-prev-part Do not merge over a gap with outdated undeleted parts	2023-01-21 11:51:21 +01:00
Maksim Kita	47385a19e7	Remove unnecessary getTotalRowCount function calls	2023-01-21 11:27:07 +01:00
Azat Khuzhin	a64f6b5f3e	Fix possible (likely distributed) query hung Recently I saw the following: the client executed a long distributed query and terminated the connection. In this case query cancellation is done from the PullingAsyncPipelineExecutor dtor, but during cancellation one of the nodes sent ECONNRESET, which led to an exception from PullingAsyncPipelineExecutor::cancel(), and this in turn led to a deadlock in which multiple threads wait for each other, because cancel() for LazyOutputFormat wasn't called. Here is a relevant portion of the logs: 2023.01.04 08:26:09.236208 [ 37968 ] {f2ed6149-146d-4a3d-874a-b0b751c7b567} <Debug> executeQuery: (from 10.61.13.253:44266, user: default) TooLongDistributedQueryToPost ... 2023.01.04 08:26:09.262424 [ 37968 ] {f2ed6149-146d-4a3d-874a-b0b751c7b567} <Trace> MergeTreeInOrderSelectProcessor: Reading 1 ranges in order from part 9_330_538_18, approx. 61440 rows starting from 0 2023.01.04 08:26:09.266399 [ 26788 ] {f2ed6149-146d-4a3d-874a-b0b751c7b567} <Trace> Connection (s4.ch:9000): Connecting. Database: (not specified). User: default 2023.01.04 08:26:09.266849 [ 26788 ] {f2ed6149-146d-4a3d-874a-b0b751c7b567} <Trace> Connection (s4.ch:9000): Connected to ClickHouse server version 22.10.1. 2023.01.04 08:26:09.267165 [ 26788 ] {f2ed6149-146d-4a3d-874a-b0b751c7b567} <Debug> Connection (s4.ch:9000): Sent data for 2 scalars, total 2 rows in 3.1587e-05 sec., 62635 rows/sec., 68.00 B (2.03 MiB/sec.), compressed 0.4594594594594595 times to 148.00 B (4.41 MiB/sec.) 2023.01.04 08:39:13.047170 [ 37968 ] {f2ed6149-146d-4a3d-874a-b0b751c7b567} <Error> PullingAsyncPipelineExecutor: Code: 210. DB::NetException: Connection reset by peer, while writing to socket (10.7.142.115:9000). (NETWORK_ERROR), Stack trace (when copying this message, always include the lines below): 0. 
./.build/./contrib/libcxx/include/exception:133: Poco::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0x1818234c in /usr/lib/debug/usr/bin/clickhouse.debug 1. ./.build/./src/Common/Exception.cpp:69: DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0x1004fbda in /usr/lib/debug/usr/bin/clickhouse.debug 2. ./.build/./src/Common/NetException.h:12: DB::WriteBufferFromPocoSocket::nextImpl() @ 0x14e352f3 in /usr/lib/debug/usr/bin/clickhouse.debug 3. ./.build/./src/IO/BufferBase.h:39: DB::Connection::sendCancel() @ 0x15c21e6b in /usr/lib/debug/usr/bin/clickhouse.debug 4. ./.build/./src/Client/MultiplexedConnections.cpp:0: DB::MultiplexedConnections::sendCancel() @ 0x15c4d5b7 in /usr/lib/debug/usr/bin/clickhouse.debug 5. ./.build/./src/QueryPipeline/RemoteQueryExecutor.cpp:627: DB::RemoteQueryExecutor::tryCancel(char const, std::__1::unique_ptr<DB::RemoteQueryExecutorReadContext, std::__1::default_delete<DB::RemoteQueryExecutorReadContext> >) @ 0x14446c09 in /usr/lib/debug/usr/bin/clickhouse.debug 6. ./.build/./contrib/libcxx/include/__iterator/wrap_iter.h:100: DB::ExecutingGraph::cancel() @ 0x15d2c0de in /usr/lib/debug/usr/bin/clickhouse.debug 7. ./.build/./contrib/libcxx/include/__memory/unique_ptr.h:300: DB::PullingAsyncPipelineExecutor::cancel() @ 0x15d32055 in /usr/lib/debug/usr/bin/clickhouse.debug 8. ./.build/./contrib/libcxx/include/__memory/unique_ptr.h:312: DB::PullingAsyncPipelineExecutor::~PullingAsyncPipelineExecutor() @ 0x15d31f4f in /usr/lib/debug/usr/bin/clickhouse.debug 9. ./.build/./src/Server/TCPHandler.cpp:0: DB::TCPHandler::processOrdinaryQueryWithProcessors() @ 0x15cde919 in /usr/lib/debug/usr/bin/clickhouse.debug 10. ./.build/./src/Server/TCPHandler.cpp:0: DB::TCPHandler::runImpl() @ 0x15cd8554 in /usr/lib/debug/usr/bin/clickhouse.debug 11. 
./.build/./src/Server/TCPHandler.cpp:1904: DB::TCPHandler::run() @ 0x15ce6479 in /usr/lib/debug/usr/bin/clickhouse.debug 12. ./.build/./contrib/poco/Net/src/TCPServerConnection.cpp:57: Poco::Net::TCPServerConnection::start() @ 0x18074f07 in /usr/lib/debug/usr/bin/clickhouse.debug 13. ./.build/./contrib/libcxx/include/__memory/unique_ptr.h:54: Poco::Net::TCPServerDispatcher::run() @ 0x180753ed in /usr/lib/debug/usr/bin/clickhouse.debug 14. ./.build/./contrib/poco/Foundation/src/ThreadPool.cpp:213: Poco::PooledThread::run() @ 0x181e3807 in /usr/lib/debug/usr/bin/clickhouse.debug 15. ./.build/./contrib/poco/Foundation/include/Poco/SharedPtr.h:156: Poco::ThreadImpl::runnableEntry(void) @ 0x181e1483 in /usr/lib/debug/usr/bin/clickhouse.debug 16. ? @ 0x7ffff7e55fd4 in ? 17. ? @ 0x7ffff7ed666c in ? (version 22.10.1.1) And here is the state of the threads: <details> <summary>system.stack_trace</summary> ```sql SELECT arrayStringConcat(arrayMap(x -> demangle(addressToSymbol(x)), trace), '\n') AS sym FROM system.stack_trace WHERE query_id = 'f2ed6149-146d-4a3d-874a-b0b751c7b567' SETTINGS allow_introspection_functions=1 Row 1: ────── sym: pthread_cond_wait std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) bool ConcurrentBoundedQueue<DB::Chunk>::emplaceImpl<DB::Chunk>(std::__1::optional<unsigned long>, DB::Chunk&&) DB::IOutputFormat::work() DB::ExecutionThreadContext::executeTask() DB::PipelineExecutor::executeStepImpl(unsigned long, std::__1::atomic<bool>) Row 2: ────── sym: pthread_cond_wait Poco::EventImpl::waitImpl() DB::PipelineExecutor::joinThreads() DB::PipelineExecutor::executeImpl(unsigned long) DB::PipelineExecutor::execute(unsigned long) Row 3: ────── sym: pthread_cond_wait Poco::EventImpl::waitImpl() DB::PullingAsyncPipelineExecutor::Data::~Data() DB::PullingAsyncPipelineExecutor::~PullingAsyncPipelineExecutor() DB::TCPHandler::processOrdinaryQueryWithProcessors() DB::TCPHandler::runImpl() DB::TCPHandler::run() 
Poco::Net::TCPServerConnection::start() Poco::Net::TCPServerDispatcher::run() Poco::PooledThread::run() Poco::ThreadImpl::runnableEntry(void*) ``` </details> Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-21 08:05:56 +01:00
Azat Khuzhin	e2fcf0f072	Catch exception on query cancellation Since we still want to join the thread, yes it will be done in dtor, but this looks better. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-21 08:05:56 +01:00
Azat Khuzhin	0566f72d36	Cleanup PullingAsyncPipelineExecutor::cancel() Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-21 08:05:56 +01:00
avogar	eed1db7e07	Fix schema inference in hdfsCluster	2023-01-20 21:17:35 +00:00
Anton Popov	41a199e175	Fix crash when `ListObjects` request fails (#45371)	2023-01-20 20:10:23 +01:00
Nikolai Kochetov	dcd84c152a	Fix possible deadlock with allow_asynchronous_read_from_io_pool_for_merge_tree in case of exception from ThreadPool::schedule	2023-01-20 18:57:47 +00:00
Robert Schulze	e6167d6b36	Deprecate Gorilla compression of non-float columns Reasons: 1. The original Gorilla paper proposed a compression schema for pairs of time stamps and double-precision FP values. ClickHouse's Gorilla codec only implements compression of the latter and it does not impose any data type restrictions. - Data types != Float* or (U)Int* (e.g. Decimal, Point etc.) are definitely not supposed to be used with Gorilla. - (U)Int* types are debatable. The paper only considers integers-stored-as-FP-values, a practical use case for which Gorilla works well. Standalone integers are not considered which makes them at least suspicious. 2. Achieve consistency with FPC, another specialized floating-point timeseries codec, which rejects non-float data. 3. On practical datasets, ZSTD is often "good enough" (**) so it should be okay to disincentivize non-ZSTD codecs a little bit. If needed, Delta and DoubleDelta codecs are viable alternatives for slowly changing (time-series-like) integer sequences. Since on-prem and hosted users may still have Gorilla-compressed non-float data, this combination is only deprecated for now. No warning or error will be emitted. Users are encouraged to migrate Gorilla-compressed non-float data to an alternative codec. It is planned to treat Gorilla-compressed non-float columns as "suspicious" six months after this commit (i.e. in v23.6). Even then, it will still be possible to set "allow_suspicious_codecs = true" and read and write Gorilla-compressed non-float data. (*) Sec. 4.1.2, "Gorilla restricts the value element in its tuple to a double floating point type.", https://doi.org/10.14778/2824032.2824078 (**) https://clickhouse.com/blog/optimize-clickhouse-codecs-compression-schema	2023-01-20 17:31:16 +00:00
robot-ch-test-poll4	2066581d8f	Merge pull request #45451 from evillique/default_granularity Add default GRANULARITY argument for secondary indexes	2023-01-20 17:46:21 +01:00
avogar	86336940f8	Better comment	2023-01-20 16:41:59 +00:00
avogar	4432ee9927	Fix aborts in arrow lib	2023-01-20 16:40:33 +00:00
vdimir	e30ab0874b	Review fixes	2023-01-20 16:30:34 +00:00
Alexander Tokmakov	910d6dc0ce	Merge pull request #45342 from ClickHouse/exception_message_patterns Save message format strings for DB::Exception	2023-01-20 18:46:52 +03:00
Kseniia Sumarokova	01320da02b	Update BoundedReadBuffer.cpp	2023-01-20 16:25:02 +01:00
ltrk2	810c9ba50c	Produce a null map of the correct size	2023-01-20 10:24:42 -05:00
ltrk2	9d798ea1bc	Document functions	2023-01-20 10:24:42 -05:00
ltrk2	65b9c69c90	Introduce non-throwing variants of hasToken	2023-01-20 10:24:42 -05:00
avogar	550a703fbc	Make a bit better	2023-01-20 14:58:39 +00:00
Antonio Andelic	136e4ec1b3	Merge pull request #45273 from azat/fix-test-log-level Fix log level "Test" for send_logs_level in client	2023-01-20 15:36:05 +01:00
Alexander Tokmakov	ec5d7d0a3a	Update src/Functions/FunctionsConversion.h Co-authored-by: Alexander Gololobov <440544+davenger@users.noreply.github.com>	2023-01-20 17:33:01 +03:00
Kruglov Pavel	28ddcc2432	Merge branch 'master' into tsv-csv-detect-header	2023-01-20 15:08:38 +01:00
Sema Checherinda	b76b612d23	fix typo	2023-01-20 14:55:58 +01:00
Nikolai Kochetov	039901b395	Fixing build	2023-01-20 13:49:50 +00:00
Robert Schulze	1a966a9590	Fix bad comparison	2023-01-20 13:05:06 +00:00
Sema Checherinda	02f22f04e8	fix typos	2023-01-20 13:35:23 +01:00
kssenii	8d20af8127	Fix	2023-01-20 13:34:23 +01:00
Azat Khuzhin	bdeb5514c5	Fix ASan builds for glibc 2.36+ (use RTLD_NEXT for ThreadFuzzer interceptors) Recently I noticed that clickhouse compiled with ASan does not work with glibc 2.36+. Before, I thought that this was only about compiling with an old glibc but running with a new one; however, that was not correct: ASan simply does not work with glibc 2.36+. Here is a simple reproducer [1]: $ cat > test-asan.cpp <<EOL #include <pthread.h> int main() { // something broken in ASan in interceptor for __pthread_mutex_lock // and only since glibc 2.36, and for pthread_mutex_lock everything is OK pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER; return __pthread_mutex_lock(&mutex); } EOL $ clang -g3 -o test-asan test-asan.cpp -fsanitize=address $ ./test-asan AddressSanitizer:DEADLYSIGNAL ================================================================= ==15659==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000000000 bp 0x7fffffffccb0 sp 0x7fffffffcb98 T0) ==15659==Hint: pc points to the zero page. ==15659==The signal is caused by a READ memory access. ==15659==Hint: address points to the zero page. #0 0x0 (<unknown module>) #1 0x7ffff7cda28f (/usr/lib/libc.so.6+0x2328f) (BuildId: 1e94beb079e278ac4f2c8bce1f53091548ea1584) AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: SEGV (<unknown module>) ==15659==ABORTING [1]: https://gist.github.com/azat/af073e57a248e04488b21068643f079e I started looking at the glibc code: there were some changes in glibc that moved pthread functions out of libpthread.so.0 into libc.so.6 (somewhere between 2.31 and 2.35), but the problem pops up only with 2.36; 2.35 works fine. After this I looked into the changes between 2.35 and 2.36, and found this patch [2] - "dlsym: Make RTLD_NEXT prefer default version definition [BZ #14932]", that fixes this bug [3]. 
[2]: https://sourceware.org/git/?p=glibc.git;a=commit;h=efa7936e4c91b1c260d03614bb26858fbb8a0204 [3]: https://sourceware.org/bugzilla/show_bug.cgi?id=14932 The problem with using the DL_LOOKUP_RETURN_NEWEST flag for RTLD_NEXT is that it does not resolve hidden symbols (and __pthread_mutex_lock is indeed hidden). Here is a sample that will show the difference [4]: $ cat > test-dlsym.c <<EOL #define _GNU_SOURCE #include <dlfcn.h> #include <stdio.h> int main() { void *p = dlsym(RTLD_NEXT, "__pthread_mutex_lock"); printf("__pthread_mutex_lock: %p (via RTLD_NEXT)\n", p); return 0; } EOL # glibc 2.35: __pthread_mutex_lock: 0x7ffff7e27f70 (via RTLD_NEXT) # glibc 2.36: __pthread_mutex_lock: (nil) (via RTLD_NEXT) [4]: https://gist.github.com/azat/3b5f2ae6011bef2ae86392cea7789eb7 But ThreadFuzzer uses internal symbols to wrap pthread_mutex_lock/pthread_mutex_unlock, which are intercepted by ASan, and this leads to a NULL dereference. The fix was obvious - just use dlsym(RTLD_NEXT), however on older glibc's this leads to endless recursion (see commits in the code). But only for jemalloc [5], and even though sanitizers do not use jemalloc, the code of ThreadFuzzer is generic and I don't want to guard it with more preprocessor macros. [5]: https://gist.github.com/azat/588d9c72c1e70fc13ebe113197883aa2 So we have to use RTLD_NEXT only for ASan. There is also one more interesting issue: if you compile with a clang that itself had been compiled with a newer libc (i.e. 2.36), you will get the following error: $ podman run --privileged -v $PWD/.cmake-asan/programs:/root/bin -e PATH=/bin:/root/bin -e --rm -it ubuntu-dev-v3 clickhouse ==1==ERROR: AddressSanitizer failed to allocate 0x0 (0) bytes of SetAlternateSignalStack (error code: 22) ... ==1==End of process memory map. 
AddressSanitizer: CHECK failed: sanitizer_common.cpp:53 "((0 && "unable to mmap")) != (0)" (0x0, 0x0) (tid=1) <empty stack> The problem is that since GLIBC_2.31, `SIGSTKSZ` is a call to `getconf(_SC_MINSIGSTKSZ)`, but older glibc does not have it, so `-1` will be returned and used as `SIGSTKSZ` instead. The workaround is to disable the alternate signal stack: $ podman run --privileged -v $PWD/.cmake-asan/programs:/root/bin -e PATH=/bin:/root/bin -e ASAN_OPTIONS=use_sigaltstack=0 --rm -it ubuntu-dev-v3 clickhouse client --version ClickHouse client version 22.13.1.1. Fixes: #43426 Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-20 13:09:13 +01:00
Robert Schulze	bfc3b4f5ca	Suffix "GinFilter" --> "Inverted"	2023-01-20 12:02:35 +00:00
Nikolai Kochetov	1e29993aef	Fixing build	2023-01-20 11:55:20 +00:00
Robert Schulze	0738b2499c	Use GinFilters typedef where possible	2023-01-20 11:52:04 +00:00
Maksim Kita	3e08a98f16	Merge pull request #45388 from azat/dict/remove-preallocate Remove PREALLOCATE for HASHED/SPARSE_HASHED dictionaries	2023-01-20 14:51:25 +03:00
Robert Schulze	0b77f07f67	Remove superfluous check (the same is checked in MergeTreeIndices.cpp)	2023-01-20 11:50:35 +00:00
Robert Schulze	d2c830ec39	Cosmetics	2023-01-20 11:49:08 +00:00
Robert Schulze	72973076c9	Rename MergeTreeIndexGin.h/cpp to MergeTreeIndexInverted.h/cpp	2023-01-20 11:42:36 +00:00
Robert Schulze	1ef2704539	Cosmetics	2023-01-20 11:39:23 +00:00
Anton Popov	9c0ba7c7ca	Merge pull request #45432 from CurtizJ/allow-json-extract-int-from-float Allow to convert float stored in string field to integer in `JSONExtract`	2023-01-20 12:35:06 +01:00
Robert Schulze	463cc843de	"segment file" --> "segment metadata file"	2023-01-20 11:26:22 +00:00
Robert Schulze	58df3953bb	Move some code around (no other changes)	2023-01-20 11:24:23 +00:00
Kseniia Sumarokova	c066b9bddd	Update SwapHelper.h	2023-01-20 12:19:19 +01:00
Maksim Kita	e067a55b78	Fixed tests	2023-01-20 12:19:16 +01:00
Robert Schulze	3267ac2787	Prefix more typedefs in DB namespace with "Gin"	2023-01-20 11:19:07 +00:00
Robert Schulze	919b67f117	Cosmetics	2023-01-20 11:15:28 +00:00
Sema Checherinda	09f3a5c599	add a comment, add a check, fix test	2023-01-20 12:10:31 +01:00
Robert Schulze	98e117dca6	SegmentDictionary --> GinSegmentDictionary, also move typedef	2023-01-20 11:09:49 +00:00
Robert Schulze	908fa83f72	Move some typedefs around	2023-01-20 11:08:19 +00:00
Robert Schulze	44618927f9	Inline two short methods + uppercase	2023-01-20 11:04:35 +00:00
Robert Schulze	f8b446f517	Move method implementations (no other changes)	2023-01-20 10:57:16 +00:00
Robert Schulze	5c3cc5283f	"term dictionary" --> "dictionary"	2023-01-20 10:53:41 +00:00
Robert Schulze	be936b257c	Make version enum private	2023-01-20 10:48:43 +00:00
Robert Schulze	0653f86de9	Various cosmetic cleanups	2023-01-20 10:45:35 +00:00
Maksim Kita	23e26032ca	Merge pull request #45399 from aalexfvk/alexfvk/mdb-21326_fix_system_dictionaries_when_dictionary_with_bad_structure Fix select from system.dictionaries when there is dictionary with bad structure	2023-01-20 13:36:32 +03:00
Maksim Kita	758c8f2776	Merge branch 'master' into dict/remove-preallocate	2023-01-20 13:15:37 +03:00
Maksim Kita	e6ee5554d1	Fixed tests	2023-01-20 11:15:13 +01:00
Azat Khuzhin	1f9a65b875	Modernize InternalTextLogsQueue::getPriorityName() Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-20 11:09:35 +01:00
Azat Khuzhin	fc276abadd	Fix log level "Test" for send_logs_level in client Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-20 11:09:35 +01:00
Antonio Andelic	0ad37ad286	Merge pull request #45320 from stigsb/system_tables_volume_config Add <storage_policy> config parameter for system logs	2023-01-20 10:27:57 +01:00
Aleksandr Musorin	838acb22b7	added num_processed_files and processed_files_size	2023-01-20 10:20:41 +01:00
Robert Schulze	5ec6d89d43	Merge pull request #38667 from ClibMouse/ftsearch Inverted Indices Implementation	2023-01-20 10:18:05 +01:00
SmitaRKulkarni	6aa63414db	Merge pull request #45072 from ClickHouse/43891_Disallow_concurrent_backups_and_restores Added settings to disallow concurrent backups and restores	2023-01-20 09:17:20 +01:00
Nikolai Kochetov	3e00d18498	Merge branch 'master' into fix-disabled-two-level-agg	2023-01-19 20:54:04 +00:00
Nikolay Degterinsky	dd7fef11a2	Add default granularity	2023-01-19 20:52:38 +00:00
Nikolai Kochetov	d24be2712e	Fix disabled two-level aggregation from HTTP	2023-01-19 20:50:27 +00:00
Maksim Kita	3363f7c718	Added GroupingFunctionsResolvePass	2023-01-19 19:06:02 +01:00
Maksim Kita	506f91b841	Fixed tests	2023-01-19 19:05:49 +01:00
Maksim Kita	2c56b0b2b9	Planner small fixes	2023-01-19 19:05:49 +01:00
Kseniia Sumarokova	ad4a9d2880	Update SwapHelper.h	2023-01-19 18:58:09 +01:00
kssenii	f56f515392	Fix	2023-01-19 18:45:06 +01:00
Anton Popov	089d1f5b62	fix fuzzer	2023-01-19 17:03:24 +00:00
kssenii	4ce8950712	Minor changes	2023-01-19 17:53:10 +01:00
larryluogit	52ae33dba7	Merge branch 'master' into ftsearch	2023-01-19 11:34:11 -05:00
avogar	c34c0aa22e	Fix comments	2023-01-19 16:03:46 +00:00
Han Fei	3007507a8b	Merge pull request #45428 from hanfei1991/hanfei/fix-empty-expressions fix regexp logical error in stress tests	2023-01-19 16:39:39 +01:00
Kruglov Pavel	9820beae68	Apply suggestions from code review Co-authored-by: Antonio Andelic <antonio2368@users.noreply.github.com>	2023-01-19 16:11:13 +01:00
Anton Popov	4ca359d57b	Merge pull request #45418 from CurtizJ/fix-disk-encrypted Fix reading from encrypted disk with passed file size	2023-01-19 16:11:08 +01:00
Anton Popov	7f2e37860d	allow to convert float stored in string field to integer in JSONExtract	2023-01-19 14:24:55 +00:00
Aleksei Filatov	afada0ecb3	Fix review notes	2023-01-19 17:02:57 +03:00
Alexander Tokmakov	7bb65cc002	Update StorageReplicatedMergeTree.cpp	2023-01-19 16:45:41 +03:00
Igor Nikonov	d0ce804bfc	Fix: dynamic_cast -> typeid_cast for SortingStep	2023-01-19 13:40:21 +00:00
Han Fei	94336a9b66	fix typo	2023-01-19 13:55:29 +01:00
Igor Nikonov	df3776d24b	Make test stable + disable debug logging	2023-01-19 11:43:40 +00:00
Han Fei	2884b8837b	fix regexp logical error in stress tests	2023-01-19 12:03:54 +01:00
SmitaRKulkarni	67e2bf31f5	Merge branch 'master' into 43891_Disallow_concurrent_backups_and_restores	2023-01-19 11:21:37 +01:00
Han Fei	f661dad0e9	Merge pull request #45106 from hanfei1991/hanfei/async-cache support cache for async inserts block ids	2023-01-19 10:59:25 +01:00
Ilya Yatsishin	d16b59b662	Merge pull request #45422 from Avogar/fix-s3-cluser-si	2023-01-19 10:36:54 +01:00
Ilya Yatsishin	00962b7ad5	Merge pull request #45424 from Avogar/fix-json-import-nested	2023-01-19 10:31:40 +01:00
Stig Bakken	420c179b55	Add <storage_policy> config parameter for system logs	2023-01-19 10:25:28 +01:00
SmitaRKulkarni	db03dd1bb9	Merge branch 'master' into 43891_Disallow_concurrent_backups_and_restores	2023-01-19 09:32:50 +01:00
Maksim Kita	911bb8e6ab	Merge pull request #45410 from ClickHouse/revert-45406-revert-42797-or-like-chain Resubmit Support optimize_or_like_chain in QueryTreePassManager	2023-01-19 11:30:45 +03:00
Yakov Olkhovskiy	c6ee4c3908	Merge pull request #44686 from Algunenano/fix_uuid_parsing_in_values Don't parse beyond the quotes when reading UUIDs	2023-01-18 19:30:53 -05:00
Igor Nikonov	57d2fd300a	Fix: correct update of data stream sorting properties after removing sorting	2023-01-19 00:11:58 +00:00
Yakov Olkhovskiy	1d58ded72b	fix IP parsers to treat input as not whole string	2023-01-19 00:08:20 +00:00
avogar	a8f20363f4	Fix JSON/BSONEachRow parsing with HTTP	2023-01-18 22:49:03 +00:00
avogar	117ec13c9e	Fix s3Cluster schema inference when structure from insertion table is used	2023-01-18 20:33:50 +00:00
Azat Khuzhin	4366f7fb3b	Remove PREALLOCATE for HASHED/SPARSE_HASHED dictionaries It does not give significant benefit, and hashed/sparse_hashed dictionaries can now be filled in parallel (#40003) using sharded dictionaries, which should be used instead of PREALLOCATE. Note that dictionaries that had been created with PREALLOCATE will still work, but will simply ignore this attribute. Fixes: #41985 (cc @alexey-milovidov) Reverts: #23979 (cc @kitaisreal) Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-18 20:18:37 +01:00
Igor Nikonov	1866f990de	Revert "Revert "Remove redundant sorting""	2023-01-18 20:12:34 +01:00
Anton Popov	65a71b4431	fix reading from encrypted disk	2023-01-18 19:02:20 +00:00
Dmitry Novik	fff9fd4f00	Remove redundant group by keys with constants	2023-01-18 17:44:06 +00:00
Igor Nikonov	7ed8fec94f	Revert "Remove redundant sorting"	2023-01-18 18:38:25 +01:00
Dmitry Novik	11701d0ff5	Resolve OR function after modification	2023-01-18 17:17:16 +00:00
Dmitry Novik	df26f4fc37	Revert "Revert "Support optimize_or_like_chain in QueryTreePassManager""	2023-01-18 18:14:03 +01:00
Anton Popov	5df0f91857	Revert "Support optimize_or_like_chain in QueryTreePassManager"	2023-01-18 17:34:19 +01:00
Maksim Kita	cabcc761ed	Merge pull request #45357 from kitaisreal/analyzer-compound-identifier-typo-correction-fix Analyzer compound identifier typo correction fix	2023-01-18 17:59:32 +03:00
Aleksei Filatov	5e9340f682	Add integration test	2023-01-18 17:50:38 +03:00
Aleksei Filatov	7f4a01b903	Add handling of bad dictionary structure	2023-01-18 17:27:03 +03:00
Sema Checherinda	ae1dfb9ce5	Update src/Storages/MergeTree/MergeTreeData.cpp Co-authored-by: Alexander Tokmakov <tavplubix@gmail.com>	2023-01-18 15:21:11 +01:00
Sema Checherinda	a344b526a6	Update src/Storages/StorageMergeTree.cpp Co-authored-by: Alexander Tokmakov <tavplubix@gmail.com>	2023-01-18 15:16:18 +01:00
Alexander Tokmakov	7a824af09e	fix	2023-01-18 14:30:20 +01:00
Antonio Andelic	8f8b14148a	Merge pull request #45215 from ClickHouse/fix-crash-kv-store Fix crash when prepared set with different type used in KV stores	2023-01-18 13:27:40 +01:00
Igor Nikonov	72066846cf	Merge pull request #43905 from ClickHouse/igor/remove_redundant_order_by Remove redundant sorting	2023-01-18 13:25:03 +01:00
vdimir	b76779797a	Do not move to prewhere in select with joins	2023-01-18 12:17:30 +00:00
Vitaly Baranov	7cdb2c4c7f	Merge pull request #45351 from vitlibar/fix-backup-with-killed-mutations Fix backup with killed mutations	2023-01-18 13:14:27 +01:00
Han Fei	e51123c9b0	fix data race	2023-01-18 13:11:07 +01:00
Maksim Kita	8225d2814c	Merge pull request #40003 from azat/dict-shards Add ability to load hashed dictionaries using multiple threads	2023-01-18 13:37:10 +03:00
Maksim Kita	3a550691c9	Merge pull request #42797 from ClickHouse/or-like-chain Support optimize_or_like_chain in QueryTreePassManager	2023-01-18 13:09:33 +03:00
Maksim Kita	21b94813ad	Fixed code review issues	2023-01-18 11:02:29 +01:00
Maksim Kita	cacaa2372a	Merge pull request #43261 from ClickHouse/group-by-function-elimination Support optimize_group_by_function_keys on top of QueryTree	2023-01-18 12:55:56 +03:00
Maksim Kita	21b288c620	Fixed build	2023-01-18 10:44:40 +01:00
Antonio Andelic	cfba9b19eb	Merge pull request #45360 from azat/dist/fix-startup-race Fix race in Distributed table startup	2023-01-18 10:09:54 +01:00
Antonio Andelic	f57ee043ae	Merge pull request #45319 from ClickHouse/disable-prewhere-in-merge-different-types Disable PREWHERE in storage Merge if types don't match	2023-01-18 10:02:06 +01:00
Antonio Andelic	f3469ee077	Merge branch 'master' into dist/fix-startup-race	2023-01-18 09:44:52 +01:00
Smita Kulkarni	d7ca742d98	Fixed style check for beginning of if - Added settings to disallow concurrent backups and restores	2023-01-18 08:59:47 +01:00
Dmitry Novik	3b0ac7272c	Update reference files	2023-01-18 00:30:30 +00:00
Dmitry Novik	752aed696a	Merge remote-tracking branch 'origin/master' into group-by-function-elimination	2023-01-17 23:33:33 +00:00
Sergei Trifonov	c443c1ece0	Merge branch 'master' into hanfei/async-cache	2023-01-18 00:19:49 +01:00
Robert Schulze	4f90824347	Merge remote-tracking branch 'origin/master' into query-result-cache	2023-01-17 22:49:53 +00:00
Anton Popov	f40fd7a151	Add checks for compilation of regexps (#45356)	2023-01-17 23:46:04 +01:00
Smita Kulkarni	ee526ce877	Fix style check - Added settings to disallow concurrent backups and restores	2023-01-17 22:52:55 +01:00
Smita Kulkarni	6e06af1b25	Updated strategy for handling internal backups & restores to avoid concurrent internal backups & restores - Added settings to disallow concurrent backups and restores	2023-01-17 22:27:13 +01:00
Igor Nikonov	0db9bf38a2	Merge branch 'master' into igor/remove_redundant_order_by	2023-01-17 22:26:24 +01:00
Alexander Tokmakov	1413b9537c	make error patterns more useful	2023-01-17 20:04:25 +01:00
Alexander Tokmakov	5cd90c1a3e	Merge branch 'master' into exception_message_patterns	2023-01-17 20:04:04 +01:00
Alexander Tokmakov	72e8615bec	formatting of some exception messages	2023-01-17 20:03:56 +01:00
Maksim Kita	4f7f2ed9e1	Merge pull request #45300 from ClickHouse/revert-45299-revert-44882-function-node-validation Revert "Revert "Validate function arguments in query tree""	2023-01-17 21:51:26 +03:00
Maksim Kita	273610ce65	Merge pull request #43640 from ClickHouse/42648_Support_scalar_subqueries_cache Support scalar subqueries cache	2023-01-17 21:31:13 +03:00
serxa	ce7e22b87b	add detailed profile events for throttling	2023-01-17 18:29:24 +00:00
alesapin	e732f510f0	Merge branch 'master' into fix_hang_during_drop_in_zero_copy_replication	2023-01-17 19:24:36 +01:00
Alexander Tokmakov	8b13b85ea0	Merge pull request #44543 from ClickHouse/text_log_add_pattern Add a column with a message pattern to text_log	2023-01-17 20:19:32 +03:00
Vitaly Baranov	1a680b0092	Abort multipart upload faster.	2023-01-17 18:00:11 +01:00
Vitaly Baranov	2de455367a	Fix using std::ios_base::end in StdStreamFromReadBuffer::seekg().	2023-01-17 17:56:14 +01:00
Igor Nikonov	0cfa08df7a	Merge remote-tracking branch 'origin/master' into igor/remove_redundant_order_by	2023-01-17 16:28:17 +00:00
Igor Nikonov	9855504403	Rename source file according to implementation	2023-01-17 16:24:51 +00:00
Nikita Mikhaylov	0fc755806e	One more attempt to fix race in TCPHandler (#45240)	2023-01-17 16:17:14 +01:00
alesapin	69925647eb	Fix style	2023-01-17 15:59:55 +01:00
alesapin	f6131101bb	Fix no shared id during drop for the fourth time	2023-01-17 15:51:49 +01:00
Han Fei	8a74238fe0	improve	2023-01-17 15:47:52 +01:00
Kruglov Pavel	96bb99f864	Merge branch 'master' into tsv-csv-detect-header	2023-01-17 15:33:02 +01:00
Kruglov Pavel	582aa8b770	Merge pull request #45253 from Avogar/fix-s3-heap-use-after-free Fix heap-use-after-free in reading from s3	2023-01-17 15:32:26 +01:00
HarryLeeIBM	e7add8218f	Addressed more review comments and ClangTidy errors	2023-01-17 06:29:13 -08:00
Kruglov Pavel	4183f6082f	Fix special build	2023-01-17 15:18:39 +01:00
Azat Khuzhin	54fc6859ae	Fix race in Distributed table startup Before this patch it was possible to have multiple directory monitors for the same directory, one from the INSERT context, another one on storage startup(). Here is an example of logs for this scenario: 2022.12.07 12:12:27.552485 [ 39925 ] {a47fcb32-4f44-4dbd-94fe-0070d4ea0f6b} <Debug> DDLWorker: Executed query: DETACH TABLE inc.dist_urls_in ... 2022.12.07 12:12:33.228449 [ 4408 ] {20c761d3-a46d-417b-9fcd-89a8919dd1fe} <Debug> executeQuery: (from 0.0.0.0:0, user: ) /* ddl_entry=query-0000089229 */ ATTACH TABLE inc.dist_urls_in (stage: Complete) ... this is the DirectoryMonitor created from the context of INSERT for the old StoragePtr that had not been destroyed yet (because of "was 1" this can be done only from the context of INSERT) ... 2022.12.07 12:12:35.556048 [ 39536 ] {} <Trace> inc.dist_urls_in.DirectoryMonitor: Files set to 173 (was 1) 2022.12.07 12:12:35.556078 [ 39536 ] {} <Trace> inc.dist_urls_in.DirectoryMonitor: Bytes set to 29750181 (was 71004) 2022.12.07 12:12:35.562716 [ 39536 ] {} <Trace> Connection (i13.ch:9000): Connected to ClickHouse server version 22.10.1. 2022.12.07 12:12:35.562750 [ 39536 ] {} <Debug> inc.dist_urls_in.DirectoryMonitor: Sending a batch of 10 files to i13.ch:9000 (0.00 rows, 0.00 B bytes). ... this is the DirectoryMonitor that was created during ATTACH ... 2022.12.07 12:12:35.802080 [ 39265 ] {} <Trace> inc.dist_urls_in.DirectoryMonitor: Files set to 173 (was 0) 2022.12.07 12:12:35.802107 [ 39265 ] {} <Trace> inc.dist_urls_in.DirectoryMonitor: Bytes set to 29750181 (was 0) 2022.12.07 12:12:35.834216 [ 39265 ] {} <Debug> inc.dist_urls_in.DirectoryMonitor: Sending a batch of 10 files to i13.ch:9000 (0.00 rows, 0.00 B bytes). ... 2022.12.07 12:12:38.532627 [ 39536 ] {} <Trace> inc.dist_urls_in.DirectoryMonitor: Sent a batch of 10 files (took 2976 ms). ... 2022.12.07 12:12:38.601051 [ 39265 ] {} <Error> inc.dist_urls_in.DirectoryMonitor: std::exception. 
Code: 1001, type: std::__1::__fs::filesystem::filesystem_error, e.what() = filesystem error: in file_size: No such file or directory ["/data6/clickhouse/data/inc/dist_urls_in/shard13_replica1/66827403.bin"], Stack trace (when copying this message, always include the lines below): ... 2022.12.07 12:12:54.132837 [ 4408 ] {20c761d3-a46d-417b-9fcd-89a8919dd1fe} <Debug> DDLWorker: Executed query: ATTACH TABLE inc.dist_urls_in And eventually both monitors (for a short period of time, while one replaces the other) try to process the same batch (current_batch.txt), and one of them fails because the file had already been removed. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-17 14:51:00 +01:00
Igor Nikonov	6328e02f22	Fix: update input/output stream properties After removing sorting step we need to update sorting properties of input/output streams	2023-01-17 13:39:18 +00:00
Maksim Kita	d758d83937	Analyzer compound identifier typo correction fix	2023-01-17 14:29:48 +01:00
vdimir	60acd5e424	fix clang tidy	2023-01-17 12:21:56 +00:00
vdimir	1e9ccfb4b9	wip	2023-01-17 12:21:56 +00:00
vdimir	40bf9939b7	Update JoinSwitcher::switchJoin	2023-01-17 12:21:55 +00:00
vdimir	e0e60bb460	wip	2023-01-17 12:21:55 +00:00
vdimir	4aecb836a9	Fix JoinMask	2023-01-17 12:21:55 +00:00
vdimir	18d751aed4	wip	2023-01-17 12:21:54 +00:00
vdimir	beb8ba7e62	wip	2023-01-17 12:21:54 +00:00
vdimir	57a35cae33	wip	2023-01-17 12:21:53 +00:00
vdimir	efcfcca545	Fix HashJoin::getTotalByteCount calculation	2023-01-17 12:21:53 +00:00
vdimir	b0c4e18464	Fix double initialization GraceHashJoin::initBuckets	2023-01-17 12:21:53 +00:00
Sema Checherinda	35431e91e3	Merge pull request #45276 from ucasfl/avro-fix Fix some avro reading bugs	2023-01-17 12:48:44 +01:00
Kseniia Sumarokova	5586f71950	Merge pull request #41231 from kssenii/minor-change-in-remote-read Fix assertion in async read buffer from remote	2023-01-17 12:32:57 +01:00
Maksim Kita	d6a36b1d16	Fixed code review issues	2023-01-17 12:02:50 +01:00
Maksim Kita	af716ca25d	Fixed tests	2023-01-17 11:20:24 +01:00
Maksim Kita	250c93614c	Revert "Revert "Validate function arguments in query tree""	2023-01-17 11:20:24 +01:00
Vitaly Baranov	692065e5fe	Fix backup if mutations got killed during the backup process.	2023-01-17 11:05:34 +01:00
Vitaly Baranov	0bea056241	Fix build.	2023-01-17 09:52:08 +01:00
Vitaly Baranov	1c845185c1	Split upload into parts of the same size for smooth uploading. Correctly use AbortMultipleUpload request. Support std::ios_base::end StdStreamBufFromReadBuffer::seekpos().	2023-01-17 09:35:43 +01:00
Vitaly Baranov	14a7ee8e26	Copy files to S3 during backup directly without using WriteBufferFromS3 to decrease memory consumption.	2023-01-17 09:35:41 +01:00
Vitaly Baranov	b13498d9ba	Merge pull request #45288 from vitlibar/fix-s3-requests-without-region Fix s3 requests without region	2023-01-17 09:24:59 +01:00
Antonio Andelic	76eb3e3b3c	Fix test	2023-01-17 07:34:39 +00:00
SmitaRKulkarni	bb4f251448	Merge branch 'master' into 42648_Support_scalar_subqueries_cache	2023-01-17 08:10:25 +01:00
Alexander Tokmakov	522686f78b	less empty patterns	2023-01-17 01:19:44 +01:00
Kseniia Sumarokova	6a02bdc917	Update AsynchronousReadIndirectBufferFromRemoteFS.cpp	2023-01-17 00:37:47 +01:00
Alexander Tokmakov	870cfcc36a	less fmt::runtime usages	2023-01-17 00:11:59 +01:00
Alexander Tokmakov	e7899825e6	save format strings for DB::Exceptions	2023-01-16 23:20:33 +01:00
Vitaly Baranov	9a52087989	More complex logic: GetObjectAttributes requests will be used only if the endpoint is "*.amazonaws.com", otherwise HeadObject requests will be used.	2023-01-16 20:14:39 +01:00
Dmitry Novik	104e55bc22	Merge remote-tracking branch 'origin/master' into or-like-chain	2023-01-16 18:56:22 +00:00
Dmitry Novik	aa2a19eaa4	Use proper map for QueryTreeNode	2023-01-16 18:43:22 +00:00
Dmitry Novik	0aecc9ad80	Updates after the review	2023-01-16 17:43:36 +00:00
Kruglov Pavel	e9d6590926	Merge branch 'master' into tsv-csv-detect-header	2023-01-16 17:50:24 +01:00
Kruglov Pavel	bdb3517512	Merge pull request #45231 from Avogar/json-tuples Insert default values in case of missing tuple elements in JSONEachRow	2023-01-16 17:49:50 +01:00
avogar	1c0941d72a	Add docs and examples	2023-01-16 16:46:41 +00:00
Alexander Tokmakov	df75c24f01	Revert "Disallow Gorilla codec on non-float columns"	2023-01-16 19:14:28 +03:00
avogar	1d26704049	Fix	2023-01-16 15:49:59 +00:00
Sema Checherinda	dbe89cd5d8	fix that optimize final waits for currently running merges	2023-01-16 16:47:12 +01:00
Sema Checherinda	90fa1ecd49	make that old_parts_lifetime=0 deletes files instantly at drop/truncate	2023-01-16 16:47:12 +01:00
Sema Checherinda	8f660afab3	style fix	2023-01-16 16:47:12 +01:00
Sema Checherinda	c51f4d7be1	do not merge over a gap with outdated parts, delete empty parts with respect to old_parts_lifetime	2023-01-16 16:47:11 +01:00
Sema Checherinda	25e16388d7	better message in MergeTreeDataMergerMutator when parts intersect	2023-01-16 16:47:11 +01:00
Kruglov Pavel	04d95f4877	Fix	2023-01-16 16:47:04 +01:00
avogar	3ea80b0f54	Merge branch 'master' of github.com:ClickHouse/ClickHouse into tsv-csv-detect-header	2023-01-16 15:14:25 +00:00
Antonio Andelic	108b2384e7	Disable prewhere in storage merge if types don't match	2023-01-16 13:39:46 +00:00
Anton Popov	6863cd152f	Merge pull request #42181 from CurtizJ/optimize-loading-parts Do not load inactive parts at startup	2023-01-16 14:38:50 +01:00
Kseniia Sumarokova	57c22f005b	Merge branch 'master' into minor-change-in-remote-read	2023-01-16 14:22:16 +01:00
Kseniia Sumarokova	7b612da871	Update AsynchronousReadIndirectBufferFromRemoteFS.cpp	2023-01-16 14:21:09 +01:00
Kseniia Sumarokova	4d22b49be7	Update DiskObjectStorage.cpp	2023-01-16 14:19:18 +01:00
Han Fei	30a798182a	Merge branch 'master' into hanfei/async-cache	2023-01-16 14:07:36 +01:00
Nikolay Degterinsky	70e79de69b	Merge pull request #38252 from bharatnc/ncb/weighted-quantile-approx add quantileInterpolatedWeighted function	2023-01-16 13:41:13 +01:00
Nikolay Degterinsky	88ba1b0b85	Merge pull request #42884 from evillique/better_asterisk_parser Improve Asterisk and ColumnMatcher parsers	2023-01-16 13:29:59 +01:00
Vladimir C	0337bc7c4d	Merge pull request #45147 from rgzntrade/master	2023-01-16 13:18:18 +01:00
Igor Nikonov	a34991cb65	Merge remote-tracking branch 'origin/master' into igor/remove_redundant_order_by	2023-01-16 12:14:02 +00:00
Alexander Tokmakov	ee888f7f38	Merge pull request #44547 from ClickHouse/fix_44496 Fix too aggressive evaluation of args in default column expr	2023-01-16 15:08:58 +03:00
Kseniia Sumarokova	d859976fbd	Merge pull request #45250 from ClickHouse/43188_Record_startup_time_in_profileevents Record server startup time in ProfileEvents	2023-01-16 12:20:37 +01:00
alesapin	190c9b3156	Merge pull request #44682 from hanfei1991/hanfei/support-advance-dedup deduplicate async inserts in the same block earlier	2023-01-16 12:19:30 +01:00
Maksim Kita	0c7e0be0b6	Analyzer support INSERT SELECT	2023-01-16 12:17:14 +01:00
Alexander Tokmakov	94604f71b7	Merge pull request #44922 from azat/dist/async-INSERT-metrics Optimize and fix metrics for Distributed async INSERT	2023-01-16 14:12:56 +03:00
Alexander Tokmakov	9ad6e1b129	Update logger_useful.h	2023-01-16 14:09:55 +03:00
Maksim Kita	cd2d794c99	Merge branch 'master' into 42648_Support_scalar_subqueries_cache	2023-01-16 13:49:43 +03:00
Maksim Kita	80f6a45376	Merge pull request #44641 from ClickHouse/vdimir/view_explain_2 Function viewExplain accept SELECT and settings	2023-01-16 13:39:53 +03:00
Vitaly Baranov	7030b64096	Fix build.	2023-01-16 10:46:58 +01:00
Han Fei	3481b4d50a	fix style	2023-01-16 10:41:35 +01:00
Vitaly Baranov	16a20cd06e	Use `std::string_view` instead of `const std::string_view &`	2023-01-16 10:18:04 +01:00
Maksim Kita	8f5250e000	Revert "Validate function arguments in query tree"	2023-01-16 10:14:34 +01:00
Vitaly Baranov	e435edb4ab	Make checkObjectExists() easier.	2023-01-16 10:06:20 +01:00
Maksim Kita	60d2a0bf7f	Merge pull request #44882 from ClickHouse/function-node-validation Validate function arguments in query tree	2023-01-16 11:31:02 +03:00
Robert Schulze	099e30ef2a	Merge remote-tracking branch 'origin/master' into query-result-cache	2023-01-16 08:04:49 +00:00
Robert Schulze	76d1fe08f9	Merge pull request #45252 from ClickHouse/block-nonfloat-gorilla Disallow Gorilla codec on non-float columns	2023-01-16 08:55:50 +01:00
Robert Schulze	ff493c439c	Merge pull request #45244 from bigo-sg/improve_like Add fast path for col like '%%' or col like '%' or match(col, '.*')	2023-01-16 08:36:20 +01:00
taiyang-li	2f7ea79d94	change as request	2023-01-16 10:42:58 +08:00
simpleton	1cdd7361b0	Merge branch 'ClickHouse:master' into master	2023-01-16 09:36:38 +08:00
Han Fei	5617f7f616	address comments	2023-01-15 22:51:10 +01:00
Vitaly Baranov	a955504043	Move `throw_on_error` parameter to the end.	2023-01-15 20:28:16 +01:00
Vitaly Baranov	21b8aaeb8b	Stop using HeadObject requests in S3 because they don't work well with endpoints without explicit region.	2023-01-15 20:28:11 +01:00
Han Fei	701dc88d6f	Merge branch 'master' into hanfei/support-advance-dedup	2023-01-15 19:46:28 +01:00
Han Fei	c859f8dbe5	Update src/Storages/MergeTree/ReplicatedMergeTreeSink.cpp Co-authored-by: alesapin <alesapin@gmail.com>	2023-01-15 19:46:16 +01:00
Han Fei	bb2c0914e9	Update src/Storages/MergeTree/ReplicatedMergeTreeSink.cpp Co-authored-by: alesapin <alesapin@gmail.com>	2023-01-15 19:46:09 +01:00
Robert Schulze	27fe7ebd93	Cosmetics	2023-01-15 16:12:48 +00:00
Robert Schulze	b6f12f9edd	Fix unit_tests_dbms	2023-01-15 14:49:04 +00:00
Robert Schulze	bd41c74ddf	Various test, code and docs fixups	2023-01-15 13:47:34 +00:00
Nikolay Degterinsky	24b686734d	Fix build	2023-01-15 13:46:55 +00:00
Robert Schulze	eac9a5728d	Merge remote-tracking branch 'origin/master' into block-nonfloat-gorilla	2023-01-15 13:35:41 +00:00
Ilya Yatsishin	96987b7cd8	Merge pull request #45239 from Avogar/generate-random	2023-01-15 00:37:34 +01:00
Robert Schulze	a4a6126c9d	Prohibit manual delta compression before floating-point time series compression	2023-01-14 20:09:50 +00:00
Robert Schulze	fbdaca4e2a	Code cleanup	2023-01-14 19:21:30 +00:00
flynn	29eb30b49f	Fix some reading avro format bugs fix	2023-01-14 18:05:26 +00:00
Dmitry Novik	3d23654720	Skip validation of function IN	2023-01-13 23:10:16 +00:00
Alexander Tokmakov	d857d62a03	remove another set of macros	2023-01-13 20:34:31 +01:00
Alexander Tokmakov	2d7773fccc	Merge branch 'master' into text_log_add_pattern	2023-01-13 20:33:46 +01:00
Han Fei	ed49ebf01a	update setting explain	2023-01-13 20:26:08 +01:00
Han Fei	2fb2f503e3	Update src/Storages/MergeTree/MergeTreeSettings.h Co-authored-by: Sergei Trifonov <svtrifonov@gmail.com>	2023-01-13 20:20:08 +01:00
Han Fei	bcf813fedc	Update src/Storages/StorageReplicatedMergeTree.cpp Co-authored-by: Sergei Trifonov <svtrifonov@gmail.com>	2023-01-13 20:19:30 +01:00
Han Fei	9e99c7e116	Update src/Storages/MergeTree/ReplicatedMergeTreeSink.cpp Co-authored-by: Sergei Trifonov <svtrifonov@gmail.com>	2023-01-13 20:19:13 +01:00
Han Fei	a258a39eb1	Merge branch 'master' into hanfei/async-cache	2023-01-13 20:17:58 +01:00
Nikolay Degterinsky	36c20bf293	Merge remote-tracking branch 'upstream/master' into better_asterisk_parser	2023-01-13 19:15:55 +00:00
Anton Popov	487de70d01	fix locking at loading outdated data parts	2023-01-13 17:05:32 +00:00
avogar	e2470dd670	Fix tests	2023-01-13 17:03:53 +00:00
Robert Schulze	5d3f0ec4a0	Disallow Gorilla codec on non-float columns Cf. #45195	2023-01-13 16:53:28 +00:00
avogar	6cb7c4d175	Better commit, mark noexcept	2023-01-13 16:33:11 +00:00
avogar	76c89c6d20	Fix heap-use-after-free in reading from s3	2023-01-13 16:31:30 +00:00
Smita Kulkarni	d132d30707	Addressed review comments - 42648 Support scalar subqueries cache	2023-01-13 17:28:35 +01:00
Alexander Tokmakov	6de4837580	fix	2023-01-13 16:07:20 +01:00
Maksim Kita	dc24d831cf	Merge pull request #42970 from ClickHouse/optimize-redundant-function Implement optimize_redundant_functions_in_order_by on top of QueryTree.	2023-01-13 17:36:56 +03:00
Maksim Kita	05b1b78104	Merge pull request #44013 from kitaisreal/analyzer-aggregate-functions-passes-small-fixes Analyzer aggregate functions passes small fixes	2023-01-13 17:31:53 +03:00
avogar	abfb6b096f	Better exception message	2023-01-13 14:23:30 +00:00
Robert Schulze	4ea836b87e	Revert "Revert "update function DAYOFWEEK and add new function WEEKDAY for mysql/spark compatibility"" This reverts commit `e37f572c34`.	2023-01-13 14:00:16 +00:00
Smita Kulkarni	a0fe26f506	Addressed review comments and updated name to ServerStartupMilliseconds - Record server startup time in ProfileEvents	2023-01-13 14:38:54 +01:00
Alexander Tokmakov	9d5ec474a3	Merge pull request #43998 from evillique/make_system_replicas_parallel Make `system.replicas` parallel	2023-01-13 16:33:36 +03:00
Alexander Tokmakov	b88aae9d5c	Merge branch 'master' into fix_44496	2023-01-13 14:05:57 +01:00
Smita Kulkarni	cf5cb0da97	Record server startup time in ProfileEvents Implementation: * Added ProfileEvents::ServerStartupTime. * Recorded time from start of main till listening to sockets. Testing: * Added a test 02532_profileevents_server_startup_time.sql	2023-01-13 13:47:54 +01:00
Azat Khuzhin	64e3677961	Avoid double hash calculation in HashedDictionary::getShard(StringRef) Previously it was written this way because getShard() was a simple modulo operation. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-13 13:39:26 +01:00
Azat Khuzhin	2783850f08	Minor review fixes in HashedDictionary Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-13 13:39:26 +01:00
Azat Khuzhin	6e0a7add93	Completely exception safe HashedDictionary dtor Previously there was one (even though very unlikely) case when the dtor could throw - logging code or ThreadPool::wait. Just guard the dtor with try/catch and be done with it. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-13 13:39:26 +01:00
Azat Khuzhin	74def83c5d	Destroy hashtables for hashed dictionary in parallel only for sharded dict There can be multiple hashtables, since each attribute uses its own hashtable. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-13 13:39:26 +01:00
Azat Khuzhin	1c0e0ea1e4	Disable sharded dictionaries with updatable sources Support of sharded dictionary for updatable sources is questionable since: - sharded dictionary was developed for hashed dictionary with a huge number of keys - updatable source requires storing the whole table in memory (due to how reload works) - also it is an open question whether it will have some benefits from the updatable source or not, since using updatable source with a huge number of changes in the source does not look optimal and on the other hand if there is a small number of changes then you don't need sharded dictionary at all Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-13 13:39:26 +01:00
Azat Khuzhin	c97991fce1	Use shared arena for HashedDictionary::blockToAttributes() This should decrease number of allocations. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-13 13:39:26 +01:00
Azat Khuzhin	01b100da61	Use shared arena in ParallelDictionaryLoader::createShardSelector() (and add missing rollback) This should decrease number of allocations. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-13 13:39:26 +01:00
Azat Khuzhin	64874824b4	Minor review fixes in HashedDictionary Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-13 13:39:26 +01:00
Azat Khuzhin	77c1f07636	Make HashedDictionary::~HashedDictionary exception safe Before it was possible for the destructor to throw if thread allocation fails; rewrite it to use trySchedule() and do sequential destroy in this case. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-13 13:39:26 +01:00
Azat Khuzhin	a3f189e191	Optimize sharded dictionaries with skewed distribution In case of skewed distribution simple division by modulo will not give you good distribution between shards and eventually this can lead to performance the same as non-sharded dictionary (except that it will occupy +1 thread for Block::scatter). But if HashedDictionary::blockToAttributes() does not have calls to HashedDictionary::getShard() this can be fixed by using a more complex key-to-shard (getShard()) mapping. And actually you do not need to call getShard() in blockToAttributes(), you can simply use the passed shard, and that's it. And by wrapping key with intHash64() in getShard() skewed distribution can be fixed. Note that previously I tried a similar approach but did not remove getShard() from blockToAttributes(), that's why it failed. And now it works almost as fast as with simple createBlockSelector(), just 13.6% slower (18.75min vs 16.5min, with 16 threads). Note that I've also tried to add libdivide for this, but it does not improve the performance. I've also tried the approach without scatter, and it works 20% slower than this one (22.5min VS 18.75min, with 16 threads). v2: Use intHashCRC32() over intHash64() for HashedDictionary::getShard() (with intHash64() it works much slower, almost 2x slower, there was 18min with 32 threads) Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-13 13:39:26 +01:00
Azat Khuzhin	655a564280	Parallel hash tables destroy for hashed dictionaries Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-13 13:39:26 +01:00
Azat Khuzhin	99063b152f	Allow to configure queue backlog of the parallel hashed dictionary loader v2: Decrease default parallel_queue_backlog to 10000 (same speed) v3: Rename parallel_queue_backlog to per_shard_load_backlog v3: Rename per_shard_load_backlog to shard_load_queue_backlog v4: Fix documentation Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-13 13:39:26 +01:00
Azat Khuzhin	79ad81dfdf	Implement separate queue for parallel loader of hashed dictionaries Previous patches in this series have a bottleneck in rehash(). This is the slowest operation when inserting lots of rows into the hashtable and eventually the whole thread pool sometimes works only as fast as the slowest thread since we did not have any queue of blocks. This patch adds such a queue and now it scales linearly, so initially with 1 thread I had ~4 hours for 10e9 elements (UInt64 key, UInt16 value), after this patch it works in 16 minutes with 16 threads (well actually I have to use 32 threads because of distribution of data in the source table). And now with 16 threads it works 16 times faster. Also this patch adds more optimal block splitting for the non-complex dictionaries, and usual block splitting for complex dictionaries. But anyway this moves the overhead from the loading into the hashtable threads out to the reader thread, and this is better, since reader does not use that much CPU. v2: fix use-after-free on failed load (add missing wait in dtor) Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-13 13:39:26 +01:00
Azat Khuzhin	5d0fd3cdc4	Remove sharded overhead for non-sharded hashed dictionaries By adding one more template parameter - HashedDictionary<sharded> (yes, there are already too many of them for a template class that has explicit instantiation), since perf tests [1] show a 20% slowdown. [1]: https://s3.amazonaws.com/clickhouse-test-reports/40003/8f0cf2d6b8a7df511afe901331d5e2c7b06c0b4d/performance_comparison_[1/4]/report.html Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-13 13:39:26 +01:00
Azat Khuzhin	345c422e28	Add ability to load hashed dictionaries using multiple threads Right now dictionaries (here I will talk about only HASHED/SPARSE_HASHED/COMPLEX_KEY_HASHED/COMPLEX_KEY_SPARSE_HASHED) can load data only in one thread, since they use one hash table that cannot be filled from multiple threads. And in case you have a very big dictionary (i.e. 10e9 elements), it can take a while to load them, especially for SPARSE_HASHED variants (and if you have such an amount of elements there, you likely use SPARSE_HASHED, since it requires less memory), in my env it takes ~4 hours, which is an enormous amount of time. So this patch adds support of shards for dictionaries; the number of shards determines how many hash tables the dictionary will use and, more importantly, how many threads it can use to load the data. And with 16 threads this works 2x faster, not perfect though, see the follow up patches in this series. v0: PARTITION BY v1: SHARDS 1 v2: SHARDS(1) v3: tried optimized mod - logical and, but it does not gain even 10% v4: tried squashing more (max_block_size * shards), but it does not gain even 10% either v5: move SHARDS into layout parameters (unknown simply ignored) v6: tune params for perf tests (to avoid too long queries) Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-13 13:39:25 +01:00
Vladimir C	eefbffcc5b	Merge pull request #45230 from ClickHouse/vdimir/semi_join_null_const_bug	2023-01-13 13:22:57 +01:00
Anton Popov	71188c22ee	fix race on 'relative_data_path'	2023-01-13 12:19:41 +00:00
vdimir	f881a82417	Fix viewExplain, add testcases	2023-01-13 12:19:25 +00:00
vdimir	bdb9222736	Support EXPLAIN SYNTAX oneline = 1	2023-01-13 12:18:58 +00:00
Alexander Tokmakov	51d94314d6	Merge pull request #45235 from ClickHouse/more_verbose_logs_about_replication_log_entries More verbose logs about replication log entries	2023-01-13 15:05:21 +03:00
Maksim Kita	44f4184e11	Merge pull request #44540 from kitaisreal/analyzer-support-distributed Analyzer support distributed queries processing	2023-01-13 14:45:36 +03:00
Vitaly Baranov	00908dcc6c	Fix http requests without path for AWS. (#45238)	2023-01-13 12:35:39 +01:00
Nikolai Kochetov	6e9dd2af45	Merge pull request #42889 from guowangy/logical-optimizer-lowcardinality Enable logical optimizer for LowCardinality regardless of short chain	2023-01-13 12:28:57 +01:00
vdimir	023162df1d	fix clang-tidy style	2023-01-13 11:25:07 +00:00
Robert Schulze	d7d3f61c73	Cleanup SourceFromChunks a bit	2023-01-13 10:57:31 +00:00
Robert Schulze	88df1df3e6	Fix Darwin build	2023-01-13 10:26:49 +00:00
Robert Schulze	9779d034eb	Merge pull request #45144 from ClibMouse/crc-power-fix Changes to support the CRC32 in PowerPC.	2023-01-13 11:24:18 +01:00
Maksim Kita	296dc5006d	Fixed tests	2023-01-13 10:59:26 +01:00
simpleton	45842da72e	Merge branch 'master' into master	2023-01-13 17:42:36 +08:00
Alexander Gololobov	d850225f6b	Merge pull request #45229 from CurtizJ/fix-rare-logical-error Fix rare logical error: `Too large alignment`	2023-01-13 09:48:28 +01:00
Antonio Andelic	99548c8c15	Merge branch 'master' into fix-crash-kv-store	2023-01-13 08:42:08 +00:00
taiyang-li	de5474c9f9	optimize match(a, '.*')	2023-01-13 14:55:54 +08:00
taiyang-li	45df745011	add fast path for like '%%' or like '%'	2023-01-13 12:20:03 +08:00
Robert Schulze	15e11741cb	Cosmetics	2023-01-13 00:00:23 +00:00
Ilya Yatsishin	ba05646dff	Merge pull request #45222 from ClickHouse/fuzz_prewhere Fuzz PREWHERE clause	2023-01-13 00:45:21 +01:00


36482 Commits