Commit Graph

26965 Commits

Author SHA1 Message Date
Han Fei
42fca8d5f0 address comments 2022-05-31 00:17:32 +08:00
Han Fei
a464b10afe
Update src/Storages/System/StorageSystemZooKeeper.cpp
Co-authored-by: Alexander Tokmakov <tavplubix@gmail.com>
2022-05-30 23:52:37 +08:00
Han Fei
194445646a
Update src/Storages/System/StorageSystemZooKeeper.cpp
Co-authored-by: Alexander Tokmakov <tavplubix@gmail.com>
2022-05-30 23:47:43 +08:00
vdimir
8a3f4bda62
Fix columns number mismatch in cross join 2022-05-30 15:40:15 +00:00
Kseniia Sumarokova
b1ba7b7027
Fix build 2022-05-30 17:30:59 +02:00
Kseniia Sumarokova
1869adfd7d
Update FileCache.cpp 2022-05-30 16:06:58 +02:00
Nikolai Kochetov
c71256ea38 Remove some commented code. 2022-05-30 13:18:20 +00:00
Nikolai Kochetov
5ef51ed27b Fix more tests. 2022-05-30 13:10:30 +00:00
Amos Bird
b68e8efaf0
Fix joinGet with cannot be null type 2022-05-30 21:01:27 +08:00
Alexander Tokmakov
351956d108
Merge pull request #37640 from azat/transaction-fix
Fix excessive LIST requests to coordinator for transactions
2022-05-30 15:46:52 +03:00
Kruglov Pavel
0615866aea
Merge pull request #37450 from Avogar/check-format-on-storage-creation
Check format name on storage creation
2022-05-30 14:23:20 +02:00
avogar
139a7e19a9 Fix comments 2022-05-30 11:43:29 +00:00
alesapin
87baabb1a8 Followup fix 2022-05-30 13:34:42 +02:00
alesapin
362fa745e6 Ignore broken metadata 2022-05-30 13:32:12 +02:00
Nikolai Kochetov
5b4658aa5e Merge branch 'master' into refactor-read-metrics-and-callbacks 2022-05-30 09:47:35 +00:00
Amos Bird
6525bfc4cd
Avoid context copy for InterpreterSelects 2022-05-30 17:08:12 +08:00
Azat Khuzhin
d86181d3cd keeper: store only unique session IDs for watches
This should speed up keeper, especially under incorrect usage (like
the case that was fixed in #37640), and especially in non-release
builds.

And also this should fix SIGKILL in stress tests.

Details for one such SIGKILL are in the `<details>` tag below [1]:

<details>

    $ pigz -cd clickhouse-server.stress.log.gz | tail
    2022.05.27 16:17:24.882971 [ 637 ] {} <Trace> BackgroundSchedulePool/BgSchPool: Waiting for threads to finish.
    2022.05.27 16:17:24.896749 [ 637 ] {} <Debug> MemoryTracker: Peak memory usage (for query): 4.09 MiB.
    2022.05.27 16:17:24.907163 [ 637 ] {} <Debug> Application: Shut down storages.
    2022.05.27 16:17:24.907233 [ 637 ] {} <Debug> Application: Waiting for current connections to servers for tables to finish.
    2022.05.27 16:17:24.934335 [ 637 ] {} <Information> Application: Closed all listening sockets. Waiting for 1 outstanding connections.
    2022.05.27 16:17:29.843491 [ 637 ] {} <Information> Application: Closed connections to servers for tables. But 1 remain. Probably some tables of other users cannot finish their connections after context shutdown.
    2022.05.27 16:17:29.843632 [ 637 ] {} <Debug> KeeperDispatcher: Shutting down storage dispatcher
    2022.05.27 16:17:34.612616 [ 688 ] {} <Test> virtual Coordination::ZooKeeperRequest::~ZooKeeperRequest(): Processing of request xid=2147483647 took 10000 ms
    2022.05.27 16:17:54.612109 [ 3176 ] {} <Debug> KeeperTCPHandler: Session #12 expired
    2022.05.27 16:19:59.823038 [ 635 ] {} <Fatal> Application: Child process was terminated by signal 9 (KILL). If it is not done by 'forcestop' command or manually, the possible cause is OOM Killer (see 'dmesg' and look at the '/var/log/kern.log' for the details).

    Thread 26 (Thread 0x7f1c7703f700 (LWP 708)):
    0  0x000000000b074b2a in __tsan::MemoryAccessImpl(__tsan::ThreadState*, unsigned long, int, bool, bool, unsigned long long*, __tsan::Shadow) ()
    1  0x000000000b08630c in __tsan::MemoryAccessRange(__tsan::ThreadState*, unsigned long, unsigned long, unsigned long, bool) ()
    2  0x000000000b01ff03 in memmove ()
    3  0x000000001bbc8996 in std::__1::__move<long, long> (__first=0xb8600000d83304, __last=<optimized out>, __result=0x7f1c021cd000) at ../contrib/libcxx/include/__algorithm/move.h:57
    4  std::__1::move<long*, long*> (__first=0xb8600000d83304, __last=<optimized out>, __result=0x7f1c021cd000) at ../contrib/libcxx/include/__algorithm/move.h:70
    5  std::__1::vector<long, std::__1::allocator<long> >::erase (this=0x7b1400584c48, __position=...) at ../contrib/libcxx/include/vector:1608
    6  DB::KeeperStorage::clearDeadWatches (this=0x7b5800001ad8, this@entry=0x7b5800001800, session_id=session_id@entry=12) at ../src/Coordination/KeeperStorage.cpp:1228
    7  0x000000001bbc5c55 in DB::KeeperStorage::processRequest (this=0x7b5800001800, zk_request=..., session_id=12, time=1, new_last_zxid=..., check_acl=true) at ../src/Coordination/KeeperStorage.cpp:1122
    8  0x000000001bba06a3 in DB::KeeperStateMachine::commit (this=<optimized out>, log_idx=3549, data=...) at ../src/Coordination/KeeperStateMachine.cpp:143
    9  0x000000001bba6193 in nuraft::state_machine::commit_ext (this=0x7b4c00001f98, params=...) at ../contrib/NuRaft/include/libnuraft/state_machine.hxx:75
    10 0x00000000202c5a55 in nuraft::raft_server::commit_app_log (this=this@entry=0x7b6c00002a18, idx_to_commit=idx_to_commit@entry=3549, le=..., need_to_handle_commit_elem=true, initial_commit_exec=false) at ../contrib/NuRaft/src/handle_commit.cxx:311
    11 0x00000000202c4f98 in nuraft::raft_server::commit_in_bg_exec (this=<optimized out>, this@entry=0x7b6c00002a18, timeout_ms=timeout_ms@entry=0, initial_commit_exec=false) at ../contrib/NuRaft/src/handle_commit.cxx:241
    12 0x00000000202c4613 in nuraft::raft_server::commit_in_bg (this=this@entry=0x7b6c00002a18) at ../contrib/NuRaft/src/handle_commit.cxx:149
    ...
    Thread 28 (Thread 0x7f1c7603d700 (LWP 710)):
    0  0x00007f1d22a6d110 in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
    1  0x00007f1d22a650a3 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
    2  0x000000000b0337b0 in pthread_mutex_lock ()
    3  0x00000000221884da in std::__1::__libcpp_mutex_lock (__m=0x7b4c00002088) at ../contrib/libcxx/include/__threading_support:303
    4  std::__1::mutex::lock (this=0x7b4c00002088) at ../contrib/libcxx/src/mutex.cpp:33
    5  0x000000001bba4188 in std::__1::lock_guard<std::__1::mutex>::lock_guard (__m=..., this=<optimized out>) at ../contrib/libcxx/include/__mutex_base:91
    6  DB::KeeperStateMachine::getDeadSessions (this=0x7b4c00001f98) at ../src/Coordination/KeeperStateMachine.cpp:360
    7  0x000000001bb79b4b in DB::KeeperServer::getDeadSessions (this=0x7b4400012700) at ../src/Coordination/KeeperServer.cpp:572
    8  0x000000001bb64d1a in DB::KeeperDispatcher::sessionCleanerTask (this=<optimized out>, this@entry=0x7b640001c218) at ../src/Coordination/KeeperDispatcher.cpp:399
    ...
    Thread 1 (Thread 0x7f1d227148c0 (LWP 637)):
    0  0x00007f1d22a69376 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
    1  0x000000000b0895e0 in __tsan::call_pthread_cancel_with_cleanup(int (*)(void*), void (*)(void*), void*) ()
    2  0x000000000b017091 in pthread_cond_wait ()
    3  0x0000000020569d98 in Poco::EventImpl::waitImpl (this=0x7b2000008798) at ../contrib/poco/Foundation/src/Event_POSIX.cpp:106
    4  0x000000001bb636cf in Poco::Event::wait (this=0x7b2000008798) at ../contrib/poco/Foundation/include/Poco/Event.h:97
    5  ThreadFromGlobalPool::join (this=<optimized out>) at ../src/Common/ThreadPool.h:217
    6  DB::KeeperDispatcher::shutdown (this=0x7b640001c218) at ../src/Coordination/KeeperDispatcher.cpp:322
    7  0x0000000019ca8bfc in DB::Context::shutdownKeeperDispatcher (this=<optimized out>) at ../src/Interpreters/Context.cpp:2111
    8  0x000000000b0a979b in DB::Server::main(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&)::$_9::operator()() const (this=0x7ffcde44f0a0) at ../programs/server/Server.cpp:1407

</details>

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/37593/2613149f6bf4f242bbbf2c3c8539b5176fd77286/stress_test__thread__actions_.html

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-05-30 11:50:48 +03:00
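The stack trace in the commit above shows `std::vector<long>::erase` under `DB::KeeperStorage::clearDeadWatches`, which suggests duplicated session IDs made the per-path watch lists large and slow to clean up. The following is a minimal sketch of the "unique session IDs" idea, not the actual KeeperStorage code: the class `WatchRegistry`, its members, and the choice of `std::unordered_set` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical, simplified watch registry (illustration only).
class WatchRegistry
{
public:
    using SessionID = int64_t;

    // Registering the same (path, session) pair again is a no-op, so
    // repeated watch requests from one session do not grow the container.
    void addWatch(const std::string & path, SessionID session_id)
    {
        watches[path].insert(session_id);
        sessions_and_watchers[session_id].insert(path);
    }

    // Remove every watch owned by a dead session. Erasing from an
    // unordered_set is O(1) on average, while std::vector::erase has to
    // shift all elements that follow the erased one.
    void clearDeadWatches(SessionID session_id)
    {
        auto it = sessions_and_watchers.find(session_id);
        if (it == sessions_and_watchers.end())
            return;

        for (const auto & path : it->second)
        {
            auto watch_it = watches.find(path);
            if (watch_it == watches.end())
                continue;
            watch_it->second.erase(session_id);
            if (watch_it->second.empty())
                watches.erase(watch_it);
        }
        sessions_and_watchers.erase(it);
    }

    // Number of distinct sessions watching the given path.
    size_t watchCount(const std::string & path) const
    {
        auto it = watches.find(path);
        return it == watches.end() ? 0 : it->second.size();
    }

private:
    std::unordered_map<std::string, std::unordered_set<SessionID>> watches;
    std::unordered_map<SessionID, std::unordered_set<std::string>> sessions_and_watchers;
};
```

With this layout, a session that registers the same watch many times (as in the excessive List requests fixed in #37640) still occupies one entry per path, so cleanup cost depends on the number of distinct watches rather than the number of requests.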
Azat Khuzhin
78bd47d8df Fix excessive LIST requests to coordinator for transactions
In [1] there were only a few transactions, but lots of List requests for /test/clickhouse/txn/log:

    $ clickhouse-local --format TSVWithNamesAndTypes --file zookeeper_log.tsv.gz -q "select * except('path|session_id|event_time|thread_id|event_date|xid') apply(x->groupUniqArray(x)), path, session_id, min(event_time), max(event_time), count() from table where has_watch and type = 'Request' group by path, session_id order by count() desc limit 1 format Vertical"
    Row 1:
    ──────
    groupUniqArray(type):             ['Request']
    groupUniqArray(query_id):         ['','62d75128-9031-48a5-87ba-aec3f0b591c6']
    groupUniqArray(address):          ['::1']
    groupUniqArray(port):             [9181]
    groupUniqArray(has_watch):        [1]
    groupUniqArray(op_num):           ['List']
    groupUniqArray(data):             ['']
    groupUniqArray(is_ephemeral):     [0]
    groupUniqArray(is_sequential):    [0]
    groupUniqArray(version):          []
    groupUniqArray(requests_size):    [0]
    groupUniqArray(request_idx):      [0]
    groupUniqArray(error):            []
    groupUniqArray(watch_type):       []
    groupUniqArray(watch_state):      []
    groupUniqArray(stat_version):     [0]
    groupUniqArray(stat_cversion):    [0]
    groupUniqArray(stat_dataLength):  [0]
    groupUniqArray(stat_numChildren): [0]
    groupUniqArray(children):         [[]]
    path:                             /test/clickhouse/txn/log
    session_id:                       1
    min(event_time):                  2022-05-27 12:54:09.025897
    max(event_time):                  2022-05-27 13:37:12.846314 <-- last transaction was at 12:54, see server log
    count():                          3673675 <-- huge

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/37593/2613149f6bf4f242bbbf2c3c8539b5176fd77286/stateless_tests__debug__actions__[1/3].html

Server log:

    $ pigz -cd clickhouse-server.log.gz | fgrep TransactionLog: | head -1
    2022.05.27 12:54:09.026852 [ 5018 ] {62d75128-9031-48a5-87ba-aec3f0b591c6} <Trace> TransactionLog: Loading 33 entries from /test/clickhouse/txn/log: csn-0000000000..csn-0000000032
    $ pigz -cd clickhouse-server.log.gz | fgrep TransactionLog: | tail -1
    2022.05.27 12:54:58.909222 [ 509 ] {} <Test> TransactionLog: Closing readonly transaction (177, 38, 41b51ff1-bcba-43bf-bcea-e97ad05f6040)
    $ pigz -cd clickhouse-server.log.gz | fgrep 62d75128-9031-48a5-87ba-aec3f0b591c6 | tail -1
    2022.05.27 12:54:09.064857 [ 5018 ] {62d75128-9031-48a5-87ba-aec3f0b591c6} <Debug> MemoryTracker: Peak memory usage (for query): 0.00 B.

Fixes: #37398 (cc @tavplubix)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-05-30 11:39:34 +03:00
flynn
9ffa5f5e0d fix typo 2022-05-30 08:10:16 +00:00
Alexey Milovidov
1d6f9de001 Merge branch 'llvm-14' of github.com:ClickHouse/ClickHouse into llvm-14 2022-05-30 05:36:43 +02:00
Alexey Milovidov
f1fb57c6ce Fix clang-tidy-14 2022-05-30 05:36:26 +02:00
Alexey Milovidov
9a5dd75a68 Merge branch 'master' into llvm-14 2022-05-30 05:34:38 +02:00
Alexey Milovidov
c0e6ff4216 More precise result of "dumpColumnStructure" and "byteSize" miscellaneous functions 2022-05-30 04:56:54 +02:00
taiyang-li
0b0d38b18c Merge branch 'master' into fix_settings 2022-05-30 10:07:09 +08:00
alesapin
6f35c28592 Fix style 2022-05-30 00:29:30 +02:00
alesapin
c32b6076fb Remove stranges from code 2022-05-30 00:12:33 +02:00
Alexey Milovidov
5fda199dcf
Update src/Common/ErrorCodes.cpp
Co-authored-by: Alexander Tokmakov <tavplubix@clickhouse.com>
2022-05-30 00:20:49 +03:00
alesapin
0e8ab36913 Merge branch 'master' into turn_on_s3_tests 2022-05-29 14:37:10 +02:00
alesapin
d2cdbf3956 Fix refactoring issue 2022-05-29 14:09:49 +02:00
alesapin
9dc81e5cc8
Merge pull request #37598 from ClickHouse/revert-37545-revert-37424-fix_fetching_part_deadlock
Revert "Revert "(only with zero-copy replication, non-production experimental feature not recommended to use) fix possible deadlock during fetching part""
2022-05-29 13:54:34 +02:00
Alexander Tokmakov
579b0e3323
Merge pull request #37627 from ClickHouse/revert-37416-fix_ReplicatedMergeTree_comments
Revert "Implemented changing comment to a ReplicatedMergeTree table"
2022-05-29 09:57:12 +03:00
Alexey Milovidov
9e3242f186
Merge pull request #37617 from CurtizJ/aggregation-sparse-columns
Better performance with sparse columns in aggregate functions
2022-05-29 09:36:07 +03:00
Alexander Tokmakov
562eec591e
Revert "Implemented changing comment to a ReplicatedMergeTree table" 2022-05-29 09:28:47 +03:00
Alexey Milovidov
97606c324c
Merge pull request #37574 from azat/mt-tiny-refactor
Remove unused MergeTreeDataMergerMutator::chooseMergeAlgorithm()
2022-05-29 07:59:57 +03:00
Alexey Milovidov
c1169019d2 Merge branch 'master' into llvm-14 2022-05-29 02:29:02 +02:00
Alexey Milovidov
11788c8129 Fix clang-tidy-14 2022-05-29 02:28:46 +02:00
Alexey Milovidov
73e2e63414
Merge pull request #37612 from ClickHouse/clang-tidy-14
Fix clang-tidy-14, part 1
2022-05-29 03:16:32 +03:00
Alexey Milovidov
4e60c88a27
Merge pull request #37609 from DevTeamBK/Medium-Clang-Tidy-Fix
Fix Clang-Tidy:  remove std::move() from trivially-copyable object
2022-05-28 21:27:08 +03:00
Anton Popov
c39d95e2e6 add perf test 2022-05-28 12:56:38 +00:00
Anton Popov
1d9b3be7da
Merge pull request #37536 from CurtizJ/profile-events-for-part-types
Add profile events for introspection of part types
2022-05-28 14:25:21 +02:00
Kseniia Sumarokova
8be351717f
Merge pull request #37606 from ClickHouse/kssenii-patch-3
Update FileCache.cpp
2022-05-28 12:48:45 +02:00
Han Fei
340a264a62 fix style 2022-05-28 18:26:14 +08:00
Han Fei
0a0d77bdef fix build 2022-05-28 17:57:59 +08:00
Han Fei
0f71231574 try fix flaky tests and refine code style 2022-05-28 17:25:33 +08:00
Vxider
b24346328d fix parser when using table identifier 2022-05-28 08:22:34 +00:00
Anton Popov
b2cff26ecf better performance with sparse columns in aggregate functions 2022-05-28 02:22:20 +00:00
Alexey Milovidov
eff285e24a
Update ThreadFuzzer.cpp 2022-05-28 05:13:16 +03:00
Alexander Gololobov
6a57e1a970
Merge pull request #37601 from ClickHouse/array_norm_dist_fixes
Added LpNorm and LpDistance functions for arrays
2022-05-27 23:40:38 +02:00
Alexey Milovidov
d6597efc08 Fix clang-tidy-14, part 1 2022-05-27 23:03:30 +02:00
Alexey Milovidov
be07c4c4b0 Fix clang-tidy-14, part 1 2022-05-27 23:03:16 +02:00
Alexey Milovidov
d62c57be3f Fix clang-tidy-14, part 1 2022-05-27 23:02:25 +02:00
Alexey Milovidov
8e9d771237 Fix clang-tidy-14, part 1 2022-05-27 23:02:05 +02:00
Alexey Milovidov
6c2699a991 Fix clang-tidy-14, part 1 2022-05-27 23:00:45 +02:00
Alexey Milovidov
3086c19341 Fix clang-tidy-14, part 1 2022-05-27 23:00:23 +02:00
Alexey Milovidov
c50791dd3b Fix clang-tidy-14, part 1 2022-05-27 22:52:14 +02:00
Alexey Milovidov
d2c6fd90cb Fix clang-tidy-14, part 1 2022-05-27 22:51:37 +02:00
Kseniia Sumarokova
10c9716467
Fix clang-tidy 2022-05-27 22:48:07 +02:00
Nikolai Kochetov
b80b1940ce Fix some tests. 2022-05-27 20:47:35 +00:00
HeenaBansal2009
a061acadbe Remove std::move from trivially-copyable object 2022-05-27 11:04:29 -07:00
Yakov Olkhovskiy
41ef0044f0 endpoint is added 2022-05-27 13:43:34 -04:00
mergify[bot]
f5ee337bab
Merge branch 'master' into revert-37545-revert-37424-fix_fetching_part_deadlock 2022-05-27 16:52:00 +00:00
mergify[bot]
923ad2e905
Merge branch 'master' into turn_on_s3_tests 2022-05-27 16:31:43 +00:00
Dmitry Novik
60b9d81773 Remove global_memory_usage_overcommit_max_wait_microseconds 2022-05-27 16:30:29 +00:00
alesapin
f63fa9bcc6
Merge pull request #37416 from Enmk/fix_ReplicatedMergeTree_comments
Implemented changing comment to a ReplicatedMergeTree table
2022-05-27 18:29:34 +02:00
Alexander Gololobov
9b1b30855c Fixed check for HUGE_VAL 2022-05-27 18:25:11 +02:00
Alexander Gololobov
6361c5f38c Fix for failed style check 2022-05-27 18:22:16 +02:00
Kseniia Sumarokova
8099361cbc
Update FileCache.cpp 2022-05-27 17:48:14 +02:00
Alexander Gololobov
540353566c Added LpNorm and LpDistance functions for arrays 2022-05-27 17:17:08 +02:00
Azat Khuzhin
8a224239c1 Prohibit optimize_aggregation_in_order with GROUPING SETS
AggregatingStep ignores it anyway, and it leads to the following error
in getSortDescriptionFromGroupBy(), like in [1]:

    2022.05.24 04:29:29.279431 [ 3395 ] {26543564-8bc8-4a3a-b984-70a2adf0245d} <Fatal> : Logical error: 'Trying to get name of not a column: ExpressionList'.

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/36914/67d3ac72d26ab74d69f03c03422349d4faae9e19/stateless_tests__ubsan__actions_.html

v2: revert change to getSortDescriptionFromGroupBy() after
    GroupingSetsRewriterVisitor had been introduced
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-05-27 17:44:57 +03:00
Azat Khuzhin
1f29b0a901 Rewrite queries GROUPING SETS (foo, bar) to GROUP BY foo, bar
This is better than introducing a separate
SelectQueryExpressionAnalyzer::useGroupingSetKey(), since for
optimize_aggregation_in_order that method will not be enough, because
the size of ManyExpressionActions will not match the size of
SortDescription in ReadInOrderOptimizer::ReadInOrderOptimizer().

Plus, it is cleaner.

v2: fix clang-tidy
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-05-27 17:44:51 +03:00
alesapin
5a296aec01 Fix build 2022-05-27 16:34:16 +02:00
alesapin
be1c3c132b Fix some trash 2022-05-27 16:08:49 +02:00
Anton Popov
abc90fad8d fix WITH FILL with negative intervals 2022-05-27 12:42:51 +00:00
alesapin
6d6779f17a
Merge pull request #37139 from ClickHouse/i_object_storage
Separate object storage operations from disks
2022-05-27 13:59:50 +02:00
alesapin
c79600c4c8 Fix build 2022-05-27 13:44:29 +02:00
alesapin
841858ec30
Revert "Revert "(only with zero-copy replication, non-production experimental feature not recommended to use) fix possible deadlock during fetching part"" 2022-05-27 13:13:36 +02:00
Azat Khuzhin
2613149f6b Fix converting types for UNION queries (may produce LOGICAL_ERROR)
CI found [1]:

    2022.02.20 15:14:23.969247 [ 492 ] {} <Fatal> BaseDaemon: (version 22.3.1.1, build id: 6082C357CFA6FF99) (from thread 472) (query_id: a5187ff9-962a-4e7c-86f6-8d48850a47d6) (query: SELECT 0., round(avgWeighted(x, y)) FROM (SELECT toDate(toDate('214748364.8', '-922337203.6854775808', '-0.1', NULL) - NULL, 10.000100135803223, '-2147483647'), 255 AS x, -2147483647 AS y UNION ALL SELECT y, NULL AS x, 2147483646 AS y)) Received signal Aborted (6)

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/0/26d0e5438c86e52a145aaaf4cb523c399989a878/fuzzer_astfuzzerdebug,actions//report.html

The problem is that the subqueries return different headers:
- first query  -- x, y
- second query -- y, x

v2: Make order of columns strict only for UNION
    https://s3.amazonaws.com/clickhouse-test-reports/34775/9cc8c01a463d18c471853568b2f0af659a4e643f/stateless_tests__address__actions__[2/2].html
    Fixes: 00597_push_down_predicate_long
v3: add no-backward-compatibility-check for the test
Fixes: #37569
Resubmit: #34775
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
(cherry picked from commit a813f5996e)
2022-05-27 14:11:57 +03:00
Han Fei
2ea027ffcb Support insert into system.zookeeper 2022-05-27 18:53:12 +08:00
Kseniia Sumarokova
141334448e
Merge pull request #37566 from kssenii/fix-assertion
Fix failed assertion in cache
2022-05-27 11:59:03 +02:00
Kseniia Sumarokova
2943d44bf1
Merge pull request #37554 from msaf1980/cleanup_hdfs
Cleanup StorageHDFS (unused variables prevent build with clang 12)
2022-05-27 11:57:18 +02:00
Kseniia Sumarokova
f5d69506b4
Merge pull request #37516 from KinderRiven/improve_local_cache
Control cache downloads to avoid negative optimization of local caches
2022-05-27 11:53:17 +02:00
zhanglistar
ca67e67a74 Fix a typo 2022-05-27 15:52:04 +08:00
Robert Schulze
80061aa3e2
Merge remote-tracking branch 'origin/master' into cached_patterns 2022-05-27 09:21:01 +02:00
Vxider
54d6f98122 flush and shutdown temporary table before drop 2022-05-27 04:50:36 +00:00
Dmitry Novik
3a9239b79f
Revert "RFC: Fix converting types for UNION queries (may produce LOGICAL_ERROR)" 2022-05-27 04:05:32 +02:00
Yakov Olkhovskiy
25884c68f1 http named collection source implemented for dictionary 2022-05-26 20:46:26 -04:00
Alexey Milovidov
8ba865bb60
Merge pull request #37344 from excitoon-favorites/fixs3colonandequalssign
Fixed error with symbols in key name in S3
2022-05-27 00:58:35 +03:00
Alexey Milovidov
86afa3a245
Merge pull request #37502 from ClickHouse/array_norm_dist_fixes
Renamed arrayXXNorm/arrayXXDistance functions to XXNorm/XXDistance and fixed some overflow cases
2022-05-27 00:56:29 +03:00
Alexander Gololobov
e655863d53
Merge pull request #37528 from kitaisreal/normalize-utf8-performance-tests-fix
Functions normalizeUTF8 unstable performance tests fix
2022-05-26 20:49:10 +02:00
Azat Khuzhin
c6c60364ae Remove unused MergeTreeDataMergerMutator::chooseMergeAlgorithm()
In favor of
MergeTask::ExecuteAndFinalizeHorizontalPart::chooseMergeAlgorithm()

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-05-26 21:20:50 +03:00
Alexander Tokmakov
eb71dd4c78
Merge pull request #37547 from ClickHouse/followup_37398
Follow-up to #37398
2022-05-26 20:29:41 +03:00
Nikolai Kochetov
ec4e8d71b2 Fixing build 2022-05-26 15:33:21 +00:00
Dmitry Novik
673a521d0b
Merge pull request #34775 from azat/union-type-cast
RFC: Fix converting types for UNION queries (may produce LOGICAL_ERROR)
2022-05-26 17:28:23 +02:00
kssenii
36af6b1fa8 Fix assertion 2022-05-26 16:15:02 +02:00
alesapin
c862f89b8d Fix tidy 2022-05-26 15:43:21 +02:00
Alexander Tokmakov
e8f33fb0d9 fix flaky tests 2022-05-26 14:17:05 +02:00
Azat Khuzhin
dc9ca3d70c
Fix LOGICAL_ERROR in getMaxSourcePartsSizeForMerge during merges (#37413) 2022-05-26 14:14:58 +02:00
Nikolai Kochetov
bf95541531 Fixing style. 2022-05-26 11:09:36 +00:00
Nikolai Kochetov
84f97b53de Merge branch 'master' into refactor-read-metrics-and-callbacks 2022-05-26 11:07:45 +00:00
Nikolai Kochetov
fea2401f1f
Merge pull request #37532 from ClickHouse/add-separate-mutex-for-factories-info
Use a separate mutex for query_factories_info in Context.
2022-05-26 13:03:28 +02:00
mergify[bot]
a7629f900f
Merge branch 'master' into normalize-utf8-performance-tests-fix 2022-05-26 10:29:55 +00:00
Maksim Kita
3a92e61827
Merge pull request #37148 from kitaisreal/dictionary-get-descendants-performance-improvement
Dictionary getDescendants performance improvement
2022-05-26 12:29:17 +02:00
KinderRiven
824628c0da fix style 2022-05-26 16:51:16 +08:00
KinderRiven
822ecd982f better & support clean stash 2022-05-26 16:36:05 +08:00
Vasily Nemkov
abe6b5d013
Reverted unnecessary modification 2022-05-26 10:09:27 +03:00
Antonio Andelic
fe236c98d5
Merge pull request #37534 from ClickHouse/revert-37036-keeper-preprocess-operations
Revert "Add support for preprocessing ZooKeeper operations in `clickhouse-keeper`"
2022-05-26 08:14:46 +02:00
Yakov Olkhovskiy
2dc160a4c3 style fix 2022-05-25 20:56:36 -04:00
Dmitry Novik
5c3c994d2a
Merge pull request #37493 from ClickHouse/grouping-sets-optimization-fix
Fix ORDER BY optimization in case of GROUPING SETS
2022-05-26 02:25:02 +02:00
Anton Popov
f488efd27e fix tests 2022-05-26 00:03:31 +00:00
Dmitry Novik
16c6b60703 Introduce AggregationKeysInfo 2022-05-25 23:22:29 +00:00
alesapin
8f1aac0ce4 Fix merge with master 2022-05-26 00:44:45 +02:00
Alexey Milovidov
f321925032
Merge pull request #36341 from ClickHouse/allow-setuid-inside-clickhouse
Allow to drop privileges at startup
2022-05-26 01:07:04 +03:00
Dmitry Novik
7cd7782e4f Process columns more efficiently in GROUPING() 2022-05-25 21:55:41 +00:00
Dmitry Novik
3c1b6609ae Add comments and make tests more verbose 2022-05-25 21:23:35 +00:00
mergify[bot]
49cce189e3
Merge branch 'master' into floating_seconds 2022-05-25 21:08:55 +00:00
alesapin
1db9cf480b Merge remote-tracking branch 'origin/master' into i_object_storage 2022-05-25 22:50:22 +02:00
Maksim Kita
58cd1bd3ec
Merge pull request #36843 from bharatnc/ncb/h3-unidirectionaledges-funcs
add h3 unidirectional edge functions
2022-05-25 22:46:40 +02:00
Maksim Kita
bee3c30f66
Merge pull request #37524 from kitaisreal/geo-distance-functions-improve-performance
Geo distance functions improve performance
2022-05-25 22:40:40 +02:00
Maksim Kita
b12b363158 Fixed build of hierarchical index for HashedArrayDictionary 2022-05-25 22:40:19 +02:00
Alexander Gololobov
168b47d0ad Use same norm and distance function names for tuples and arrays 2022-05-25 22:39:59 +02:00
Alexander Gololobov
b065839f44 always return Float64 2022-05-25 22:27:00 +02:00
Alexander Gololobov
5df14cd956 Cast arguments to result type to avoid int overflow 2022-05-25 22:27:00 +02:00
Alexander Tokmakov
47820c216d
Revert "(only with zero-copy replication, non-production experimental feature not recommended to use) fix possible deadlock during fetching part" 2022-05-25 23:10:33 +03:00
Robert Schulze
49934a3dc8
Cache compiled regexps when evaluating non-const needles
Needles in a (non-const) needle column may repeat, and this commit allows
skipping compilation for known needles. Out of the different design
alternatives (see below, if someone is interested), we now maintain
- one global pattern cache,
- with a fixed size of 42k elements currently,
- and use LRU as eviction strategy.

------------------------------------------------------------------------

(sorry for the wall of text, dumping it here not for reading but just
for reference)

Write-up about considered design alternatives:

1. Keep the current global cache of const needles. For non-const
   needles, probe the cache but don't store values in it.
   Pros: need to maintain just a single cache, no problem with cache
         pollution assuming there are few distinct constant needles
   Cons: only useful if a non-const needle already occurred as a
         const needle
   --> overall too simplistic

2. Keep the current global cache for const needles. For non-const
   needles, create a local (e.g. per-query) cache
   Pros: unlike (1.), non-const needles can be skipped even if they
         did not occur yet, no pollution of the const pattern cache when
         there are very many non-const needles (e.g. large / highly
         distinct needle columns).
   Cons: caches may explode "horizontally", i.e. we'll end up with the
         const cache + caches for Q1, Q2, ... QN, this makes it harder
         to control the overall space consumption, also patterns
         residing in different caches cannot be reused between queries,
         another difficulty is that the concept of "query" does not
         really exist at matching level - there are only column chunks
         and we'd potentially end up with 1 cache / chunk

3. Queries with const and non-const needles insert into the same global
   cache.
   Pros: the advantages of (2.) + allows reusing compiled patterns
         across parallel queries
   Cons: needs an eviction strategy to control cache size and pollution
         (and btw. (2.) also needs eviction strategies for the
         individual caches)

4. Queries with const needle use global cache, queries with non-const
   needle use a different global cache
   --> Overall similar to (3) but ignores the (likely) edge case that
       const and non-const needles overlap.

In sum, (3.) seems the simplest and most beneficial approach.

Eviction strategies:

0. Don't ever evict --> cache may grow infinitely and eventually make
   the system unusable (may even pose a DoS risk)

1. Flush the cache after a certain threshold is exceeded --> very
   simple but may lead to periodic performance drops

2. Use LRU --> more graceful performance degradation at threshold but
   comes with a (constant) performance overhead to maintain the LRU
   queue

In sum, given that the pattern compilation in RE2 should be quite costly
(pattern-to-DFA/NFA), LRU may be acceptable.
2022-05-25 22:04:06 +02:00
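Option (3.) combined with LRU eviction, as described in the commit above, can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: `PatternCache` and its methods are hypothetical names, `std::regex` stands in for RE2 to keep the example dependency-free, and the sketch is not thread-safe (a shared global cache would additionally need locking).

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <regex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical LRU cache of compiled patterns (illustration only).
class PatternCache
{
public:
    using PatternPtr = std::shared_ptr<const std::regex>;

    explicit PatternCache(size_t max_size_) : max_size(max_size_)
    {
        assert(max_size > 0);
    }

    // Return the compiled pattern for the needle, compiling it on a miss.
    PatternPtr getOrCompile(const std::string & needle)
    {
        auto it = index.find(needle);
        if (it != index.end())
        {
            // Hit: move the entry to the front (most recently used).
            lru.splice(lru.begin(), lru, it->second);
            ++hits;
            return it->second->second;
        }

        // Miss: compile, insert at the front, evict the oldest if needed.
        ++misses;
        PatternPtr compiled = std::make_shared<std::regex>(needle);
        lru.emplace_front(needle, compiled);
        index[needle] = lru.begin();
        if (lru.size() > max_size)
        {
            index.erase(lru.back().first);
            lru.pop_back();
        }
        return compiled;
    }

    size_t size() const { return lru.size(); }
    size_t hitCount() const { return hits; }
    size_t missCount() const { return misses; }

private:
    using Entry = std::pair<std::string, PatternPtr>;

    size_t max_size;
    std::list<Entry> lru; // front = most recently used
    std::unordered_map<std::string, std::list<Entry>::iterator> index;
    size_t hits = 0;
    size_t misses = 0;
};
```

The list plus hash map layout gives constant-time lookup, promotion, and eviction on average, which is the "(constant) performance overhead" the write-up accepts in exchange for avoiding the periodic drops of a full flush. Returning a shared pointer keeps a pattern alive for callers even if it is evicted while still in use.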
Robert Schulze
ea60a614d2
Decrease namespace indent 2022-05-25 21:56:35 +02:00
alesapin
c7b16065e1 Merge with master 2022-05-25 21:47:05 +02:00
Nikolai Kochetov
6d4a26afac Update ReadProgressCallback. 2022-05-25 19:45:48 +00:00
Alexey Milovidov
abf2558fba
Merge pull request #37491 from ClickHouse/match_refactoring
Refactorings of LIKE/MATCH code
2022-05-25 22:05:38 +03:00
Alexey Milovidov
4482da9eb6
Update greatCircleDistance.cpp 2022-05-25 21:59:31 +03:00
alesapin
6f5c86e55e Merge branch 'master' into i_object_storage 2022-05-25 20:49:01 +02:00
Nikolai Kochetov
54d7e4139f Fix build. 2022-05-25 18:16:48 +00:00
alesapin
51868a9a4f
Merge pull request #37424 from metahys/fix_fetching_part_deadlock
(only with zero-copy replication, non-production experimental feature not recommended to use) fix possible deadlock during fetching part
2022-05-25 20:15:41 +02:00
Azat Khuzhin
a813f5996e Fix converting types for UNION queries (may produce LOGICAL_ERROR)
CI found [1]:

    2022.02.20 15:14:23.969247 [ 492 ] {} <Fatal> BaseDaemon: (version 22.3.1.1, build id: 6082C357CFA6FF99) (from thread 472) (query_id: a5187ff9-962a-4e7c-86f6-8d48850a47d6) (query: SELECT 0., round(avgWeighted(x, y)) FROM (SELECT toDate(toDate('214748364.8', '-922337203.6854775808', '-0.1', NULL) - NULL, 10.000100135803223, '-2147483647'), 255 AS x, -2147483647 AS y UNION ALL SELECT y, NULL AS x, 2147483646 AS y)) Received signal Aborted (6)

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/0/26d0e5438c86e52a145aaaf4cb523c399989a878/fuzzer_astfuzzerdebug,actions//report.html

The problem is that the subqueries return different headers:
- first query  -- x, y
- second query -- y, x

v2: Make order of columns strict only for UNION
    https://s3.amazonaws.com/clickhouse-test-reports/34775/9cc8c01a463d18c471853568b2f0af659a4e643f/stateless_tests__address__actions__[2/2].html
    Fixes: 00597_push_down_predicate_long
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-05-25 20:31:47 +03:00
Nikolai Kochetov
ff98c24d44
Merge pull request #37048 from Avogar/fix-array-map-nothing
Add default implementation for Nothing in functions
2022-05-25 19:10:40 +02:00
Nikolai Kochetov
1f9b1cf726 Fixing build. 2022-05-25 18:59:46 +02:00
alesapin
0a3597da72
Merge pull request #34915 from ianton-ru/MDB-16962
Fix collision of S3 operation log revision
2022-05-25 18:15:31 +02:00
Yakov Olkhovskiy
6692b9c2ed showCertificate function implementation 2022-05-25 12:11:44 -04:00
Alexey Milovidov
cb92482ca5
Merge pull request #37484 from kitaisreal/function-has-all-avx2-dynamic-dispatch
Function hasAll added dynamic dispatch
2022-05-25 19:05:32 +03:00
Nikolai Kochetov
7b681fa8ac Fixing build. 2022-05-25 17:15:23 +02:00
avogar
ede6e2f433 Add docs for settings 2022-05-25 15:10:20 +00:00
avogar
4c9812d4c1 Allow to skip some of the first rows in CSV/TSV formats 2022-05-25 15:00:11 +00:00
KinderRiven
a33c7ce648 fix 2022-05-25 22:58:47 +08:00
Anton Popov
16e839ac71 add profile events for introspection of part types 2022-05-25 14:54:49 +00:00
Antonio Andelic
6a962549d5
Revert "Add support for preprocessing ZooKeeper operations in clickhouse-keeper" 2022-05-25 16:45:32 +02:00
Nikolai Kochetov
1b85f2c1d6 Merge branch 'master' into refactor-read-metrics-and-callbacks 2022-05-25 16:27:40 +02:00
msaf1980
fda6ddeffa cleanup StorageHDFS (unused variables) 2022-05-25 19:23:05 +05:00
mergify[bot]
73662b4436
Merge branch 'master' into fix_fetching_part_deadlock 2022-05-25 14:22:35 +00:00
Maksim Kita
28355114c0 Fixed tests 2022-05-25 16:19:29 +02:00
Nikolai Kochetov
6370c29049 Use a separate mutex for query_factories_info in Context. 2022-05-25 14:16:59 +00:00
Maksim Kita
e67b3537f7 Functions normalizeUTF8 unstable performance tests fix 2022-05-25 15:54:52 +02:00
KinderRiven
875557abc2 fix 2022-05-25 21:53:28 +08:00
KinderRiven
adbb821176 fix 2022-05-25 21:05:15 +08:00
mergify[bot]
f49552d48e
Merge branch 'master' into grouping-sets-optimization-fix 2022-05-25 13:03:54 +00:00
Maksim Kita
45da28ecae Improve performance of geo distance functions 2022-05-25 14:22:22 +02:00
KinderRiven
2211c1ddb8 fix 2022-05-25 20:15:43 +08:00
Maksim Kita
83554d1f2d Fixed style 2022-05-25 13:05:39 +02:00
Maksim Kita
fbec38ddb9 Fixed performance tests 2022-05-25 12:51:21 +02:00
KinderRiven
d0fcffec66 fix style 2022-05-25 17:51:03 +08:00
Maksim Kita
c372c3d6aa Fix performance tests 2022-05-25 11:49:59 +02:00
Maksim Kita
9a9df26eec Fixed tests 2022-05-25 11:44:37 +02:00
Maksim Kita
6c033f340b Fixed tests 2022-05-25 11:44:37 +02:00
Maksim Kita
0e5f13e53e MergingSortedAlgorithm single column specialization 2022-05-25 11:44:37 +02:00
KinderRiven
1ce219bae2 fix 2022-05-25 17:24:38 +08:00
Kseniia Sumarokova
b50d4549c9
Merge pull request #37356 from amosbird/partition-prune-for-s3
"Partition pruning" for s3
2022-05-25 11:03:07 +02:00
KinderRiven
e3f76cab55 impl improve remote fs cache 2022-05-25 16:54:28 +08:00
avogar
f782fa31c6 Merge branch 'master' of github.com:ClickHouse/ClickHouse into check-format-on-storage-creation 2022-05-25 08:42:54 +00:00
Robert Schulze
05e4fa7df1
Fix special case of trivial regexp
Previously, we would always set 1 in case of a trivial regex (which is
correct). If someone in future builds a negated operator, then this
will produce wrong results. Right now, negation of regexp (SQL: NOT
MATCH) is implemented at a higher level, so we are safe and this is more
a preventive fix.
2022-05-25 10:05:55 +02:00
Robert Schulze
01ab7b9bad
Pass strings in some places as string_view
The original goal was to change

  const auto & needle = String(
        reinterpret_cast<const char *>(cur_needle_data),
        cur_needle_length);

in Functions/MatchImpl.h into a std::string_view to save an allocation +
copy. The needle is eventually passed as search pattern into the re2
library. Re2 has an alternative constructor taking a const char * i.e. a
NULL-terminated string. Here, the needle is NULL-terminated but
1. this is only because it is passed inside a ColumnString yet this is
   not always the case (e.g. fixed string columns have a dense layout w/o
   NULL terminator).
2. assuming NULL termination for users != MatchImpl of the regex code is
   too dangerous.

So, for now we'll stay with copying to be on the safe side. One fine day
when re2 has a ptr/size ctor, we can use std::string_view.

Just changing a few other places from std::string to std::string_view
but this will not help with performance.
2022-05-25 10:05:51 +02:00
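The NULL-termination concern in the commit message above can be illustrated with a minimal sketch that uses only the standard library. The helper names `viewNeedle`, `copyNeedle` and `denseFixedStrings` are hypothetical and are not taken from Functions/MatchImpl.h. The buffer imitates a dense fixed-width layout as an assumption about what such a column looks like in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical helper: borrow the needle bytes without copying.
// The view carries an explicit length, so no terminator is required.
inline std::string_view viewNeedle(const char * data, std::size_t length)
{
    return std::string_view(data, length);
}

// Hypothetical helper: copy the needle into an owning std::string.
// std::string::c_str() is guaranteed to be NUL-terminated, so the copy is
// safe to hand to an API that only accepts a const char *.
inline std::string copyNeedle(const char * data, std::size_t length)
{
    return std::string(data, length);
}

// Assumed dense layout: three 3-byte values stored back to back with no
// terminator between them. The single trailing '\0' exists only so that
// std::strlen in the demonstration has well-defined behaviour.
inline const char * denseFixedStrings()
{
    static const char storage[] = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', '\0'};
    return storage;
}
```

Reading the first value through `viewNeedle(denseFixedStrings(), 3)` yields exactly three bytes, while treating the same pointer as a C string runs through all nine bytes. That is why the copy is kept wherever only a `const char *` interface is available.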
Robert Schulze
e8c96777f6
Make OptimizedRegularExpression::analyze() private 2022-05-25 10:05:45 +02:00
Robert Schulze
040fbf3686
Tighter sanity checks in matching code 2022-05-25 10:05:06 +02:00
Robert Schulze
35bef17302
Introduce variables to hold the match result
--> nicer when debugging
2022-05-25 10:04:47 +02:00
Robert Schulze
b044d44fef
Refactoring: Make template instantiation easier to read
- introduced class MatchTraits with enums that replace bool template
  parameters

- (minor: made negation the last template parameter because negation
  executes last during evaluation)
2022-05-25 10:03:58 +02:00
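The refactoring described in the commit above, replacing bool template parameters with enums grouped in a traits class, can be sketched as follows. This is an illustration only: the enumerator names and the `EqualsImpl` template are assumptions made for this example and do not reproduce the real matching code. Negation is placed last in the parameter list, following the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Enums grouped in one class so that call sites read as words, not as
// a row of true/false literals. Enumerator names are assumed.
struct MatchTraits
{
    enum class Case { Sensitive, Insensitive };
    enum class Result { DontNegate, Negate };
};

// ASCII-only lower-casing, sufficient for the illustration.
constexpr char toLowerAscii(char c)
{
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// Hypothetical comparison template; negation is the last parameter
// because it is applied last during evaluation.
template <MatchTraits::Case case_mode, MatchTraits::Result result_mode>
struct EqualsImpl
{
    static bool apply(std::string_view haystack, std::string_view needle)
    {
        bool equal = haystack.size() == needle.size();
        for (std::size_t i = 0; equal && i < haystack.size(); ++i)
        {
            char a = haystack[i];
            char b = needle[i];
            if constexpr (case_mode == MatchTraits::Case::Insensitive)
            {
                a = toLowerAscii(a);
                b = toLowerAscii(b);
            }
            equal = (a == b);
        }
        if constexpr (result_mode == MatchTraits::Result::Negate)
            return !equal;
        else
            return equal;
    }
};

// Instantiations are self-describing compared with EqualsImpl<true, false>.
using EqualsIgnoreCase = EqualsImpl<MatchTraits::Case::Insensitive, MatchTraits::Result::DontNegate>;
using NotEquals = EqualsImpl<MatchTraits::Case::Sensitive, MatchTraits::Result::Negate>;
```

With bool parameters, a reader has to look up the template declaration to learn which flag means what. With the enums, the instantiation itself documents the behaviour.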
Bharat Nallan Chakravarthy
57cfc0bd04 check for validity of h3 index 2022-05-25 06:17:15 +05:30
Alexey Milovidov
516fba27dc Merge branch 'master' into allow-setuid-inside-clickhouse 2022-05-24 23:31:14 +02:00
Nikolai Kochetov
3d84aae0ab Better. 2022-05-24 20:06:08 +00:00
Maksim Kita
c1777aec1e
Merge pull request #37481 from kitaisreal/partial-sorting-transform-optimization-fix
Column compareImpl devirtualize compare call
2022-05-24 22:05:41 +02:00
Nikolai Kochetov
333fd09dbf Fixing build. 2022-05-24 19:29:00 +00:00
Igor Nikonov
04e2737a57
Merge pull request #37337 from ClickHouse/column_decl_null_before_default_value
Column declaration: [NOT] NULL right after type 

+ fixed: data_type_default_nullable=true, it didn't make columns nullable
         if the column declaration contains default expression w/o type

Issue #37229
2022-05-24 21:16:25 +02:00
Alexander Gololobov
2ff747785e
Merge pull request #37394 from ClickHouse/array_norm_dist_fixes
Do computations on the raw input data without copying to Eigen::Matrix
2022-05-24 20:59:04 +02:00
Dmitry Novik
c4dc0f7cda Fix ORDER BY optimization in case of GROUPING SETS 2022-05-24 18:56:22 +00:00
Maksim Kita
96833b8696 ColumnImpl compareImpl added assert for compare result 2022-05-24 20:41:48 +02:00
Robert Schulze
7348a0eb28
Merge pull request #37251 from ClickHouse/non_const_like
Support non-constant SQL functions (NOT) (I)LIKE and MATCH
2022-05-24 20:28:31 +02:00
Robert Schulze
028f15c4fa
Review comment: Throw LOGICAL_ERROR for different sizes of haystack / needles 2022-05-24 20:19:13 +02:00
Vasily Nemkov
59b4d4a643 ALTER COMMENT is now local-only operation and immediately observable 2022-05-24 21:08:30 +03:00
Kruglov Pavel
0c3bcfa122
Merge pull request #36884 from Algunenano/http_proper_summary_and_exception
HTTP: Always return summary data and exception (when possible)
2022-05-24 18:33:24 +02:00
Maksim Kita
3c0c322d7c
Merge pull request #37480 from kitaisreal/dynamic-dispatch-infrastructure-improvements
Dynamic dispatch infrastructure style fixes
2022-05-24 18:13:53 +02:00
Vitaly Baranov
497c70d786
Merge pull request #37269 from vitlibar/rework-access-control-notifications
Rework AccessControl's notifications.
2022-05-24 17:47:45 +02:00
Raúl Marín
2f37ad7fb8 Improve comment 2022-05-24 17:11:15 +02:00
Raúl Marín
a86aa43baa Merge remote-tracking branch 'blessed/master' into floating_seconds_2 2022-05-24 17:10:45 +02:00
Maksim Kita
6fb51e8bd3 Function hasAll added dynamic dispatch 2022-05-24 17:06:06 +02:00
Vladimir C
0eaf22b370
Merge pull request #37453 from vdimir/join_auto_lc_nullable_bug 2022-05-24 16:28:23 +02:00
Vladimir C
bec4ae87c9
Merge pull request #37472 from amosbird/joinpushdown 2022-05-24 16:08:26 +02:00
Maksim Kita
86180614e7 Fixed tests 2022-05-24 15:33:03 +02:00
Anton Popov
e96af9fd75 better binary serialization of ColumnObject 2022-05-24 13:16:11 +00:00
alesapin
9a19309e69 Slightly better fix 2022-05-24 14:46:29 +02:00
Maksim Kita
bdc537ead3 Column compareImpl devirtualize compare call 2022-05-24 14:28:33 +02:00
Alexander Tokmakov
229d35408b
Merge pull request #37398 from ClickHouse/fixes_for_transactions
Fixes for transactions
2022-05-24 15:28:01 +03:00
alesapin
3ca7a8831b Revert "fix deadlock during fetching part"
This reverts commit 6ae8a26fae.
2022-05-24 14:26:06 +02:00
Maksim Kita
e6e4b2826d Dynamic dispatch infrastructure style fixes 2022-05-24 14:25:29 +02:00
alesapin
a5ba6bca95 Merge branch 'master' into metahys-fix_fetching_part_deadlock 2022-05-24 14:07:47 +02:00