Commit Graph

167 Commits

Author SHA1 Message Date
Alexey Milovidov
930a67da13 Fix the annoying Arcadia 2021-06-27 02:49:25 +03:00
tavplubix
b9437fa6f6
Update arcadia_skip_list.txt 2021-06-25 00:04:31 +03:00
Alexey Milovidov
e8200f5cef Add a test for #22108 2021-06-23 00:21:32 +03:00
Anton Popov
cbc7e61140
Update arcadia_skip_list.txt 2021-06-22 02:23:53 +03:00
Anton Popov
d8b6f15ef4
Merge pull request #23027 from azat/distributed-push-down-limit
Add ability to push down LIMIT for distributed queries
2021-06-20 23:08:50 +03:00
Vladimir
6af272891a
Merge pull request #24870 from azat/mv-dist-join-fix 2021-06-18 11:18:43 +03:00
Amos Bird
c8ea6527cb
Add prefer_global_in_and_join setting 2021-06-17 14:27:29 +08:00
alexey-milovidov
b298047527
Update arcadia_skip_list.txt 2021-06-15 19:29:20 +03:00
Alexey Milovidov
d0aa387e64 Fixed the annoying Arcadia, it is bogging my weekend. 2021-06-13 23:53:49 +03:00
alexey-milovidov
768b8fed75
Merge branch 'master' into mv-dist-join-fix 2021-06-12 03:19:09 +03:00
Azat Khuzhin
18e8f0eb5e Add ability to push down LIMIT for distributed queries
This way the remote nodes will not need to send all the rows, so this
will decrease network io and also this will make queries w/
optimize_aggregation_in_order=1/LIMIT X and w/o ORDER BY faster since it
initiator will not need to read all the rows, only first X (but note
that for this you need to your data to be sharded correctly or you may
get inaccurate results).

Note, that having lots of processing stages will increase the complexity
of interpreter (it is already not that clean and simple right now).

Although using separate QueryProcessingStage looks pretty natural.

Another option is to make WithMergeableStateAfterAggregation always, but
in this case you will not be able to disable only this optimization,
i.e. if there will be some issue with it.

v2: fix OFFSET
v3: convert 01814_distributed_push_down_limit test to .sh and add retries
v4: add test with OFFSET
v5: add new query stage into the bash completion
v6/tests: use LIMIT O,L syntax over LIMIT L OFFSET O since it is broken in ANTLR parser
          https://clickhouse-test-reports.s3.yandex.net/23027/a18a06399b7aeacba7c50b5d1e981ada5df19745/functional_stateless_tests_(antlr_debug).html#fail1
v7/tests: set use_hedged_requests to 0, to avoid excessive log entries on retries
          https://clickhouse-test-reports.s3.yandex.net/23027/a18a06399b7aeacba7c50b5d1e981ada5df19745/functional_stateless_tests_flaky_check_(address).html#fail1
2021-06-09 02:29:50 +03:00
Azat Khuzhin
8164c49f04 Fix limit/offset settings for distributed queries (ignore on the remote nodes) 2021-06-03 21:14:24 +03:00
Azat Khuzhin
25f3efde2b Add a test for Materialized(Distributed()) with JOIN and GROUP BY 2021-06-03 21:07:47 +03:00
Maksim Kita
15da5eb88d
Merge branch 'master' into nv/scalar-subquery-exception 2021-05-28 21:38:27 +03:00
Maksim Kita
269749381e
Update arcadia_skip_list.txt 2021-05-28 20:36:35 +03:00
Nikita Mikhaylov
bb064ee3dc
Update arcadia_skip_list.txt 2021-05-27 14:03:35 +03:00
Nikita Mikhaylov
1d79c10168
Merge branch 'master' into fix-arcadia-8 2021-05-26 15:25:51 +03:00
Nikita Mikhaylov
0e89be382b fix 2021-05-26 15:24:57 +03:00
Azat Khuzhin
fbe8b37ff6 Skip 01880_remote_ipv6 in arcadia 2021-05-21 22:47:55 +03:00
Nikita Mikhaylov
703bd2aeb6 done 2021-05-15 00:25:12 +03:00
pingyu
46f809d07f Revert "Merge pull request #23334 from ClickHouse/revert-22609-datasketches-uniq"
This reverts commit af2499359b, reversing
changes made to db82e9e3d5.
2021-05-05 16:42:57 +08:00
Azat Khuzhin
73ab415c4c Preserve errors for INSERT into Distributed
Before this patch (and after #22208) the INSERT may fail with "Cannot
schedule a task" because the pool in DistributedBlockOutputStream
already throws exception and simply fail in writeSuffix().
2021-04-28 22:33:29 +03:00
Alexey Milovidov
a8da1b5e40 The annoying Arcadia 2021-04-21 14:59:49 +03:00
alexey-milovidov
62899436db
Revert "add uniqThetaSketch" 2021-04-20 03:34:21 +03:00
Kruglov Pavel
995973bf1f
Merge pull request #22609 from pingyu/datasketches-uniq
add uniqThetaSketch
2021-04-19 10:32:29 +03:00
Pavel Kruglov
ec83aed3eb Move long queries from 01798_uniq_theta_sketch to separate test 2021-04-16 11:44:32 +03:00
tavplubix
2479c80fb7
Merge branch 'master' into fix_attach_mv 2021-04-15 00:07:48 +03:00
tavplubix
caf97d297b
Update arcadia_skip_list.txt 2021-04-14 12:17:41 +03:00
Nikita Mikhaylov
efef179b89 remove prints 2021-04-13 22:39:42 +03:00
Kruglov Pavel
6350f734dc
Merge branch 'master' into datasketches-uniq 2021-04-13 19:34:15 +03:00
Pavel Kruglov
6ce9873a76 Move all uniqThetaSketch tests into one file, add it in fasttest and arcadia skip lists 2021-04-13 19:33:15 +03:00
Maksim Kita
95d3571ab9 Added test to skiplist 2021-04-11 12:03:55 +03:00
Alexey Milovidov
b5ff97dbd6 Add a test for already fixed issue 2021-04-08 15:29:34 +03:00
Azat Khuzhin
27d4fbd13b Compare Block itself for distributed async INSERT batches
INSERT into Distributed with insert_distributed_sync=1 stores the
distributed batches on the disk for sending in background.

But types may be a little bit different for the Distributed and it's
underlying table, so the initiator need to know whether conversion is
required or not.

Before this patch those on disk distributed batches contains header,
which includes dumpStructure() for the block in that batch, however it
checks not only names and types and plus dumpStructure() is a debug
method.

So instead of storing string representation for the block header we
should store empty block in the file header (note, that we cannot store
the empty block not in header, since this will require reading all
blocks from file, due to some trickery of the readers interface).

Note, that this patch also contains tiny refactoring:
- s/header/distributed_header/

v1: dumpNamesAndTypes()
v2: dump empty block into the batch itself
v3: move empty block into the header
2021-04-06 10:05:21 +03:00
Anton Popov
472662de80
skip test in arcadia 2021-04-06 00:45:28 +03:00
alexey-milovidov
f1efa33571
Merge branch 'master' into client-fix-highlight-multiline-comment 2021-03-26 02:40:11 +03:00
Nikolai Kochetov
e8d7349c79
Merge branch 'master' into dist-query-zero-shards-fix 2021-03-16 12:00:08 +03:00
Maksim Kita
084bd03672
Merge pull request #21692 from azat/dict-ip_trie-access_to_key_from_attributes
Fix SIGSEGV on not existing attributes from ip_trie with access_to_key_from_attributes
2021-03-13 23:13:26 +03:00
Azat Khuzhin
68dff2d954 Mark 01018_ip_dictionary as long
https://clickhouse-test-reports.s3.yandex.net/21692/481d897cadb6b1a309478f24a46efe506b7108d6/functional_stateless_tests_flaky_check_(address).html#fail1
2021-03-13 11:06:49 +03:00
Vasily Nemkov
f4246e7be5
Merge branch 'master' into governance/query_log 2021-03-12 18:31:08 +03:00
Vasily Nemkov
2cf39d3ac4 Excluded 01702_system_query_log from arcadia runs 2021-03-12 11:25:01 +02:00
Nikita Mikhaylov
eecc12ff6a suppress warnings and skip tests in arcadia 2021-03-11 15:09:03 +03:00
Azat Khuzhin
61d40c3600 Fix optimize_skip_unused_shards for zero shards case
v2: move check to the beginning of the StorageDistributed::read()
2021-03-10 09:05:14 +03:00
Maksim Kita
aa1a16d2b9 Updated tests skiplists 2021-03-06 22:17:24 +03:00
Azat Khuzhin
9c35e49878 Fix heap-buffer-overflow in highlighting multi-line comments
Not closed multi-line comment returns the whole query, so it should not
be processed further with the lexer.

ASan report:

    :) /*=================================================================
    ==14889==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60400006ebc0 at pc 0x00000a8148ea bp 0x7fffffff8610 sp 0x7fffffff7dd8
    WRITE of size 16 at 0x60400006ebc0 thread T0
        0 0xa8148e9 in __asan_memcpy (/src/ch/tmp/upstream/clickhouse-asan+0xa8148e9)
        1 0xaa8a3a4 in DB::Client::highlight(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::vector<replxx::Replxx::Color, std::__1::allocator<replxx::Replxx::Color> >&) obj-x86_64-linux-gnu/../programs/client/Client.cpp:464:52
        2 0x25f7b6d8 in std::__1::__function::__policy_func<void (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::vector<replxx::Replxx::Color, std::__1::allocator<replxx::Replxx::Color> >&)>::operator()(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::vector<replxx::Replxx::Color, std::__1::allocator<replxx::Replxx::Color> >&) const obj-x86_64-linux-gnu/../contrib/libcxx/include/functional:2221:16
        3 0x25f7b6d8 in std::__1::function<void (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::vector<replxx::Replxx::Color, std::__1::allocator<replxx::Replxx::Color> >&)>::operator()(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::vector<replxx::Replxx::Color, std::__1::allocator<replxx::Replxx::Color> >&) const obj-x86_64-linux-gnu/../contrib/libcxx/include/functional:2560:12
        4 0x25f7b6d8 in replxx::Replxx::ReplxxImpl::render(replxx::Replxx::ReplxxImpl::HINT_ACTION) obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:546:3
        5 0x25f74059 in replxx::Replxx::ReplxxImpl::refresh_line(replxx::Replxx::ReplxxImpl::HINT_ACTION) obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:729:2
        6 0x25f6bc8f in replxx::Replxx::ReplxxImpl::insert_character(char32_t) obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:1197:3
        7 0x25f79347 in replxx::Replxx::ReplxxImpl::action(unsigned long long, replxx::Replxx::ACTION_RESULT (replxx::Replxx::ReplxxImpl::* const&)(char32_t), char32_t) obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:1130:29
        8 0x25f79347 in replxx::Replxx::ReplxxImpl::get_input_line() obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:1123:11
        9 0x25f7844c in replxx::Replxx::ReplxxImpl::input(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:455:8
        10 0x25af5693 in ReplxxLineReader::readOneLine(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) obj-x86_64-linux-gnu/../base/common/ReplxxLineReader.cpp:108:29
        11 0x25aed149 in LineReader::readLine(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) obj-x86_64-linux-gnu/../base/common/LineReader.cpp:81:26
        12 0xaa80ba2 in DB::Client::mainImpl() obj-x86_64-linux-gnu/../programs/client/Client.cpp:654:33
        13 0xaa756f5 in DB::Client::main(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) obj-x86_64-linux-gnu/../programs/client/Client.cpp:280:20
        14 0x25c0c8b5 in Poco::Util::Application::run() obj-x86_64-linux-gnu/../contrib/poco/Util/src/Application.cpp:334:8
        15 0xaa4d050 in mainEntryClickHouseClient(int, char**) obj-x86_64-linux-gnu/../programs/client/Client.cpp:2724:23
        16 0xa848c3a in main obj-x86_64-linux-gnu/../programs/main.cpp:368:12
        17 0x7ffff7dcab24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
        18 0xa79b36d in _start (/src/ch/tmp/upstream/clickhouse-asan+0xa79b36d)

    0x60400006ebc0 is located 0 bytes to the right of 48-byte region [0x60400006eb90,0x60400006ebc0)
    allocated by thread T0 here:
        0 0xa84509d in operator new(unsigned long) (/src/ch/tmp/upstream/clickhouse-asan+0xa84509d)
        1 0x25f7af76 in void* std::__1::__libcpp_operator_new<unsigned long>(unsigned long) obj-x86_64-linux-gnu/../contrib/libcxx/include/new:235:10
        2 0x25f7af76 in std::__1::__libcpp_allocate(unsigned long, unsigned long) obj-x86_64-linux-gnu/../contrib/libcxx/include/new:261:10
        3 0x25f7af76 in std::__1::allocator<replxx::Replxx::Color>::allocate(unsigned long) obj-x86_64-linux-gnu/../contrib/libcxx/include/memory:840:38
        4 0x25f7af76 in std::__1::allocator_traits<std::__1::allocator<replxx::Replxx::Color> >::allocate(std::__1::allocator<replxx::Replxx::Color>&, unsigned long) obj-x86_64-linux-gnu/../contrib/libcxx/include/__memory/allocator_traits.h:468:21
        5 0x25f7af76 in std::__1::vector<replxx::Replxx::Color, std::__1::allocator<replxx::Replxx::Color> >::__vallocate(unsigned long) obj-x86_64-linux-gnu/../contrib/libcxx/include/vector:993:37
        6 0x25f7af76 in std::__1::vector<replxx::Replxx::Color, std::__1::allocator<replxx::Replxx::Color> >::vector(unsigned long, replxx::Replxx::Color const&) obj-x86_64-linux-gnu/../contrib/libcxx/include/vector:1155:9
        7 0x25f7af76 in replxx::Replxx::ReplxxImpl::render(replxx::Replxx::ReplxxImpl::HINT_ACTION) obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:543:19
        8 0x25f74059 in replxx::Replxx::ReplxxImpl::refresh_line(replxx::Replxx::ReplxxImpl::HINT_ACTION) obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:729:2
        9 0x25f6bc8f in replxx::Replxx::ReplxxImpl::insert_character(char32_t) obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:1197:3
        10 0x25f79347 in replxx::Replxx::ReplxxImpl::action(unsigned long long, replxx::Replxx::ACTION_RESULT (replxx::Replxx::ReplxxImpl::* const&)(char32_t), char32_t) obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:1130:29
        11 0x25f79347 in replxx::Replxx::ReplxxImpl::get_input_line() obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:1123:11
        12 0x25f7844c in replxx::Replxx::ReplxxImpl::input(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:455:8
        13 0x25af5693 in ReplxxLineReader::readOneLine(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) obj-x86_64-linux-gnu/../base/common/ReplxxLineReader.cpp:108:29
        14 0x25aed149 in LineReader::readLine(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) obj-x86_64-linux-gnu/../base/common/LineReader.cpp:81:26
        15 0xaa80ba2 in DB::Client::mainImpl() obj-x86_64-linux-gnu/../programs/client/Client.cpp:654:33
        16 0xaa756f5 in DB::Client::main(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) obj-x86_64-linux-gnu/../programs/client/Client.cpp:280:20
        17 0x25c0c8b5 in Poco::Util::Application::run() obj-x86_64-linux-gnu/../contrib/poco/Util/src/Application.cpp:334:8
        18 0xaa4d050 in mainEntryClickHouseClient(int, char**) obj-x86_64-linux-gnu/../programs/client/Client.cpp:2724:23
        19 0xa848c3a in main obj-x86_64-linux-gnu/../programs/main.cpp:368:12
        20 0x7ffff7dcab24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

    SUMMARY: AddressSanitizer: heap-buffer-overflow (/src/ch/tmp/upstream/clickhouse-asan+0xa8148e9) in __asan_memcpy

v2: fix lexer instead of client quirk
2021-03-06 12:23:41 +03:00
Alexey Milovidov
c3110a611e Skip Arcadia 2021-03-05 05:06:49 +03:00
Azat Khuzhin
6965ac26c3 Distributed: Add ability to delay/throttle INSERT until pending data will be reduced
Add two new settings for the Distributed engine:
- bytes_to_delay_insert
- max_delay_to_insert

If at the beginning of INSERT there will be too much pending data, more
then bytes_to_delay_insert, then the INSERT will wait until it will be
shrinked, and not more then max_delay_to_insert seconds.

If after this there will be still too much pending, it will throw an
exception.

Also new profile events were added (by analogy to the MergeTree):
- DistributedDelayedInserts (although you can use system.errors instead
  of this, but still)
- DistributedRejectedInserts
- DistributedDelayedInsertsMilliseconds
2021-03-03 23:30:23 +03:00
Azat Khuzhin
cabe4ca1bb tests: split 00753_system_columns_and_system_tables (to disable Distributed part for arcadia) 2021-03-03 23:30:03 +03:00
Azat Khuzhin
b5a5778589 Distributed: Add ability to limit amount of pending bytes for async INSERT
Right now with distributed_directory_monitor_batch_inserts=1 and
insert_distributed_sync=0 INSERT into Distributed table will store
blocks that should be sent to remote (and in case of
prefer_localhost_replica=0 to the localhost too) on the local
filesystem, and sent it in background.

However there is no limit for this storage, and if the remote is
unavailable (or some other error), these pending blocks may take
significant space, and this is not always desired behaviour.

Add new Distributed setting - bytes_to_throw_insert, that will set the
limit for how much pending bytes is allowed, if the limit will be
reached an exception will be throw.

By default was set to 0, to avoid surprises.
2021-03-03 23:30:00 +03:00
Alexey Milovidov
e8df9971f1 Fix Arcadia 2021-03-03 18:12:39 +03:00