Commit Graph

23340 Commits

Author SHA1 Message Date
Maksim Kita
e5b85953e8 Added unit tests 2022-03-10 21:45:31 +00:00
Maksim Kita
6b916c7bb5 Fixed tests 2022-03-10 21:45:31 +00:00
Maksim Kita
5b2be4d3b8 Fixed tests 2022-03-10 21:45:31 +00:00
Maksim Kita
cbe059f4bd Updated IColumn interface to support getting stable permutation 2022-03-10 21:45:31 +00:00
Maksim Kita
765cd09d06 MergeTree improve insert performance 2022-03-10 21:45:31 +00:00
Maksim Kita
493169910b
Merge pull request #35174 from zhangyifan27/fix_typo
fix typos
2022-03-10 17:10:44 +01:00
Kseniia Sumarokova
3fc399b6e9
Merge pull request #35158 from kssenii/fix-materialized-postgresql
Fix materialised postrgesql adding new table after manually removing it
2022-03-10 17:02:32 +01:00
Kseniia Sumarokova
e30b0c5d57
Merge pull request #35162 from kssenii/fix-materialized-postgresql-table-override
Fix materialised postgres `table overrides` for partition by, etc
2022-03-10 17:01:24 +01:00
mergify[bot]
8434dd424a
Merge branch 'master' into fix_typo 2022-03-10 15:30:06 +00:00
Kruglov Pavel
a506120646
Fix bug in schema inference in s3 table function (#35176) 2022-03-10 15:16:07 +01:00
mergify[bot]
df01290e73
Merge branch 'master' into fix_typo 2022-03-10 13:35:04 +00:00
Miel Donkers
4a95e6d602
Parsing YAML config to XML leads to incorrect structures (#35135) 2022-03-10 13:09:48 +01:00
kssenii
1dc3f36a11 Better 2022-03-10 12:19:20 +01:00
Vladimir C
84af08b1a1
Merge pull request #35116 from bigo-sg/snappy_bug 2022-03-10 11:47:37 +01:00
Kseniia Sumarokova
e6ee891c9c
Merge pull request #34957 from bigo-sg/hive_random_access_file_cache
Optimization for first time to read a random access readbuffer in hive
2022-03-10 11:36:22 +01:00
zhangyifan27
e6fa9f699a fix typo 2022-03-10 18:29:42 +08:00
kssenii
3cd1da1e11 Fix 2022-03-10 11:11:59 +01:00
lgbo-ustc
fdd423a3da fixed code style 2022-03-10 12:13:19 +08:00
lgbo-ustc
e4883f31b7 update tests
1. fixed code style in src/IO/tests/gtest_hadoop_snappy_decoder.cpp
2. enable tests 01060_avro.sh
2022-03-10 09:46:43 +08:00
kssenii
af9d8d278e Fix 2022-03-09 19:25:43 +01:00
Vladimir C
ce266b5a3e
Merge pull request #35146 from amosbird/fixpartitionprunerin 2022-03-09 13:23:45 +01:00
Nikita Mikhaylov
d749295222
Fix hardcoded page size (#35129) 2022-03-09 12:35:23 +01:00
Nikolai Kochetov
6bfee7aca2
Merge pull request #35132 from azat/parallel_distributed_insert_select-view
Support view() for parallel_distributed_insert_select
2022-03-09 09:10:34 +01:00
Nikolai Kochetov
c364908061
Merge pull request #35094 from amosbird/getridofredundantplan
Get rid of duplicate query planing.
2022-03-09 09:10:20 +01:00
Amos Bird
a19224bc9b
Fix partition pruner: non-monotonic function IN 2022-03-09 15:48:42 +08:00
lgbo-ustc
8dc8c87fd1 add a test case 2022-03-09 10:03:04 +08:00
Azat Khuzhin
4843e210c3 Support view() for parallel_distributed_insert_select
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-03-08 22:05:57 +03:00
Azat Khuzhin
75da778d10 Tiny cmake refactoring
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-03-08 22:05:57 +03:00
Azat Khuzhin
fd3f7347f3 Remove unused DBMS_COMMON_LIBRARIES
Fixes: 4f8438bb34 ("cmake: do not allow internal libstdc++ usage")
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-03-08 22:02:18 +03:00
Azat Khuzhin
132bbce29c Add ability to get SELECT query from TableFunctionView
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-03-08 22:02:18 +03:00
Azat Khuzhin
ced34dea84 Take flush_time into account for scheduling background flush of the Buffer
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-03-08 21:58:10 +03:00
Azat Khuzhin
a871036361
Fix parallel_reading_from_replicas with clickhouse-bechmark (#34751)
* Use INITIAL_QUERY for clickhouse-benchmark

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Fix parallel_reading_from_replicas with clickhouse-bechmark

Before it produces the following error:

    $ clickhouse-benchmark --stacktrace -i1 --query "select * from remote('127.1', default.data_mt) limit 10" --allow_experimental_parallel_reading_from_replicas=1 --max_parallel_replicas=3
    Loaded 1 queries.
    Logical error: 'Coordinator for parallel reading from replicas is not initialized'.
    Aborted (core dumped)

Since it uses the same code, i.e RemoteQueryExecutor ->
MultiplexedConnections, which enables coordinator if it was requested
from settings, but it should be done only for non-initial queries, i.e.
when server send connection to another server.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Fix 02226_parallel_reading_from_replicas_benchmark for older shellcheck

By shellcheck 0.8 does not complains, while on CI shellcheck 0.7.0 and
it does complains [1]:

    In 02226_parallel_reading_from_replicas_benchmark.sh line 17:
        --allow_experimental_parallel_reading_from_replicas=1
        ^-- SC2191: The = here is literal. To assign by index, use ( [index]=value ) with no spaces. To keep as literal, quote it.

    Did you mean:
        "--allow_experimental_parallel_reading_from_replicas=1"

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/34751/d883af711822faf294c876b017cbf745b1cda1b3/style_check__actions_/shellcheck_output.txt

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-03-08 16:42:29 +01:00
Azat Khuzhin
c4b6342853
Improvements for parallel_distributed_insert_select (and related) (#34728)
* Add a warning if parallel_distributed_insert_select was ignored

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Respect max_distributed_depth for parallel_distributed_insert_select

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Print warning for non applied parallel_distributed_insert_select only for initial query

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Remove Cluster::getHashOfAddresses()

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Forbid parallel_distributed_insert_select for remote()/cluster() with different addresses

Before it uses empty cluster name (getClusterName()) which is not
correct, compare all addresses instead.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Fix max_distributed_depth check

max_distributed_depth=1 must mean not more then one distributed query,
not two, since max_distributed_depth=0 means no limit, and
distribute_depth is 0 for the first query.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Fix INSERT INTO remote()/cluster() with parallel_distributed_insert_select

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Add a test for parallel_distributed_insert_select with cluster()/remote()

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Return <remote> instead of empty cluster name in Distributed engine

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Make user with sharding_key and w/o in remote()/cluster() identical

Before with sharding_key the user was "default", while w/o it it was
empty.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-03-08 15:24:39 +01:00
lgbo-ustc
7f89a1bcf3 add some usage test 2022-03-08 20:00:39 +08:00
Antonio Andelic
bc5d7aea57
Merge pull request #34876 from azat/long-INSERT-fix
Fix possible "Part directory doesn't exist" during INSERT
2022-03-08 12:44:53 +01:00
Kseniia Sumarokova
517e878c6e
Merge pull request #35099 from ClickHouse/tavplubix-patch-1
Fix inconsistency in DiskLocal
2022-03-08 10:18:07 +01:00
Kseniia Sumarokova
1eb2bae792
Merge pull request #34954 from bigo-sg/hive_read_columns_pruning
read columns pruning for hive
2022-03-08 10:17:24 +01:00
lgbo-ustc
225b0bd914 fixed bug: call need_more_input repeatly, overwrite the buffer 2022-03-08 17:17:06 +08:00
lgbo-ustc
256e92ffee Merge remote-tracking branch 'ck/master' into hive_random_access_file_cache 2022-03-08 14:14:40 +08:00
Azat Khuzhin
caffc144b5 Fix possible "Part directory doesn't exist" during INSERT
In #33291 final part commit had been defered, and now it can take
significantly more time, that may lead to "Part directory doesn't exist"
error during INSERT:

    2022.02.21 18:18:06.979881 [ 11329 ] {insert} <Debug> executeQuery: (from 127.1:24572, user: default) INSERT INTO db.table (...) VALUES
    2022.02.21 20:58:03.933593 [ 11329 ] {insert} <Trace> db.table: Renaming temporary part tmp_insert_20220214_18044_18044_0 to 20220214_270654_270654_0.
    2022.02.21 21:16:50.961917 [ 11329 ] {insert} <Trace> db.table: Renaming temporary part tmp_insert_20220214_18197_18197_0 to 20220214_270689_270689_0.
    ...
    2022.02.22 21:16:57.632221 [ 64878 ] {} <Warning> db.table: Removing temporary directory /clickhouse/data/db/table/tmp_insert_20220214_18232_18232_0/
    ...
    2022.02.23 12:23:56.277480 [ 11329 ] {insert} <Trace> db.table: Renaming temporary part tmp_insert_20220214_18232_18232_0 to 20220214_273459_273459_0.
    2022.02.23 12:23:56.299218 [ 11329 ] {insert} <Error> executeQuery: Code: 107. DB::Exception: Part directory /clickhouse/data/db/table/tmp_insert_20220214_18232_18232_0/ doesn't exist. Most likely it is a logical error. (FILE_DOESNT_EXIST) (version 22.2.1.1) (from 127.1:24572) (in query: INSERT INTO db.table (...) VALUES), Stack trace (when copying this message, always include the lines below):

Follow-up for: #28760
Refs: #33291

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-03-08 07:44:11 +03:00
lgbo-ustc
a8cfc2458a update codes 2022-03-08 11:55:15 +08:00
alexey-milovidov
df1a031851
Merge pull request #35046 from vdimir/issue-35044
Fix trim function
2022-03-08 01:50:02 +03:00
Maksim Kita
2f9361008b
Merge pull request #35089 from 1lann/1lann/fix-update_lag-typo
Fix typo of update_lag
2022-03-07 23:12:35 +01:00
Nikolai Kochetov
8f77b2b778
Merge pull request #34889 from ClickHouse/finally-enable-s3-async-writes-again
Update DiskS3.cpp
2022-03-07 21:31:44 +01:00
alesapin
71ecd6be74
Merge pull request #35004 from ClickHouse/more_keeper_sanity_checks
More keeper sanity checks
2022-03-07 19:53:59 +01:00
Kseniia Sumarokova
5511f2f6e6
Merge pull request #34940 from bigo-sg/hive_client_connection_pool
Use connection pool in HiveMetastoreClient
2022-03-07 17:14:56 +01:00
Kseniia Sumarokova
28b9ec01c0
Merge pull request #34945 from bigo-sg/hive_bug_fixed
unexpected result when use `in` in hive query
2022-03-07 17:13:11 +01:00
Amos Bird
fe4534d464
Get rid of duplicate query planing. 2022-03-08 00:02:58 +08:00
tavplubix
84e22fb32b
Update DiskLocal.cpp 2022-03-07 18:59:00 +03:00
Antonio Andelic
81e56a06a3
Merge pull request #35066 from ClickHouse/remove-useless-define
Remove useless define
2022-03-07 15:28:28 +01:00