Commit Graph

42263 Commits

Author SHA1 Message Date
Kruglov Pavel
18be731e30
Merge branch 'master' into fix-secure-async-read-write 2023-05-26 00:46:33 +02:00
Kruglov Pavel
f03ca41b08
Fix build 2023-05-26 00:21:46 +02:00
Kruglov Pavel
67b78829fc
Fix build 2023-05-26 00:21:14 +02:00
Kruglov Pavel
a9082b24b4
Fix build 2023-05-26 00:20:20 +02:00
Robert Schulze
bc869eac7b
Merge branch 'master' into msan-siphash-keyed 2023-05-26 00:18:07 +02:00
Alexey Gerasimchuk
8d7cb7fc3b
Merge branch 'master' into ADQM-830 2023-05-26 07:49:51 +10:00
Alexey Gerasimchuk
613568423d
Update src/Processors/Formats/Impl/CSVRowInputFormat.cpp
Co-authored-by: Kruglov Pavel <48961922+Avogar@users.noreply.github.com>
2023-05-26 07:49:45 +10:00
serxa
4f0aeee236 fix more conflicts 2023-05-25 20:50:22 +00:00
Kruglov Pavel
1964d1bb7e
Fix comment 2023-05-25 22:30:16 +02:00
avogar
42e1e3ae20 Fix working with secure socket after async connection 2023-05-25 20:24:03 +00:00
Robert Schulze
c4f91a1c45
Merge branch 'master' into space 2023-05-25 19:56:20 +02:00
Robert Schulze
ad4a21034f
Fix msan issue in keyed siphash
Issue:
https://s3.amazonaws.com/clickhouse-test-reports/0/ffdd91669471f4934704f98f0191524496b4e85b/fuzzer_astfuzzermsan/report.html

Repro:
SELECT hex(sipHash128ReferenceKeyed((toUInt64(2), toUInt64(-9223372036854775807)))) GROUP BY (toUInt64(506097522914230528), toUInt64(now64(2, NULL + NULL), 1084818905618843912)), toUInt64(2), NULL + NULL, char(-2147483649, 1)

Minimal repro:
SELECT sipHash64Keyed((2::UInt64, toUInt64(2)), 4) GROUP BY toUInt64(2)
2023-05-25 17:52:03 +00:00
Robert Schulze
eca08438f4
Fix macos build 2023-05-25 17:05:18 +00:00
alesapin
e4c1e2f232 Fix build while it's not failing locally 2023-05-25 17:37:09 +02:00
alesapin
e94b0c8e5e Fix bug 2023-05-25 16:38:19 +02:00
Nikita Mikhaylov
0580859e6f Better 2023-05-25 14:05:44 +00:00
kssenii
3fefacbf20 Fix 2023-05-25 15:48:56 +02:00
Nikita Mikhaylov
cf6ff7ab32 Merge branch 'master' of github.com:ClickHouse/ClickHouse into 46229-repl-clickhouse-keeper 2023-05-25 13:41:25 +00:00
serxa
0ca526c603 Unify priorities: rework IO scheduling subsystem 2023-05-25 13:25:41 +00:00
serxa
3ef6cb7bdc git-apply #50205 2023-05-25 13:24:45 +00:00
Kseniia Sumarokova
f1a3c9cfd5
Merge pull request #50109 from kssenii/abstract-async-prefetched-buffer
Make async prefetched buffer work with arbitrary impl
2023-05-25 15:06:44 +02:00
serxa
b8d3e495e5 add pool_id out-of-bound checks 2023-05-25 12:42:19 +00:00
Nikita Mikhaylov
1c3b6738f4
Fixes for parallel replicas (#50195) 2023-05-25 14:41:04 +02:00
serxa
e3ce2f834a fix style 2023-05-25 12:35:00 +00:00
Sergei Trifonov
2df22d396d
Merge branch 'master' into async-loader-workloads 2023-05-25 14:32:12 +02:00
Robert Schulze
8804dfd4b0
Fix resizing 2023-05-25 11:57:09 +00:00
Sergei Trifonov
78c89da8bb
Update src/Common/AsyncLoader.cpp
Co-authored-by: Antonio Andelic <antonio2368@users.noreply.github.com>
2023-05-25 13:48:03 +02:00
Sergei Trifonov
32cb9931b6
Update src/Common/AsyncLoader.h
Co-authored-by: Antonio Andelic <antonio2368@users.noreply.github.com>
2023-05-25 13:47:56 +02:00
Han Fei
55c2dbcc2d
Merge pull request #50062 from ZhiguoZh/20230511-toyear
Optimize predicate with toYear converter
2023-05-25 13:25:55 +02:00
avogar
ce99825200 Fix skipping spaces at end of row in CustomSeparatedIgnoreSpaces format 2023-05-25 11:19:15 +00:00
vdimir
2df4130d82
Update src/Interpreters/ClusterDiscovery.cpp 2023-05-25 13:06:40 +02:00
alesapin
5b76ab4e03 Fix build 2023-05-25 13:02:40 +02:00
Sema Checherinda
23f894b995
Merge pull request #49777 from helifu/master1
Add 'initial_query_id' field for system.processors_profile_log
2023-05-25 12:55:32 +02:00
Sema Checherinda
3329a8428d
Merge pull request #49779 from helifu/master3
Add 'partitions' field for system.query_log
2023-05-25 12:51:40 +02:00
Igor Nikonov
1c0b02c3c4
Merge pull request #49503 from ClickHouse/fill_with_by_sorting_prefix_2
WITH FILL by sorting prefix
2023-05-25 12:37:40 +02:00
alesapin
b2c9611da6 Fix build 2023-05-25 12:01:24 +02:00
何李夫
e4c8c4cecf
Add zookeeper name in endpoint id (#49780)
* Add zookeeper name in endpoint id

When we migrate a replicated table from one zookeeper cluster to
another (the reason we migrate is that zookeeper's load is
too high), we will create a new table with the same zpath, but it
will fail and the old table will be in trouble.

Here is some information:
1.old table:
  CREATE TABLE a1 (`id` UInt64)
  ENGINE = ReplicatedMergeTree('/clickhouse/tables/default/a1/{shard}', '{replica}')
  ORDER BY (id);
2.new table:
  CREATE TABLE a2 (`id` UInt64)
  ENGINE = ReplicatedMergeTree('aux1:/clickhouse/tables/default/a1/{shard}', '{replica}')
  ORDER BY (id);
3.error info:
  <Error> executeQuery: Code: 220. DB::Exception: Duplicate interserver IO endpoint:
          DataPartsExchange:/clickhouse/tables/default/a1/01/replicas/02.
          (DUPLICATE_INTERSERVER_IO_ENDPOINT)
  <Error> InterserverIOHTTPHandler: Code: 221. DB::Exception: No interserver IO endpoint
          named DataPartsExchange:/clickhouse/tables/default/a1/01/replicas/02.
          (NO_SUCH_INTERSERVER_IO_ENDPOINT)

* Revert "Add zookeeper name in endpoint id"

This reverts commit 9deb75b249619b7abdd38e3949ca8b3a76c9df8e.

* Add zookeeper name in endpoint id

When we migrate a replicated table from one zookeeper cluster to
another (the reason we migrate is that zookeeper's load is
too high), we will create a new table with the same zpath, but it
will fail and the old table will be in trouble.

* Fix incompatible with a new setting

* add a test, fix other issues

* Update 02442_auxiliary_zookeeper_endpoint_id.sql

* Update 02735_system_zookeeper_connection.reference

* Update 02735_system_zookeeper_connection.sql

* Update run.sh

* Remove the 'no-fasttest' tag

* Update 02442_auxiliary_zookeeper_endpoint_id.sql

---------

Co-authored-by: Alexander Tokmakov <tavplubix@clickhouse.com>
Co-authored-by: Alexander Tokmakov <tavplubix@gmail.com>
2023-05-25 12:50:14 +03:00
alesapin
8a7c4bee53 Merge branch 'master' into transactions_for_encrypted_disk 2023-05-25 11:35:17 +02:00
Azat Khuzhin
b680697cce Initialize POD members of ASTs to make it less error-prone
The cost of initializing members is insignificant compared to parsing,
while the cost of an error is high.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-25 10:18:55 +02:00
Azat Khuzhin
b30cfe5503 Fix UB in ASTWatchQuery for is_watch_events
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-25 10:15:30 +02:00
Azat Khuzhin
c053d75741 Fix formatting of INTO OUTFILE extensions (APPEND / AND STDOUT)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-25 10:15:30 +02:00
Azat Khuzhin
9582d9e892 Fix UB for INTO OUTFILE extensions (APPEND / AND STDOUT)
MSAn report:

==38627==WARNING: MemorySanitizer: use-of-uninitialized-value
    0 0x555599f5e114 in std::__1::__unique_if<DB::WriteBufferFromFile>::__unique_single std::__1::make_unique[abi:v15000]<> build_docker/./contrib/llvm-project/libcxx/include/__memory/unique_ptr.h:714:32
    1 0x555599f5e114 in DB::ClientBase::initOutputFormat() build_docker/./src/Client/ClientBase.cpp:604:21
    2 0x555599f590a8 in DB::ClientBase::onData() build_docker/./src/Client/ClientBase.cpp:446:5
    3 0x555599f6f36e in DB::ClientBase::receiveAndProcessPacket() build_docker/./src/Client/ClientBase.cpp:1019:17
    4 0x555599f6e863 in DB::ClientBase::receiveResult() build_docker/./src/Client/ClientBase.cpp:987:18
    5 0x555599f6c05b in DB::ClientBase::processOrdinaryQuery() build_docker/./src/Client/ClientBase.cpp:905:13
    6 0x555599f67e05 in DB::ClientBase::processParsedSingleQuery() build_docker/./src/Client/ClientBase.cpp:1711:13
    7 0x555599f86fb6 in DB::ClientBase::executeMultiQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) build_docker/./src/Client/ClientBase.cpp:1975:21

  Uninitialized value was created by a heap allocation
    8 0x55559bd3e038 in DB::ParserExplainQuery::parseImpl(DB::IParser::Pos&, std::__1::shared_ptr<DB::IAST>&, DB::Expected&) build_docker/./src/Parsers/ParserExplainQuery.cpp:53:26
    9 0x55559bce31f4 in DB::IParserBase::parse(DB::IParser::Pos&, std::__1::shared_ptr<DB::IAST>&, DB::Expected&)::$_0::operator()() const build_docker/./src/Parsers/IParserBase.cpp:13:20
    ..
    21 0x55559be13b5c in DB::parseQueryAndMovePosition(DB::IParser&, char const*&, char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool, unsigned long, unsigned long) build_docker/./src/Parsers/parseQuery.cpp:357:18
    22 0x555599f5673a in DB::ClientBase::parseQuery(char const*&, char const*, bool) const build_docker/./src/Client/ClientBase.cpp:362:15
    23 0x555599f84a4f in DB::ClientBase::analyzeMultiQueryText() build_docker/./src/Client/ClientBase.cpp:1821:24
    24 0x555599f867b3 in DB::ClientBase::executeMultiQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) build_docker/./src/Client/ClientBase.cpp:1910:22
    25 0x555599f8a2fd in DB::ClientBase::processQueryText(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) build_docker/./src/Client/ClientBase.cpp:2120:12
    26 0x555599f94aee in DB::ClientBase::runNonInteractive() build_docker/./src/Client/ClientBase.cpp:2403:9

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-25 10:15:23 +02:00
Alexey Gerasimchuck
75791d7a63 Added input_format_csv_trim_whitespaces parameter 2023-05-25 07:51:32 +00:00
Zhiguo Zhou
1bc4eb1a6c OptimizeDateFilterVisitor: Revise variable names for clarity 2023-05-25 13:47:03 +08:00
helifu
802b63f2ab Add 'initial_query_id' field for system.processors_profile_log
Facilitate profile data association and aggregation for the same query
2023-05-25 09:37:02 +08:00
Zhiguo Zhou
773a5bbbaa Optimize predicate with toYear converter
Date converters, such as toYear, are widely used in the WHERE
clauses of SQL queries. However, these conversions are often
expensive due to the complexity of the calendar system.

The function preimage offers an optimization for predicates
with such converters. Given a predicate toYear(c) = y, we can
convert it to its equivalent form: c >= b AND c <= e, where b is
"y-01-01" and e is "y-12-31". A similar transformation applies
to other comparisons (<>, <, >, <=, >=).

This commit implements the above transformation at the AST level
by adding a new pass in the TreeOptimizer and a new AST visitor
that replaces toYear predicates in place with the converted
ones.
2023-05-25 09:11:51 +08:00
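
The rewrite described in commit 773a5bbbaa can be illustrated with a minimal sketch. This is not the ClickHouse implementation (which works on the AST inside the TreeOptimizer); the function names `to_year_preimage`, `matches_by_conversion`, and `matches_by_range` are hypothetical and exist only to show that the range form selects the same dates as the toYear(c) = y form.

```python
from datetime import date, timedelta


def to_year_preimage(year: int) -> tuple:
    """Return the inclusive range (b, e) of dates whose year equals `year`."""
    return date(year, 1, 1), date(year, 12, 31)


def matches_by_conversion(c: date, year: int) -> bool:
    # Original predicate: toYear(c) = y (converts every value of c).
    return c.year == year


def matches_by_range(c: date, year: int) -> bool:
    # Rewritten predicate: c >= b AND c <= e (plain comparisons on c).
    b, e = to_year_preimage(year)
    return b <= c <= e


def agree_on_range(start: date, days: int, year: int) -> bool:
    """Check that both predicate forms agree on `days` consecutive dates."""
    return all(
        matches_by_conversion(start + timedelta(days=i), year)
        == matches_by_range(start + timedelta(days=i), year)
        for i in range(days)
    )
```

The point of the range form is that the bounds b and e are computed once per query, so the column values are compared directly and never pass through the calendar conversion.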
Kseniia Sumarokova
2e17503d36
Merge pull request #50187 from kssenii/fix-pg-source
Fix PostgreSQLSource reading all unread the data in onFinish
2023-05-24 22:51:48 +02:00
Alexander Gololobov
8996fcb090
Merge pull request #50193 from ClickHouse/fix_for_replicate_delete
Don't replicate delete through DDL worker if there is just 1 shard
2023-05-24 22:45:00 +02:00
Dan Roscigno
026a15d8a7
Update dns_max_consecutive_failures docs (#50196)
Co-authored-by: Nikita Mikhaylov <mikhaylovnikitka@gmail.com>
2023-05-24 21:01:59 +02:00
Alexander Sapin
3e69648268 Fxi moar 2023-05-24 20:02:03 +02:00
Alexander Sapin
1c627fbcab Fxi 2023-05-24 20:01:36 +02:00
Victor Krasnov
03ca3f96d2 Add built-in documentation to toStartOfWeek and toLastDayOfWeek functions 2023-05-24 17:40:21 +00:00
Alexander Sapin
4a4246a8cc Dedup 2023-05-24 19:39:53 +02:00
Alexander Sapin
5676a2c880 Small refactoring of encrypted disk 2023-05-24 19:34:51 +02:00
avogar
bc527c7588 Don't send head request for all keys in Iceberg schema inference 2023-05-24 17:07:31 +00:00
Alexander Sapin
2a3362e0c8 Implement encrypted disk transaction and fix shared merge tree with encrypted disk 2023-05-24 17:44:40 +02:00
Alexander Gololobov
de0a074545 Don't replicate delete through DDL worker if there is just 1 shard 2023-05-24 16:10:31 +02:00
ltrk2
f76f989b53 Implement a uniform way to query processor core IDs 2023-05-24 13:33:05 +00:00
kssenii
07eedc8ef1 Fix 2023-05-24 15:03:11 +02:00
alesapin
7c0c49c9d2
Merge pull request #50154 from hanfei1991/hanfei/fix-modify-order-by
do not allow modify order by when there are no order by cols
2023-05-24 15:01:38 +02:00
helifu
2255b0287a Add 'partitions' field for system.query_log 2023-05-24 20:42:31 +08:00
Igor Nikonov
2f5ed81e0d
Merge branch 'master' into fill_with_by_sorting_prefix_2 2023-05-24 14:40:44 +02:00
Alexander Tokmakov
ffdd916694
Merge pull request #50180 from ClickHouse/tavplubix-patch-6
Update an exception message
2023-05-24 15:01:50 +03:00
Kruglov Pavel
9545100c9e
Merge pull request #45427 from attack204/urlCluster
Add urlCluster table function and refactor all *Cluster table functions
2023-05-24 13:32:56 +02:00
Alexander Tokmakov
486153d581
Update MergeTreeData.cpp 2023-05-24 13:33:28 +03:00
Kseniia Sumarokova
91eb3ad2bc
fix clang-tidy build 2023-05-24 12:14:15 +02:00
vdimir
3f892ceb12
Merge pull request #49816 from bigo-sg/grace_hash_reserve_hash_table 2023-05-24 11:48:19 +02:00
LiuYangkuan
0df4164180 Merge remote-tracking branch 'origin/master' into cluster_discovery 2023-05-24 17:37:01 +08:00
Amos Bird
8bbfdcc56c
Fix index analysis with binary operator null 2023-05-24 15:47:38 +08:00
Amos Bird
b11aa42db9
Fix tests 2023-05-24 14:27:49 +08:00
robot-ch-test-poll2
2b48a483f2
Merge pull request #50151 from ClickHouse/Avogar-patch-1
Change fields destruction order in AsyncTaskExecutor
2023-05-24 03:49:32 +02:00
Alexey Milovidov
3e1267c839
Merge pull request #50152 from ClickHouse/tavplubix-patch-6
Follow-up to #49889
2023-05-24 01:05:24 +03:00
Robert Schulze
889489b02e
Merge branch 'master' into space 2023-05-23 23:18:19 +02:00
Igor Nikonov
e9c86527b0
Merge branch 'master' into fill_with_by_sorting_prefix_2 2023-05-23 22:58:21 +02:00
Han Fei
037c5f8a06
Merge branch 'master' into hanfei/fix-modify-order-by 2023-05-23 20:48:27 +02:00
Igor Nikonov
2c01104c3f Clarification comment on retries controller behavior 2023-05-23 17:30:22 +00:00
Han Fei
584c05d8b8 fix modify order by when there was no order by cols 2023-05-23 18:54:36 +02:00
Raúl Marín
db4b3d19ae
Clearer coordinator log (#50101) 2023-05-23 17:30:27 +02:00
Alexander Tokmakov
64ee8ebb12
Update MutateTask.cpp 2023-05-23 18:11:08 +03:00
Amos Bird
b82ff979d0
Fix invalid index analysis for date related keys 2023-05-23 23:10:34 +08:00
Igor Nikonov
8645af5809 Hoping to get into next release 2023-05-23 14:54:22 +00:00
Kruglov Pavel
4689412ab3
Change fields destruction order in AsyncTaskExecutor 2023-05-23 16:14:24 +02:00
Anton Popov
3a955661da
Merge pull request #50123 from CurtizJ/fix-multiif-crash
Fix crash with `multiIf` and constant condition and nullable arguments
2023-05-23 14:29:32 +02:00
Robert Schulze
285e8f4ae1
Protect against DOS 2023-05-23 12:16:49 +00:00
SmitaRKulkarni
55af60ea3f
Merge branch 'master' into 42192_Lower_parallel_parsing_threads_with_globs 2023-05-23 13:59:15 +02:00
Robert Schulze
f850a448ec
Merge pull request #49870 from arenadata/ADQM-808
clickhouse-client: accept queries after "--multiquery" argument
2023-05-23 13:48:52 +02:00
avogar
3c1aeaaa79 Change default value of handshake_timeout to 10 sec, fix possible use-after-free 2023-05-23 11:39:40 +00:00
kssenii
241e75197e Fix 2023-05-23 13:31:50 +02:00
Antonio Andelic
3e6314675c
Merge pull request #49930 from AVMusorin/write-buffer-from-s3
Fix metrics `WriteBufferFromS3Bytes`, `WriteBufferFromS3Microseconds` and `WriteBufferFromS3RequestsErrors`
2023-05-23 12:26:05 +02:00
Igor Nikonov
fbcc944d2f Merge remote-tracking branch 'origin/master' into fill_with_by_sorting_prefix_2 2023-05-23 10:25:38 +00:00
Kseniia Sumarokova
13cd2d6d5f
Merge pull request #50021 from kssenii/fix-logical-error-in-try-reserve
Fix logical error in stress test "Not enough space to add ..."
2023-05-23 12:05:01 +02:00
Alexander Tokmakov
3ac7bc90ef
Merge pull request #50108 from ClickHouse/update_replicated_database_settings
Update default settings for Replicated database
2023-05-23 12:56:12 +03:00
Kruglov Pavel
136c3caf03
Merge branch 'master' into handshake-timeout 2023-05-23 11:53:54 +02:00
Kruglov Pavel
66e111a6aa
Fix tests 2023-05-23 11:52:44 +02:00
Kruglov Pavel
0e346c78ae
Merge pull request #49960 from Avogar/fix-tsv-nullable-parsing
Fix possible Logical error on bad Nullable parsing for text formats
2023-05-23 11:42:07 +02:00
Alexander Tokmakov
141a72d694
Merge pull request #49637 from ClickHouse/less_zookeeper_requests
Provide better partitions hint for merge selecting task
2023-05-23 12:40:39 +03:00
robot-clickhouse-ci-1
a05088ab73
Merge pull request #50105 from ClickHouse/analyzer-table-function-fix
Analyzer: Do not execute table functions multiple times
2023-05-23 10:16:34 +02:00
Alexey Gerasimchuk
30f3b3ba04
Merge branch 'master' into ADQM-808 2023-05-23 17:03:54 +10:00
Robert Schulze
f4c73e94d2
Merge pull request #49989 from arenadata/ADQM-811
Add support of Date|Date32 arguments to the toUnixTimestamp() function
2023-05-23 08:55:56 +02:00
Robert Schulze
d9a7227cf4
Fix style check 2023-05-23 06:49:19 +00:00
Victor Krasnov
07d9f33b2e Improve toFirstDayNumOfWeek infinitesimally 2023-05-23 04:01:44 +00:00
Alexey Gerasimchuk
df751f1bca
Merge branch 'master' into ADQM-808 2023-05-23 13:43:18 +10:00
robot-ch-test-poll4
945673565c
Merge pull request #50059 from ClickHouse/cache-try-reserver-cleanup
FileCache: simple tryReserve() cleanup
2023-05-23 04:02:37 +02:00
Alexey Gerasimchuck
ab5e16a713 Changes after second review iteration 2023-05-23 00:27:17 +00:00
Anton Popov
f8905acb46 fix crash with multiif and constant condition and nullable arguments 2023-05-22 23:31:50 +00:00
Alexey Milovidov
dc4cb5223b
Merge pull request #50056 from kitaisreal/jit-compilation-not-equals-nan-fix
JIT compilation not equals NaN fix
2023-05-23 01:45:14 +03:00
Igor Nikonov
3a29f275e0 Fix: do not generate suffix on new chunk if didn't reach current range end 2023-05-22 21:50:12 +00:00
Kseniia Sumarokova
ce6054590f
Fix bad merge 2023-05-22 22:49:09 +02:00
avogar
646eeb63a4 Fix build 2023-05-22 19:46:05 +00:00
avogar
4f85d6a1bb Merge branch 'master' of github.com:ClickHouse/ClickHouse into random-structure 2023-05-22 19:43:24 +00:00
Kruglov Pavel
cee6c3914f
Fix build 2023-05-22 21:36:55 +02:00
avogar
bf19765c9b Fix possible use-of-uninitialized-value 2023-05-22 19:34:19 +00:00
Kruglov Pavel
979df2a028
Merge branch 'master' into fix-tsv-nullable-parsing 2023-05-22 21:25:05 +02:00
avogar
88e4c93abc Merge branch 'master' of github.com:ClickHouse/ClickHouse into urlCluster 2023-05-22 19:19:57 +00:00
Robert Schulze
d76498dca0
reserve() --> resize() 2023-05-22 19:19:08 +00:00
Robert Schulze
d5cfcdfae1
String terminator: \n --> \0 2023-05-22 19:10:03 +00:00
Igor Nikonov
4c97ca0a9e Merge remote-tracking branch 'origin/master' into fill_with_by_sorting_prefix_2 2023-05-22 18:59:58 +00:00
avogar
2541ad69d5 Fix bad conflicts resolving 2023-05-22 18:23:39 +00:00
avogar
17b639c612 Make better 2023-05-22 18:22:05 +00:00
kssenii
295fe3b228 Merge remote-tracking branch 'upstream/master' into abstract-async-prefetched-buffer 2023-05-22 20:01:38 +02:00
kssenii
c4d862a16f Make async reader work with any impl 2023-05-22 19:54:04 +02:00
Alexander Tokmakov
5b768ebd97 update default settings for Replicated database 2023-05-22 19:32:32 +02:00
Alexander Tokmakov
c2b1e8ca0d fix 2023-05-22 19:30:28 +02:00
Dmitry Novik
0ad041e7c3 Remove redundant code 2023-05-22 16:42:14 +00:00
Dmitry Novik
fc10ba871f Analyzer: Do not execute table functions multiple times 2023-05-22 16:36:34 +00:00
Kseniia Sumarokova
c6e4db969f
Merge pull request #50028 from kssenii/some-minor-changes
Move some common code to common
2023-05-22 18:25:24 +02:00
Sergei Trifonov
f099e3f41e
Merge pull request #50045 from azat/ddl-fix-opentelemetry
Add proper escaping for DDL OpenTelemetry context serialization
2023-05-22 18:20:17 +02:00
Nikolay Degterinsky
d4b89cb643
Merge pull request #49356 from Ziy1-Tan/vcol
Support for `_path` and `_file` virtual columns for table function `url`.
2023-05-22 18:10:32 +02:00
Nikolay Degterinsky
7bed59e1d2
Merge pull request #50000 from evillique/add-schema-inference
Add schema inference to more table engines
2023-05-22 17:24:30 +02:00
Robert Schulze
df436b2cd4
Spark compatibility: Add new function space() 2023-05-22 14:52:51 +00:00
Igor Nikonov
48ad2896c7 Remove segments from candidates as soon as handle them 2023-05-22 14:13:37 +00:00
avogar
ea59d2ec5d Allow custom cleanup function 2023-05-22 14:06:46 +00:00
Alexander Tokmakov
821b64b420 apply review suggestions 2023-05-22 15:18:29 +02:00
Kruglov Pavel
054ffc47b7
Merge branch 'master' into fiber-local-var-2 2023-05-22 15:17:45 +02:00
kssenii
6bfbbc94bf A little better 2023-05-22 15:04:27 +02:00
Kruglov Pavel
b5cad024e0
Merge branch 'master' into urlCluster 2023-05-22 14:59:34 +02:00
avogar
ef09ed7117 Fix assert in SpanHolder::finish() with fibers attempt 2 2023-05-22 12:35:53 +00:00
alesapin
cc3897a84a
Merge pull request #50033 from kssenii/disk-object-storage-minor-changes
Get rid of indirect write buffer in object storages
2023-05-22 14:08:17 +02:00
Alexander Tokmakov
876490ff40
Merge pull request #50065 from azat/dict/load-factor-range-fix
Fix hashed/sparse_hashed dictionaries max_load_factor upper range
2023-05-22 15:04:56 +03:00
Alexander Tokmakov
487e510103 Merge branch 'master' into less_zookeeper_requests 2023-05-22 13:59:26 +02:00
Igor Nikonov
d27b88538d
Fix grammar
Co-authored-by: Kseniia Sumarokova <54203879+kssenii@users.noreply.github.com>
2023-05-22 13:41:50 +02:00
Sergei Trifonov
0f90bfdd11
Merge pull request #50051 from ClickHouse/add-more-distributed-connection-profile-events
Add more profile events for distributed connections
2023-05-22 13:37:56 +02:00
Sergei Trifonov
7fbfa4b21e
Merge branch 'master' into async-loader-workloads 2023-05-22 13:31:42 +02:00
Sergei Trifonov
b1d5ef5440
Merge pull request #49995 from azat/dict/server-memory
Charge only server memory for dictionaries
2023-05-22 13:27:13 +02:00
Kruglov Pavel
e7c0b75a5a
Merge pull request #50001 from ClickHouse/compliant
Add setting output_format_parquet_compliant_nested_types to produce more compatible Parquet files
2023-05-22 13:20:51 +02:00
Alexander Tokmakov
c89f92e1f6
Merge pull request #50052 from amosbird/fix_49913
Fix reporting broken projection parts
2023-05-22 14:20:06 +03:00
Alexander Tokmakov
8dbf7beb32
Merge pull request #50015 from azat/ddl-initial-query-id
Preserve initial_query_id for ON CLUSTER queries
2023-05-22 14:03:45 +03:00
kssenii
fa479a870b Merge remote-tracking branch 'upstream/master' into fix-logical-error-in-try-reserve 2023-05-22 13:02:25 +02:00
kssenii
2d0ebba67f Better 2023-05-22 12:58:56 +02:00
Nikolai Kochetov
dc22e90d2d
Merge pull request #49508 from ClickHouse/fix-tracecollector
Fix bug in TraceCollector destructor.
2023-05-22 12:28:17 +02:00
Maksim Kita
804e5e12ba JIT compilation not equals NaN fix 2023-05-22 13:14:27 +03:00
Igor Nikonov
85893b1a0b Clarify dropping removal_candidate flag with comment 2023-05-22 09:49:16 +00:00
Alexander Tokmakov
27e12fb5c9
Merge pull request #50058 from azat/replicated-db-fix-crash
Fix crashing in case of Replicated database without arguments
2023-05-22 12:46:36 +03:00
lgbo-ustc
0573d79ff9 update 2023-05-22 17:45:18 +08:00
lgbo-ustc
f33f1e4840 roll back 2023-05-22 17:45:18 +08:00
lgbo-ustc
4d24b645f0 fixed: alloca new hash table when bytes is oveflow 2023-05-22 17:44:27 +08:00
lgbo-ustc
02b04fd9bf reuse previous hash table's space directly 2023-05-22 17:43:45 +08:00
lgbo-ustc
826aa8021a fixed: unnecessary hash table allocation 2023-05-22 17:42:38 +08:00
lgbo-ustc
8bc4a3b2c0 try to reserve hash table size 2023-05-22 17:42:38 +08:00
Victor Krasnov
98aace14ae Add DATE_SECONDS_PER_DAY macro definition to replace the numeric literal 86400 2023-05-22 09:23:23 +00:00
vdimir
fa93c388b1
Merge pull request #49483 from bigo-sg/grace_hash_full_join 2023-05-22 11:16:08 +02:00
Han Fei
a2c0a65344
Merge pull request #49919 from hanfei1991/hanfei/fix-optimize-regexp-prefix
fix `is_prefix` in OptimizeRegularExpression
2023-05-22 10:50:36 +02:00
Han Fei
a257ff6cf3 address comment 2023-05-22 10:41:22 +02:00
Antonio Andelic
3a46fe1803
Merge branch 'master' into write-buffer-from-s3 2023-05-22 09:28:14 +02:00
Antonio Andelic
b20ce5309f
Merge pull request #50020 from ClickHouse/keeper-check-ftruncate
Check return value of `ftruncate` in Keeper
2023-05-22 09:27:02 +02:00
Antonio Andelic
93cd47fd5f
Merge pull request #50026 from ClickHouse/fix-deadlock-attach-thread
Avoid deadlock when starting table in attach thread of `ReplicatedMergeTree`
2023-05-22 09:22:33 +02:00
Azat Khuzhin
c30658a9ed Fix hashed/sparse_hashed dictionaries max_load_factor upper range
Previously, due to comparing floats with doubles, the check worked
incorrectly for the upper range:

    (lldb) p (float)0.99 > (float)0.99
    (bool) $0 = false
    (lldb) p (float)0.99 > (double)0.99
    (bool) $1 = true

This should also fix performance tests errors on CI:

    clickhouse_driver.errors.ServerException: Code: 36.
    DB::Exception: default.simple_key_HASHED_dictionary_l0_99: max_load_factor parameter should be within [0.5, 0.99], got 0.99. Stack trace:

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-22 08:59:48 +02:00
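
The float/double mismatch reported in commit c30658a9ed can be reproduced outside the debugger. The sketch below is an illustration only, not ClickHouse code: `to_float32`, `exceeds_mixed`, and `exceeds_same_width` are hypothetical helpers, and the constant 0.99 is taken from the error message quoted in the commit. Python floats are IEEE 754 doubles, so single precision is emulated with `struct`.

```python
import struct

# Upper bound quoted in the error message of the commit.
MAX_LOAD_FACTOR = 0.99


def to_float32(x: float) -> float:
    """Round a double to the nearest single-precision value, returned as a double."""
    return struct.unpack("f", struct.pack("f", x))[0]


def exceeds_mixed(value: float) -> bool:
    # Mirrors (float)value > (double)0.99: the float operand is widened to
    # double, but the rounding error from single precision remains.
    return to_float32(value) > MAX_LOAD_FACTOR


def exceeds_same_width(value: float) -> bool:
    # Mirrors (float)value > (float)0.99: both sides are rounded the same way.
    return to_float32(value) > to_float32(MAX_LOAD_FACTOR)
```

The nearest single-precision value to 0.99 is slightly larger than the nearest double, which is why the mixed comparison rejects a value that is exactly on the upper bound.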
SmitaRKulkarni
bfb5ab8d73
Merge branch 'master' into parametrized-views-multi-use 2023-05-22 08:56:06 +02:00
lgbo-ustc
35d8388705 update 2023-05-22 10:17:41 +08:00
lgbo-ustc
2582399271 fixed: a in-memory bucket contains rows of other buckets 2023-05-22 10:17:41 +08:00
lgbo-ustc
603c024eb0 ensure only the last processor could access non-joined blocks 2023-05-22 10:17:41 +08:00
lgbo-ustc
29ade23397 fixed: return invalid mismatch rows on full/right join 2023-05-22 10:17:41 +08:00
lgbo-ustc
80af345ea6 update 2023-05-22 10:17:41 +08:00
lgbo-ustc
d5efc0e688 update 2023-05-22 10:17:41 +08:00
lgbo-ustc
8efec9bcca add locks for getNonJoinedBlocks 2023-05-22 10:17:41 +08:00
lgbo-ustc
89dd538bea update 2023-05-22 10:17:41 +08:00
lgbo-ustc
5c44e6a562 triger ci 2023-05-22 10:17:40 +08:00
lgbo-ustc
d89beb1bf7 update tests 2023-05-22 10:17:40 +08:00
lgbo-ustc
7772fed161 update
1. fixed the memory overflow problem when handling all delayed buckets in parallel
2. reuse existing tests
2023-05-22 10:17:40 +08:00
lgbo-ustc
39db0f84d9 add comment 2023-05-22 10:17:40 +08:00
lgbo-ustc
39ff030a6e grace hash join supports right/full join 2023-05-22 10:17:40 +08:00
Igor Nikonov
f5dc07d052 tryReserve() cleanup
simplify removing eviction candidates
2023-05-21 22:01:28 +00:00
Azat Khuzhin
ef06bb8f14 Fix crashing in case of Replicated database without arguments
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-21 23:12:39 +02:00
Azat Khuzhin
66cf16410d Preserve initial_query_id for ON CLUSTER queries
v2: add proper escaping
v3: set distributed_ddl_output_mode=none for test to fix replicated database build
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-21 23:04:54 +02:00
Azat Khuzhin
b6cc504717 Remove Common/OpenTelemetryTraceContext.h from Context.h
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-21 23:04:33 +02:00
Azat Khuzhin
0586a27432 Charge only server memory for dictionaries
Right now the memory is charged to the query/user for a dictionary, but
only if it is loaded by the user (via SYSTEM RELOAD QUERY or via dictGet()).
However, it could also be loaded in the background (due to lifetime or
update_field), so, as with Buffer, only server memory should be
charged.

v2: mark test as long
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Co-authored-by: Sergei Trifonov <svtrifonov@gmail.com>
2023-05-21 22:53:52 +02:00
serxa
372601d6df fix 2023-05-21 17:04:49 +00:00
serxa
cd11c25864 fix test + more testing for dynamic prioritization 2023-05-21 17:04:00 +00:00
serxa
c5765e71f9 requeue jobs w/o allocations and spawn workers during prioritization 2023-05-21 17:02:56 +00:00
Sergei Trifonov
e33455ed87
Merge branch 'master' into async-loader-workloads 2023-05-21 16:30:58 +02:00
serxa
128b8e5889 fix tests + add test for dynamic pools 2023-05-21 14:28:16 +00:00
Robert Schulze
2a9ff30a7f
Merge pull request #49380 from azat/dict/hashed-memory
Improve memory usage and speed of SPARSE_HASHED/HASHED dictionaries
2023-05-21 15:46:41 +02:00
Amos Bird
0a3d986e42
Fix reporting projection broken part 2023-05-21 20:58:58 +08:00
serxa
44b1754ccf more profile events 2023-05-21 12:43:47 +00:00
serxa
c56e6a8b80 Add more profile events for distributconnections 2023-05-21 12:15:06 +00:00
Sergei Trifonov
3c002755e2
Merge pull request #50036 from ClickHouse/fix-load-balancing
Load balancing bugfixes
2023-05-21 11:21:55 +02:00
kssenii
8924c17575 Fix build 2023-05-20 13:31:27 +02:00
vdimir
8b77e2096c
Merge pull request #49760 from arthurpassos/extract_kv_ignore_kv_delimiter_when_reading_value 2023-05-20 13:27:59 +02:00
Azat Khuzhin
82054d40a5 Add proper escaping for DDL OpenTelemetry context serialization
Previously you could break the format by using "\n" or "\t", which would
simply lead to a DDL hang, because DDLWorker would simply log the error and
do nothing more:

    <Error> DDLWorker: Cannot parse DDL task query-0000000056: Incorrect task format. Will try to send error status: Code: 27. DB::ParsingException: Cannot parse input: expected '\n' before: 'bar\n1\n'. (CANNOT_PARSE_INPUT_ASSERTION_FAILED) (version 23.5.1.1)

Fix this by adding proper escaping.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-20 13:15:49 +02:00
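
The failure mode described in commit 82054d40a5 is typical of any line-based serialization. The sketch below is an assumption-laden illustration, not the actual DDL task format or ClickHouse's escaping routines: the one-field-per-line layout and the helpers `escape`, `unescape`, `serialize`, `deserialize`, and `serialize_naive` are hypothetical.

```python
def escape(value: str) -> str:
    """Escape backslash, newline and tab so a value always fits on one line."""
    return value.replace("\\", "\\\\").replace("\n", "\\n").replace("\t", "\\t")


def unescape(value: str) -> str:
    """Reverse escape(): turn the two-character sequences back into characters."""
    out = []
    chars = iter(value)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            out.append({"n": "\n", "t": "\t", "\\": "\\"}.get(nxt, nxt))
        else:
            out.append(ch)
    return "".join(out)


def serialize(fields: list) -> str:
    # One escaped field per line; escaped values contain no raw newline.
    return "".join(escape(f) + "\n" for f in fields)


def deserialize(text: str) -> list:
    # Drop the empty element after the final newline, then unescape each line.
    return [unescape(line) for line in text.split("\n")[:-1]]


def serialize_naive(fields: list) -> str:
    # No escaping: a newline inside a value is indistinguishable from a separator.
    return "".join(f + "\n" for f in fields)
```

With the naive writer, a value such as "foo\nbar" is read back as two fields, which is the kind of mismatch that makes a parser give up on the whole task.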
Azat Khuzhin
52c5fd5cb9 Rewrite OpenTelemetry context serialization for DDL without IO/Operators.h
This is required to switch to escaped versions.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-20 13:14:40 +02:00
Kseniia Sumarokova
a0480daef3
Update waitServersToFinish.h 2023-05-20 12:43:24 +02:00
Igor Nikonov
fbcbd3ab90
Merge pull request #49846 from ClickHouse/clearable_hash_set_without_zero_storage
Clearable hash table and zero values
2023-05-20 11:19:44 +02:00
Azat Khuzhin
7189481fad Preserve backward incompatibility for renamed settings by using aliases
- optimize_use_projections/allow_experimental_projection_optimization
- enable_lightweight_delete/allow_experimental_lightweight_delete

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-20 09:07:41 +02:00
LiuYangkuan
6e80537ab6 support passing fqdn to register cluster node in keeper 2023-05-20 12:41:48 +08:00
Alexey Milovidov
4e3188126f
Merge pull request #49050 from FFFFFFFHHHHHHH/dot_product
Add Function dotProduct for array
2023-05-20 03:07:13 +03:00
Alexey Milovidov
2323542e47
Merge pull request #50022 from ClickHouse/geo-types-production-ready
Geo types are production ready
2023-05-20 02:02:23 +03:00
Alexey Milovidov
54f7b8e6ab
Merge pull request #50030 from kssenii/aws-client-save-provider
Add method getCredentials() to S3::Client
2023-05-20 01:59:58 +03:00
Nikita Taranov
c93836b962 fix 2023-05-19 22:27:53 +00:00
Igor Nikonov
af80e29519
Merge branch 'master' into clearable_hash_set_without_zero_storage 2023-05-19 23:36:30 +02:00
Sergei Trifonov
14e8132ac4
Merge branch 'master' into fix-load-balancing 2023-05-19 23:05:27 +02:00
Michael Kolupaev
6fd5d8e8ba Add setting output_format_parquet_compliant_nested_types to produce more compatible Parquet files 2023-05-19 18:39:50 +00:00
alekar
de710209a7
Merge branch 'master' into fix-osx-setsockopt-errors 2023-05-19 11:15:01 -07:00
serxa
052d8aca71 limit max_tries value by max_error_cap to avoid unlimited number of retries 2023-05-19 18:13:29 +00:00
serxa
d69c35fcdd fix PoolWithFailover error_count integer overflow 2023-05-19 17:57:00 +00:00
serxa
086888b285 fix ConnectionPoolWithFailover::getPriority 2023-05-19 17:54:29 +00:00
serxa
35e77f8e2a fix comment 2023-05-19 17:53:22 +00:00
kssenii
791bb6cd4c Fix style check 2023-05-19 17:35:01 +02:00
kssenii
3e42ee7f2b Get rid of finalize callback in object storages 2023-05-19 17:29:37 +02:00
Antonio Andelic
4af8187464 Activate restarting thread in both cases 2023-05-19 15:06:02 +00:00
kssenii
b29edc4737 Add method 2023-05-19 16:38:14 +02:00
kssenii
0eab528f9f Move common code 2023-05-19 16:23:56 +02:00
mateng915
5237dd0245
New system table zookeeper connection (#45245)
* Feature: Support new system table to show which zookeeper node is connected

Description:
============
Currently there is no place to check which zk node is connected other than the
lsof command, which is not convenient.

Solution:
=========
Implemented a new system table, system.zookeeper_host. When the CK server uses zk,
this table shows the zk node dir that the current CK server is connected to.

Note: This table supports the multi-zookeeper cluster scenario.

* fixed review comments

* added test case

* update test cases

* remove unused code

* fixed review comments and removed unused code

* updated test cases to print host, port and is_expired

* modify the code comments

* fixed CI Failed

* fixed code style check failure

* updated test cases by adding Tags

* update test reference

* update test cases

* added system.zookeeper_connection doc

* Update docs/en/operations/system-tables/zookeeper_connection.md

* Update docs/en/operations/system-tables/zookeeper_connection.md

* Update docs/en/operations/system-tables/zookeeper_connection.md

---------

Co-authored-by: Alexander Tokmakov <tavplubix@gmail.com>
2023-05-19 17:06:43 +03:00
Sergei Trifonov
67bf9ac539
Merge pull request #49797 from azat/fix-throttlers
Fix per-query IO/BACKUPs throttling settings
2023-05-19 15:51:57 +02:00
Antonio Andelic
acf71c5b9a
Fix typo 2023-05-19 15:48:31 +02:00
Antonio Andelic
7f60af11cb
Merge branch 'master' into write-buffer-from-s3 2023-05-19 15:17:05 +02:00
Antonio Andelic
3107070e76 Avoid deadlock when starting table in attach thread 2023-05-19 12:48:19 +00:00
Dmitry Novik
d705e5102b
Merge pull request #49838 from ClickHouse/group-by-constant-fix
Analyzer: do not optimize GROUP BY keys with ROLLUP and CUBE
2023-05-19 14:27:34 +02:00
kssenii
3121a57912 Add some assertions 2023-05-19 14:21:07 +02:00
Sergei Trifonov
5db5f6e44b
Merge branch 'master' into fix-throttlers 2023-05-19 14:08:36 +02:00
Alexey Milovidov
ab162756ba
Merge branch 'master' into dot_product 2023-05-19 14:46:53 +03:00
Nikolay Degterinsky
a09a8d60c4
Merge branch 'master' into postgresql-uuid 2023-05-19 13:44:17 +02:00
Antonio Andelic
9c3b17fa18
Remove whitespace 2023-05-19 13:00:51 +02:00
alesapin
632ab8a3d1
Merge pull request #49996 from ClickHouse/az
Fix test_insert_same_partition_and_merge failing if one Azure request attempt fails
2023-05-19 12:58:47 +02:00
Antonio Andelic
e46476dba2
Update src/Coordination/Changelog.cpp
Co-authored-by: alesapin <alesapin@clickhouse.com>
2023-05-19 12:44:20 +02:00
Alexey Milovidov
f5506210d6 Geo types are production ready 2023-05-19 12:43:55 +02:00
alesapin
e741450b88
Merge branch 'master' into fix_another_zero_copy_bug 2023-05-19 12:40:48 +02:00
alesapin
e5b001abda
Merge branch 'master' into fix_some_tests4 2023-05-19 12:34:03 +02:00
kssenii
d6df009842 Fix 2023-05-19 12:22:46 +02:00
Antonio Andelic
6e468b29e8 Check return value of ftruncate 2023-05-19 10:15:06 +00:00
Alexey Milovidov
70c83f5133
Merge pull request #49991 from amosbird/clickhouse_as_library
Use PROJECT_*_DIR instead of CMAKE_*_DIR.
2023-05-19 12:37:18 +03:00
Azat Khuzhin
e1e2a83a9e Print type of the structure that will be used for HASHED/SPARSE_HASHED
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
f8e7d2cb1f Remove part of the HashTableGrowerWithPrecalculationAndMaxLoadFactor comment
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
c9cde110cd Add initial degree as parameter for HashTableGrowerWithPrecalculationAndMaxLoadFactor
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
01bf041cca Rewrite HashTableGrower{,WithPrecalculation}::set w/o ternary operators
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
634f168a74 Introduce max_size_degree for HashTableGrower{,WithPrecalculation}
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
42eac6bfbc Wrap implementation helpers into HashedDictionaryImpl namespace
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
6f351851ad Rename grower to HashTableGrowerWithPrecalculationAndMaxLoadFactor
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
1ab130132c Add more comments into HashedDictionaryCollectionType.h
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
7eba6def94 Add a comment for HashTableGrowerWithPrecalculation about load factor
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
422cbe08fe Do not use PackedHashMap for non-POD for the purposes of layout
In clang-16 the behaviour for POD types was changed in [1]; this
does not allow us to use PackedHashMap for some types.

  [1]: 277123376c

Note that I tried to come up with a more generic solution than
enumerating types, but failed. Though now I think that this is good,
since it shows which types are not allowed for PackedHashMap.

Another option is to use -fclang-abi-compat=13.0 but I doubt it is a
good idea.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
fc19e79f50 Change coding style of declaring packed attribute in PackedHashMap
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
65dd87d0da Fix "reference binding to misaligned address" in PackedHashMap
Use separate helpers that accept/return values instead of references;
PackedHashMap is designed for small structures anyway.
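
A minimal sketch of the by-value approach, assuming a GCC/Clang-style
packed attribute. PackedCell, getKey, setKey and checkRoundTrip are
hypothetical names for illustration, not the helpers from this commit:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 10-byte cell; in an array, every second cell starts at an
// address that is not a multiple of 8, so its key is misaligned.
struct __attribute__((packed)) PackedCell
{
    std::uint64_t key;
    std::uint16_t value;
};

// Return the key by value: the compiler knows the field is packed and emits
// an unaligned-safe load, and no std::uint64_t reference is ever formed.
inline std::uint64_t getKey(const PackedCell & cell) { return cell.key; }

// Accept the key by value and store it into the packed field.
inline void setKey(PackedCell & cell, std::uint64_t key) { cell.key = key; }

// Write and read back the key of the second cell, which sits at byte offset 10.
inline void checkRoundTrip()
{
    PackedCell cells[2] = {};
    setKey(cells[1], 42);
    assert(getKey(cells[1]) == 42);
}
```

Binding a `std::uint64_t &` to `cells[1].key` is what the sanitizer
reports; copying the value avoids forming such a reference.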

v0: fix for keys
v2: fix for values
v3: fix bitEquals
v4: fix for iterating over HashMap
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
7c8d8eeb56 Use Cell::setMapped() over separate helper insertSetMapped()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
2996b38606 Add ability to configure maximum load factor for the HASHED/SPARSE_HASHED layout
As it turns out, HashMap/PackedHashMap works great even with a max load
factor of 0.99. By "great" I mean that at least it works faster than
google sparsehash, not to mention its friendliness to the memory
allocator (it has zero fragmentation since it works with a contiguous
memory region, in comparison to sparsehash, which does lots of
realloc, which jemalloc does not like, due to its slabs).

Here is a table of different setups:

settings                         | load (sec) | read (sec) | read (million rows/s) | bytes_allocated | RSS
-                                | -          | -          | -                     | -               | -
HASHED upstream                  | -          | -          | -                     | -               | 35GiB
SPARSE_HASHED upstream           | -          | -          | -                     | -               | 26GiB
-                                | -          | -          | -                     | -               | -
sparse_hash_map glibc hashbench  | -          | -          | -                     | -               | 17.5GiB
sparse_hash_map packed allocator | 101.878    | 231.48     | 4.32                  | -               | 17.7GiB
PackedHashMap 0.5                | 15.514     | 42.35      | 23.61                 | 20GiB           | 22GiB
hashed 0.95                      | 34.903     | 115.615    | 8.65                  | 16GiB           | 18.7GiB
**PackedHashMap 0.95**           | **19.883** | **93.6**   | **10.68**             | **10GiB**       | **12.8GiB**
PackedHashMap 0.99               | 26.113     | 83.6       | 11.96                 | 10GiB           | 12.3GiB

As it shows, PackedHashMap with 0.95 max_load_factor eats 2.6x less
memory than SPARSE_HASHED in upstream, and it is also 2x faster for read!
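
The bytes_allocated column can be approximated with a short sketch. It
assumes the table holds 1e9 elements, that the bucket count is always a
power of two, and that a cell takes 10 bytes when packed and 16 bytes
when padded. bucketsFor and tableBytes are hypothetical helpers, not the
actual grower:

```cpp
#include <cassert>
#include <cstdint>

// Smallest power-of-two bucket count that keeps
// elements / buckets <= max_load_factor.
inline std::uint64_t bucketsFor(std::uint64_t elements, double max_load_factor)
{
    assert(max_load_factor > 0.0 && max_load_factor <= 1.0);
    std::uint64_t buckets = 1;
    while (static_cast<double>(elements) > max_load_factor * static_cast<double>(buckets))
        buckets *= 2;
    return buckets;
}

// Bytes needed for one contiguous array of cells of the given size.
inline std::uint64_t tableBytes(std::uint64_t elements, double max_load_factor, std::uint64_t cell_size)
{
    return bucketsFor(elements, max_load_factor) * cell_size;
}
```

Under these assumptions, 1e9 elements need 2^31 buckets at a load
factor of 0.5 (20GiB with 10-byte cells) but only 2^30 buckets at 0.95
or 0.99 (10GiB packed, 16GiB with 16-byte cells).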

v2: fix grower
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
3698302ddb Accept float values for dictionary layouts configurations
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
8c6d691f52 Use HashTable constructor in HashSet
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
fb6f7631c2 Add ability to pass grower for HashTable during creation
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
7b5d156cc5 Optimize SPARSE_HASHED layout (by using PackedHashMap)
In case you want a dictionary optimized for memory, SPARSE_HASHED does not
always give you what you need.

Consider the following example: <UInt64, UInt16> as <Key, Value>. This
pair will also have 6 bytes of padding (on amd64), so almost
40% of the space is wasted.

And because of this padding, even google::sparse_hash_map does not make
the picture better; in fact, sparse_hash_map is not very friendly to memory
allocators (especially jemalloc).

Here are some numbers for a dictionary with 1e9 elements, UInt64 as
key, and UInt16 as value:

settings                         | load (sec) | read (sec) | read (million rows/s) | bytes_allocated | RSS
HASHED upstream                  | -          | -          | -                     | -               | 35GiB
SPARSE_HASHED upstream           | -          | -          | -                     | -               | 26GiB
-                                | -          | -          | -                     | -               | -
sparse_hash_map glibc hashbench  | -          | -          | -                     | -               | 17.5GiB
sparse_hash_map packed allocator | 101.878    | 231.48     | 4.32                  | -               | 17.7GiB
PackedHashMap                    | 15.514     | 42.35      | 23.61                 | 20GiB           | 22GiB

As you can see, PackedHashMap looks much better than HASHED, and
even better than SPARSE_HASHED, but slightly worse than sparse_hash_map
with packed allocator (it is done with a custom patch to google
sparse_hash_map).

v2: rebase on top of bucket_count fix
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
b44497fd4c Introduce PackedHashMap (HashMap with structure without padding)
In case you have a HashMap with <UInt64, UInt16> as <Key, Value>, the
overhead of 38% can be crucial, especially if you have tons of keys.
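
The padding figure can be reproduced with a small sketch, assuming
amd64 and a GCC/Clang-style packed attribute. PaddedCell, PackedCell
and paddingShare are hypothetical names for illustration, not the
actual PackedHashMap types:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Ordinary layout: 8 + 2 bytes of data, rounded up to the 8-byte
// alignment of the key.
struct PaddedCell
{
    std::uint64_t key;
    std::uint16_t value;
};

// Packed layout: no padding, alignment of 1.
struct __attribute__((packed)) PackedCell
{
    std::uint64_t key;
    std::uint16_t value;
};

// Fraction of a cell that is padding rather than payload.
inline double paddingShare(std::size_t cell_size, std::size_t payload_size)
{
    assert(cell_size >= payload_size && cell_size > 0);
    return static_cast<double>(cell_size - payload_size) / static_cast<double>(cell_size);
}

static_assert(sizeof(PaddedCell) == 16, "expected on amd64");
static_assert(sizeof(PackedCell) == 10, "packed cell carries no padding");
```

Here paddingShare(sizeof(PaddedCell), sizeof(PackedCell)) gives 6/16,
i.e. 37.5%, which is the roughly 38% overhead mentioned above.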

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
c4f23e87f1 Export grower_type in HashTable
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Michael Kolupaev
e84f0895e7 Support hardlinking parts transactionally 2023-05-18 21:05:56 -07:00
Yakov Olkhovskiy
a2c3de5082
Merge pull request #49933 from ClickHouse/fix-ipv6-proto-serialization
Fix IPv6 encoding in protobuf
2023-05-18 23:02:15 -04:00
Nikolay Degterinsky
ef45956713 Fix style 2023-05-19 01:31:45 +00:00
Nikolay Degterinsky
b8be714830 Add schema inference to more table engines 2023-05-19 00:44:27 +00:00
Dmitry Novik
aea71cf1bb
Merge branch 'master' into group-by-constant-fix 2023-05-19 01:29:56 +02:00
Victor Krasnov
3a3e413552 Implement toLastDayWeek function 2023-05-18 21:47:52 +00:00
Michael Kolupaev
8dc59c1efe Fix test_insert_same_partition_and_merge failing if one Azure request attempt fails 2023-05-18 21:40:24 +00:00
Amos Bird
6b4dcbd3ed
Use PROJECT_*_DIR instead of CMAKE_*_DIR. 2023-05-18 23:23:39 +08:00
Nikita Taranov
971cc092d4
Update src/Storages/MergeTree/MergeTreePrefetchedReadPool.cpp 2023-05-18 15:16:47 +02:00
Sergei Trifonov
f98c337d2f
Fix stack-use-after-scope in resource manager test (#49908)
* Fix stack-use-after-scope in resource manager test

* fix
2023-05-18 14:53:46 +02:00
Kseniia Sumarokova
adebac1a92
Merge branch 'master' into fix-assertion-in-do-cleanup 2023-05-18 12:22:02 +02:00
Victor Krasnov
83d066e5cf Re-enable Date and Date32 as parameters of toUnixTimestamp function 2023-05-18 09:07:27 +00:00
FFFFFFFHHHHHHH
d31371adac
Merge branch 'master' into dot_product 2023-05-18 15:31:25 +08:00
SmitaRKulkarni
a91c793684
Merge branch 'master' into 42192_Lower_parallel_parsing_threads_with_globs 2023-05-18 09:24:25 +02:00
Alexey Gerasimchuk
e44263d101
Merge branch 'master' into ADQM-808 2023-05-18 17:08:25 +10:00
Alexey Milovidov
86e14547d4
Merge pull request #49964 from ClickHouse/kssenii-patch-7
Follow up to #49429
2023-05-18 09:20:00 +03:00
Alexey Gerasimchuk
1fb9e36b81
Merge branch 'master' into ADQM-808 2023-05-18 07:59:02 +10:00
Kseniia Sumarokova
855c95f626
Update src/Interpreters/Cache/Metadata.cpp
Co-authored-by: Igor Nikonov <954088+devcrafter@users.noreply.github.com>
2023-05-17 22:46:09 +02:00
Azat Khuzhin
e2e3a03dbe
Revert "groupArray returns cannot be nullable" 2023-05-17 22:33:30 +02:00
Timur Solodovnikov
c7ab59302f
Set allow_experimental_query_cache setting as obsolete (#49934)
* set allow_experimental_query_cache as obsolete

* add tsolodov to trusted contributors

* CI linter

---------

Co-authored-by: Nikita Mikhaylov <mikhaylovnikitka@gmail.com>
2023-05-17 20:03:42 +02:00
Kseniia Sumarokova
1c04085e8f
Update MergeTreeWriteAheadLog.h 2023-05-17 18:15:51 +02:00
kssenii
f2dbcb5146 Better fix 2023-05-17 16:27:06 +02:00
Han Fei
ed1d036151
Merge pull request #49884 from azat/dist-fix-async-block-processing
Fix processing pending batch for Distributed async INSERT after restart
2023-05-17 15:19:42 +02:00
Nikolay Degterinsky
194ce2d881 Better 2023-05-17 13:13:57 +00:00
avogar
7443dc925c Fix possible Logical error on bad Nullable parsing for text formats 2023-05-17 13:12:00 +00:00
Nikita Taranov
0dd67bacf2
Merge branch 'master' into optimize_reading2 2023-05-17 15:06:41 +02:00
AVMusorin
7df4820af7
Fix metrics WriteBufferFromS3Bytes, WriteBufferFromS3Microseconds and WriteBufferFromS3RequestsErrors
Ref: https://github.com/ClickHouse/ClickHouse/pull/45188
2023-05-17 14:50:38 +02:00
avogar
2ff3c8badd Remove testing code 2023-05-17 11:41:00 +00:00
avogar
846804fed0 Add separate handshake_timeout for receiving Hello packet from replica 2023-05-17 11:39:04 +00:00
Alexander Tokmakov
36c31e1d79
Improve concurrent parts removal with zero copy replication (#49630)
* improve concurrent parts removal

* fix

* fix
2023-05-17 14:07:34 +03:00
Alexander Tokmakov
1e529263d0
Merge branch 'master' into Follow_up_Backup_Restore_concurrency_check_node_2 2023-05-17 13:57:50 +03:00
Vitaly Baranov
6c8a923c9d
Merge branch 'master' into write-encrypted-to-backup 2023-05-17 12:37:05 +02:00
Kseniia Sumarokova
edceda494d
Merge branch 'master' into add-more-logging-for-cache 2023-05-17 12:24:59 +02:00
Kseniia Sumarokova
3787b7f127
Update Metadata.cpp 2023-05-17 12:16:18 +02:00
Azat Khuzhin
fdfb1eda55 Fix {Local,Remote}ReadThrottlerSleepMicroseconds metric values
Also update the test, since now you could have slightly shorter sleep
intervals if the query spends some time in other places.

But what is important is that query_duration_ms does not exceed the
calculated delay.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-17 12:12:39 +02:00
Azat Khuzhin
7383da0c52 Fix per-query remote throttler
For some reason the remote throttler was overwritten by the global one
during reloads. This was likely meant for graceful reload of this option, but
it breaks per-query throttling, so remove this logic.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-17 12:12:39 +02:00
Azat Khuzhin
3c80e30f02 Fix per-query IO/BACKUPs throttling settings (when default profile has them)
When some of these settings were set for the default profile (in
users.xml/users.yml), they were always used regardless of what the
user passed.

Fix this by not inheriting per-query throttlers: they should be
reset before making the query context, and they should not be initialized
in Context::makeQueryContext() as before, since makeQueryContext() is called
too early, when user settings have not been read yet.

But per-server throttling was also initialized there; move this
into ContextSharedPart::configureServerWideThrottling(), and call it
once ServerSettings are set.

Also note that this patch turns the following settings into server
settings:
- max_replicated_fetches_network_bandwidth_for_server
- max_replicated_sends_network_bandwidth_for_server
But this change should not affect anybody, since it is done with
compatibility in mind (i.e. if the setting is set in the user's profile, it will be
read from there as a fallback).

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-17 12:12:39 +02:00
Igor Nikonov
7d647c50c7
Merge branch 'master' into clearable_hash_set_without_zero_storage 2023-05-17 11:29:01 +02:00
FFFFFFFHHHHHHH
fd1e6557e1
Merge branch 'master' into dot_product 2023-05-17 14:40:06 +08:00
fhbai
c104354894 fix 2023-05-17 14:39:30 +08:00
Alexey Gerasimchuck
29b10ae336 reordered options 2023-05-17 04:06:01 +00:00
Alexey Gerasimchuck
4a6c7254e8 --multiquery <sql> -> -n -q <sql> syntax sugar 2023-05-17 03:43:35 +00:00
Vitaly Baranov
f4ac4c3f9d Corrections after review. 2023-05-17 03:23:16 +02:00
Yakov Olkhovskiy
0a44a69dc8 remove unnecessary header 2023-05-17 00:22:13 +00:00
Yakov Olkhovskiy
282297b677 binary encoding of IPv6 in protobuf 2023-05-16 23:46:01 +00:00
Han Fei
3ead9e627e Merge branch 'master' into hanfei/fix-optimize-regexp-prefix 2023-05-16 22:31:01 +02:00
serxa
abacf1f990 add missing quota_key in operator== for connections 2023-05-16 19:14:54 +00:00
serxa
b12eefc694 fix timeout units and log message 2023-05-16 18:57:04 +00:00
Alexander Tokmakov
242a3fc520 Merge branch 'master' into less_zookeeper_requests 2023-05-16 18:24:11 +02:00
Alexander Tokmakov
0da82945ac fix 2023-05-16 18:18:48 +02:00
Alexander Tokmakov
3d26232cc0
Merge pull request #49918 from ClickHouse/remove_unused_code
Remove unused code
2023-05-16 18:53:49 +03:00
kssenii
724949927b Add logging 2023-05-16 17:36:48 +02:00
Antonio Andelic
4bc5a76fa7
Add Compose request for GCS (#49693)
* Add compose request

* Check if outcome is successful

---------

Co-authored-by: Nikita Mikhaylov <mikhaylovnikitka@gmail.com>
2023-05-16 17:20:06 +02:00
Dmitry Novik
2287dd8633
Merge pull request #49800 from ClickHouse/fix-adding-cast
Analyzer: apply _CAST to constants only once
2023-05-16 17:05:02 +02:00
Igor Nikonov
dea5cbcf4e
Slightly update comment 2023-05-16 16:39:00 +02:00
vdimir
1f55c320b4 Fix style 2023-05-16 16:23:53 +02:00
vdimir
ca005ecea1 Update comment about filtering nulls in asof join 2023-05-16 16:23:53 +02:00
vdimir
a7bb8f412f Allow ASOF JOIN over nullable right column 2023-05-16 16:23:53 +02:00
Kruglov Pavel
4530f38fdf
Merge branch 'master' into urlCluster 2023-05-16 16:21:23 +02:00
Kruglov Pavel
d50e6fe868
Fix build after bad conflicts resolution 2023-05-16 15:35:16 +02:00
alesapin
50a536bba8 Remove unused code 2023-05-16 15:26:24 +02:00
Han Fei
ea59761809 fix OptimizeRegularExpression 2023-05-16 15:25:04 +02:00
Azat Khuzhin
68138395eb Fix parameterized views when query parameter used multiple times in the query
Example:

    CREATE VIEW view AS
    SELECT *
    FROM system.one
    WHERE dummy = {k1:Int}+1 OR dummy = {k1:Int}+2
                   ^^                    ^^

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-16 15:13:21 +02:00
Alexander Tokmakov
b6716a8f0f Merge branch 'master' into fix_some_tests4 2023-05-16 14:46:27 +02:00
Vitaly Baranov
b068f0b619 Fix build. 2023-05-16 14:27:27 +02:00
Vitaly Baranov
2ec94a42b7 Remove default parameters from virtual functions. 2023-05-16 14:27:27 +02:00
Vitaly Baranov
943707963f Add backup setting "decrypt_files_from_encrypted_disks" 2023-05-16 14:27:27 +02:00
Vitaly Baranov
019493efa3 Fix throttling in backups. 2023-05-16 14:27:27 +02:00
Vitaly Baranov
5198997fd8 Remove ReadSettings from backup entries. 2023-05-16 14:27:27 +02:00
Vitaly Baranov
7cea264230 Fix whitespaces. 2023-05-16 14:27:27 +02:00
Vitaly Baranov
c48c20fac8 Use combined checksums for encrypted immutable files. 2023-05-16 14:27:27 +02:00
Vitaly Baranov
517e119e03 Move checksum calculation to IBackupEntry. 2023-05-16 14:27:27 +02:00
Vitaly Baranov
002fd19cb7 Move the common part of BackupIO_* to BackupIO_Default. 2023-05-16 14:27:23 +02:00
Vitaly Baranov
c92219f01b BACKUP now writes encrypted data for tables on encrypted disks. 2023-05-16 14:26:33 +02:00
Vitaly Baranov
cc50fcc60a Remove the 'temporary_file_' argument from BackupEntryFromImmutableFile's constructor. 2023-05-16 14:25:37 +02:00
Vitaly Baranov
bc880db5d9 Add functions to read/write encrypted files from IDisk. 2023-05-16 14:25:37 +02:00
Vitaly Baranov
101aa6eff0 Add function copyS3FileFromDisk(). 2023-05-16 14:25:37 +02:00
Vitaly Baranov
69114cb550 Add function getBlobPath() to IDisk interface to allow copying to/from disks which are not built on top of IObjectStorage. 2023-05-16 14:25:36 +02:00
Vitaly Baranov
fd2731845c Simplify interface of IBackupWriter: Remove supportNativeCopy() function. 2023-05-16 14:25:36 +02:00
Smita Kulkarni
9a2645a729 Fixed clang build 2023-05-16 14:09:38 +02:00
kssenii
d4ea3ea045 Fix 2023-05-16 13:54:13 +02:00
Kruglov Pavel
8436a093e7
Fix build 2023-05-16 13:36:12 +02:00
Kruglov Pavel
9dbe9507e7
Fix style 2023-05-16 12:55:20 +02:00
Sergei Trifonov
1f9135e4ab
Merge branch 'master' into async-loader-workloads 2023-05-16 12:50:09 +02:00
alesapin
93bd09ddd6
Merge branch 'master' into fix_another_zero_copy_bug 2023-05-16 12:24:52 +02:00
Kruglov Pavel
5ada385502
Merge branch 'master' into allow_empty 2023-05-16 12:21:31 +02:00
avogar
a3dfa40eab Fix 2023-05-16 10:07:21 +00:00
Kruglov Pavel
b6d2a84e83
Try to fix build 2023-05-16 12:01:55 +02:00
Kruglov Pavel
b414760d43
Merge pull request #49673 from Avogar/fiber-local-var
Fix assert in SpanHolder::finish() with fibers
2023-05-16 11:59:33 +02:00
alesapin
0b4ab70dd9
Merge pull request #49891 from hanfei1991/hanfei/chassert-1
use chassert in MergeTreeDeduplicationLog to have better log info
2023-05-16 11:50:11 +02:00