Commit Graph

496 Commits

Author SHA1 Message Date
Maksim Kita
6fb7d44b62 Analyzer support EXPLAIN ESTIMATE 2023-11-09 19:43:14 +03:00
kssenii
f2c0434c4d Merge remote-tracking branch 'origin/master' into minor-improvements-for-s3-queue 2023-11-06 15:51:11 +01:00
kssenii
da21413354 Better shutdown 2023-11-06 15:47:57 +01:00
Duc Canh Le
4c21ba7b6f tables auto initialize new disks without restart
Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>
2023-11-06 10:26:48 +00:00
Azat Khuzhin
c25d6cd624
Rename directory monitor concept into background INSERT (#55978)
* Limit log frequency for "Skipping send data over distributed table" message

After SYSTEM STOP DISTRIBUTED SENDS it will constantly print this
message.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Rename directory monitor concept into async INSERT

Rename the following query settings (preserving backward
compatibility by keeping the old name as an alias):
- distributed_directory_monitor_sleep_time_ms -> distributed_async_insert_sleep_time_ms
- distributed_directory_monitor_max_sleep_time_ms -> distributed_async_insert_max_sleep_time_ms
- distributed_directory_monitor_batch_inserts -> distributed_async_insert_batch
- distributed_directory_monitor_split_batch_on_failure -> distributed_async_insert_split_batch_on_failure

Rename the following table settings (preserving backward
compatibility by keeping the old name as an alias):
- monitor_batch_inserts -> async_insert_batch
- monitor_split_batch_on_failure -> async_insert_split_batch_on_failure
- directory_monitor_sleep_time_ms -> async_insert_sleep_time_ms
- directory_monitor_max_sleep_time_ms -> async_insert_max_sleep_time_ms

And also update all the references:

    $ gg -e directory_monitor_ -e monitor_ tests docs | cut -d: -f1 | sort -u | xargs sed -e 's/distributed_directory_monitor_sleep_time_ms/distributed_async_insert_sleep_time_ms/g' -e 's/distributed_directory_monitor_max_sleep_time_ms/distributed_async_insert_max_sleep_time_ms/g' -e 's/distributed_directory_monitor_batch_inserts/distributed_async_insert_batch/g' -e 's/distributed_directory_monitor_split_batch_on_failure/distributed_async_insert_split_batch_on_failure/g' -e 's/monitor_batch_inserts/async_insert_batch/g' -e 's/monitor_split_batch_on_failure/async_insert_split_batch_on_failure/g' -e 's/monitor_sleep_time_ms/async_insert_sleep_time_ms/g' -e 's/monitor_max_sleep_time_ms/async_insert_max_sleep_time_ms/g' -i

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Rename async_insert for Distributed into background_insert

This avoids ambiguity between general async INSERTs and INSERTs
into Distributed, which are indeed background, so the new term expresses it
even better.

Mostly done with:

    $ git di HEAD^ --name-only | xargs sed -i -e 's/distributed_async_insert/distributed_background_insert/g' -e 's/async_insert_batch/background_insert_batch/g' -e 's/async_insert_split_batch_on_failure/background_insert_split_batch_on_failure/g' -e 's/async_insert_sleep_time_ms/background_insert_sleep_time_ms/g' -e 's/async_insert_max_sleep_time_ms/background_insert_max_sleep_time_ms/g'

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Mark 02417_opentelemetry_insert_on_distributed_table as long

CI: https://s3.amazonaws.com/clickhouse-test-reports/55978/7a6abb03a0b507e29e999cb7e04f246a119c6f28/stateless_tests_flaky_check__asan_.html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

---------

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-11-01 15:09:39 +01:00
Nikolai Kochetov
cc3c038394 Fixing test. 2023-10-30 16:29:22 +00:00
Nikolai Kochetov
c5100cd9b1 Fixing build 2023-10-04 09:01:29 +00:00
Nikolai Kochetov
d944b59902 Merge branch 'master' into planner-prepare-filters-for-analysis-2 2023-10-03 14:28:16 +00:00
Nikolai Kochetov
3b757d60a2 Remove commented code. 2023-10-03 14:22:20 +00:00
George Gamezardashvili
0ce30ab6d5
SSH keys authentication (#41109)
Added new type of authentication based on SSH keys. It works only for Native TCP protocol.

Co-authored-by: Nikita Mikhaylov <nikitamikhaylov@clickhouse.com>
Co-authored-by: Robert Schulze <robert@clickhouse.com>
2023-09-26 17:50:19 +02:00
SmitaRKulkarni
d8adf05de2
Added a new column _block_number (#47532)
Added a new virtual column _block_number which is persisted on merges when allow_experimental_block_number_column is enabled
2023-09-20 11:31:12 +02:00
vdimir
1aa18e0eb6
Analyzer: Remove constants from header in StorageDistributed 2023-09-14 16:44:18 +00:00
Nikolai Kochetov
903c966cc8
Merge branch 'master' into planner-prepare-filters-for-analysis-2 2023-09-11 16:14:03 +02:00
Igor Nikonov
2fdc700da2 Style 2023-09-09 13:33:18 +00:00
Igor Nikonov
5470a5b60f Fix style 2023-09-09 06:58:18 +00:00
Igor Nikonov
7a396139df Cleanup: unnecessary SelectQueryInfo usage around distributed 2023-09-08 21:53:38 +00:00
Alexander Tokmakov
d41eca1dcc rename new method 2023-08-28 16:01:00 +02:00
Alexander Tokmakov
9ab545e28c do not wait for flush on shutdown 2023-08-25 19:09:10 +02:00
Nikolai Kochetov
4e4620250d Fixing clang tidy. 2023-08-25 09:51:52 +00:00
Nikolai Kochetov
506c5667a7 Remove comment. 2023-08-24 18:21:41 +00:00
Nikolai Kochetov
9737dbca75 Fix window functions. 2023-08-24 16:37:27 +00:00
Nikolai Kochetov
cb851fcee0 Cleanup. 2023-08-24 11:07:17 +00:00
Nikolai Kochetov
33b8b93d1b Re-implement getOptimizedQueryProcessingStage with analyzer. 2023-08-24 11:07:17 +00:00
Nikolai Kochetov
fc90a1a0bd Fix some skip_unused_shards tests. 2023-08-24 11:07:17 +00:00
Nikolai Kochetov
26e0ad8d72 Re-impl evaluateExpressionOverConstantCondition [part 1] 2023-08-24 11:07:17 +00:00
Azat Khuzhin
17ca2661a1 Add ability to turn off flush of Distributed on DETACH/DROP/server shutdown
Sometimes you can have tons of data there, e.g. a few TiBs, and sending
it on server shutdown does not look sane (maybe there is a bug and
you need to update/restart to fix flushing).

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-08-17 08:58:06 +02:00
Dmitry Novik
1244828555 Analyzer: fix virtual columns in StorageDistributed 2023-08-14 19:27:05 +02:00
Alexander Tokmakov
faca49a905 Merge branch 'master' into add_delay_for_replicated 2023-07-24 16:07:38 +02:00
Nikolay Degterinsky
40652cf2de
Merge pull request #51899 from evillique/describe-table-settings
Allow SETTINGS before FORMAT in DESCRIBE TABLE query
2023-07-24 15:24:41 +02:00
alesapin
6416fb6eed
Merge branch 'master' into add_delay_for_replicated 2023-07-22 12:11:39 +02:00
Azat Khuzhin
20625d75ab Fix optimize_skip_unused_shards with JOINs
With a JOIN, the query may contain conditions for other tables, while
optimize_skip_unused_shards was pretty dumb and failed to skip such
columns.

Fix this by removing JOIN before applying this optimization.

v2: restriction for analyzer
v3: ignore 01940_custom_tld_sharding_key under analyzer
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Co-Authored-By: Alexey Milovidov <milovidov@clickhouse.com>
2023-07-22 07:45:33 +02:00
robot-clickhouse-ci-2
0db9c79886
Merge pull request #52332 from rschu1ze/better-formatsettings
Minor: Less awkward IAST::FormatSettings
2023-07-21 14:49:58 +02:00
Robert Schulze
25ddcc256b
Make IAST::FormatSettings more regular, pt. III 2023-07-20 10:34:05 +00:00
Dmitry Novik
e8833cffcc Merge remote-tracking branch 'origin/master' into storage-merge-aliases-analyzer 2023-07-17 16:48:07 +00:00
Vitaly Baranov
815a3857de Remove non-const function Context::getClientInfo(). 2023-07-17 15:02:07 +02:00
Nikolay Degterinsky
67e2dee7e2 Allow SETTINGS before FORMAT in DESCRIBE TABLE query 2023-07-06 14:29:58 +00:00
alesapin
baee73fd96 Make shutdown of replicated tables softer 2023-07-05 18:11:25 +02:00
Nikolai Kochetov
22e49748b5 Cleanup. 2023-06-22 14:23:04 +00:00
Nikolai Kochetov
a940031878 Merge branch 'master' into refactor-subqueries-for-in 2023-06-22 12:18:48 +02:00
Dmitry Novik
47fafdc32c Code cleanup 2023-06-21 18:06:24 +00:00
Nikolai Kochetov
afa74f697c Refactor a bit. 2023-06-16 19:38:50 +00:00
Dmitry Novik
d05f89f8f5 Fix style 2023-06-12 17:33:15 +00:00
Dmitry Novik
26c9042ea0 Analyzer: support aliases in StorageMerge 2023-06-12 17:06:52 +00:00
Dmitry Novik
326a3a3e8d Use query tree to rewrite the query 2023-06-12 16:51:40 +00:00
Nikolai Kochetov
6ce8329bda Merge branch 'master' into refactor-subqueries-for-in 2023-06-09 20:04:27 +02:00
Nikolai Kochetov
9a4043a4b4 Fixing more tests. 2023-06-09 17:51:59 +00:00
Dmitry Novik
fd919a288f
Merge pull request #50097 from ClickHouse/analyzer-distr-query
Analyzer: WIP on distributed queries
2023-06-08 11:35:02 +02:00
Antonio Andelic
b11f744252
Correctly disable async insert with deduplication when it's not needed (#50663)
* Correctly disable async insert when it's not used

* Better

* Add comment

* Better

* Fix tests

---------

Co-authored-by: Nikita Mikhaylov <mikhaylovnikitka@gmail.com>
2023-06-07 20:33:08 +02:00
Dmitry Novik
d9a6e36685 Add comments 2023-06-05 11:02:23 +00:00
Dmitry Novik
530f743ed0 Fix Object data type for StorageDistributed 2023-06-02 23:41:25 +02:00
Dmitry Novik
eb7ae91d01 Do not add alias to a temporary table 2023-06-01 14:34:30 +00:00
Dmitry Novik
c6dcb69b85 Fix GLOBAL IN 2023-06-01 14:34:30 +00:00
Dmitry Novik
85e5ed79e5 Fix distributed JOINs 2023-06-01 14:34:30 +00:00
Dmitry Novik
b86516131b Attempt to fix global JOINs and INs 2023-06-01 14:34:30 +00:00
Dmitry Novik
a4cb82127d Analyzer: WIP on distributed queries 2023-06-01 14:34:29 +00:00
Alexey Milovidov
956c399b2a Remove useless code 2023-06-01 03:04:29 +02:00
Nikolai Kochetov
8cec00dd6e Merge branch 'master' into refactor-subqueries-for-in 2023-05-30 18:08:37 +02:00
Anton Popov
612173e734 refactoring near alter conversions 2023-05-25 22:54:54 +00:00
Nikolai Kochetov
b5b261b22c Merge branch 'master' into refactor-subqueries-for-in 2023-05-25 21:22:06 +02:00
Nikita Mikhaylov
1c3b6738f4
Fixes for parallel replicas (#50195) 2023-05-25 14:41:04 +02:00
Nikolai Kochetov
d8f39b8df1 Fixing more tests. 2023-05-24 17:53:37 +00:00
Alexey Milovidov
5a44dc26e7 Fixes for clang-17 2023-05-13 02:57:31 +02:00
Sema Checherinda
f2ad1122a1 fix convertation 2023-05-10 17:50:42 +00:00
Kruglov Pavel
2ad161d2b7
Merge branch 'master' into non-blocking-connect 2023-04-19 13:39:40 +02:00
Yakov Olkhovskiy
35e9e45249
Merge pull request #48062 from Algunenano/unnecessary_alter_checks
Only check MV on ALTER when necessary
2023-04-03 17:23:11 -04:00
Azat Khuzhin
f38a7aeabe ThreadPool metrics introspection
There are lots of thread pools and simple local-vs-global is not enough
already, it is good to know which one in particular uses threads.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-03-29 10:46:59 +02:00
Raúl Marín
d1a6c1991a Only check MV on ALTER when necessary 2023-03-27 17:45:15 +02:00
avogar
38e44861ae Fix possible race conditions 2023-03-21 16:01:54 +00:00
Dmitry Novik
cced9cf613 Fix build 2023-03-14 12:04:39 +00:00
Dmitry Novik
ae3d30a736 Merge remote-tracking branch 'origin/master' into fix-grouping-for-grouping-sets 2023-03-14 12:01:51 +00:00
Alexander Tokmakov
ed08f8f5c5
Merge branch 'master' into revert_25674 2023-03-12 02:33:25 +03:00
Alexander Tokmakov
7b1b238d0b Revert "Merge pull request #25674 from amosbird/distributedreturnconnection"
This reverts commit 5ffd99dfd4, reversing
changes made to 2796aa333f.
2023-03-11 19:09:47 +01:00
Maksim Kita
0358cb36d8 Fixed tests 2023-03-11 11:51:54 +01:00
Maksim Kita
677408e02e Fixed style check 2023-03-11 11:51:54 +01:00
Maksim Kita
a762112e15 Analyzer support distributed JOINS and subqueries in IN functions 2023-03-11 11:51:54 +01:00
Dmitry Novik
2699ef477f Move visitor 2023-03-10 14:36:56 +00:00
Dmitry Novik
a305c6e7ab Fix distributed GROUPING SETS and GROUPING function 2023-03-09 18:00:23 +00:00
Antonio Andelic
35c15e6ef8 Merge branch 'master' into custom-key-parallel-replicas 2023-03-07 09:37:38 +00:00
Han Fei
b7eef62458
Merge pull request #45491 from azat/dist/async-send-refactoring
[RFC] Rewrite distributed sends to avoid using filesystem as a queue, use in-memory queue instead
2023-03-06 12:32:33 +01:00
Antonio Andelic
737cf8e149 Better 2023-03-03 15:14:49 +00:00
Antonio Andelic
01cf9c94f4 Merge branch 'master' into custom-key-parallel-replicas 2023-03-02 14:28:42 +00:00
Maksim Kita
d39be3ac9c Fixed tests 2023-03-01 18:05:07 +01:00
Maksim Kita
51ee007e01 Fixed tests 2023-03-01 18:05:07 +01:00
Azat Khuzhin
e10fb142fd Fix race for distributed sends from disk
Previously it was initialized from disk only on startup, so if an INSERT
created the object before that, it was never initialized from disk.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-02-28 22:33:36 +01:00
Azat Khuzhin
b5434eac3b Rename StorageDistributedDirectoryMonitor to DistributedAsyncInsertDirectoryQueue
Since #44922 it is not a directory monitor anymore.

v2: Remove unused error codes
v3: Contains some header fixes due to conflicts with master
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-02-28 22:33:36 +01:00
Azat Khuzhin
3f892e52ab Revert "Revert "Merge pull request #44922 from azat/dist/async-INSERT-metrics""
This is the revert of revert since there will be follow up patches to
address the issues.

This reverts commit a55798626a.
2023-02-28 22:33:36 +01:00
Maksim Kita
cbd961de98 Fixed code review issues 2023-02-18 17:06:00 +01:00
Maksim Kita
05baf271f0 Analyzer fix table functions with invalid arguments analysis 2023-02-16 12:17:04 +01:00
Maksim Kita
6b2adc1ec2 Analyzer storage Merge fixes 2023-02-16 12:17:03 +01:00
Maksim Kita
b1ab2af7ad Analyzer support storage Merge 2023-02-16 12:17:03 +01:00
Maksim Kita
84065fb13f Analyzer added distributed table functions support 2023-02-16 12:17:03 +01:00
Maksim Kita
a090a8449d Analyzer distributed read fix 2023-02-16 12:17:03 +01:00
Maksim Kita
f8442b2a8d Analyzer support LiveView 2023-02-16 12:17:03 +01:00
Maksim Kita
25da9dcef7 StorageDistributed Planner initialization fix 2023-02-16 12:17:02 +01:00
Antonio Andelic
adde580756
Merge branch 'master' into custom-key-parallel-replicas 2023-02-14 14:09:12 +01:00
Antonio Andelic
f67e5505ab Merge branch 'master' into custom-key-parallel-replicas 2023-02-06 11:12:39 +00:00
Robert Schulze
84b9ff450f
Fix terribly broken, fragile and potentially cyclic linking
Sorry for the clickbaity title. This is about static method
ConnectionTimeouts::getHTTPTimeouts(). It was declared in header
IO/ConnectionTimeouts.h, and defined in header
IO/ConnectionTimeoutsContext.h (!). This is weird and caused issues with
linking on s390x (#45520). There was an attempt to fix some
inconsistencies (#45848) but neither @Algunenano nor I understood at
first why the definition is in the header.

Turns out that ConnectionTimeoutsContext.h is only #include'd from
source files which are part of the normal server build BUT NOT part of
the keeper standalone build (which must be enabled via CMake
-DBUILD_STANDALONE_KEEPER=1). This dependency was not documented and as
a result, some misguided workarounds were introduced earlier, e.g.
0341c6c54b

The deeper cause was that getHTTPTimeouts() is passed a "Context". This
class is part of the "dbms" library which is deliberately not linked by
the standalone build of clickhouse-keeper. The context is only used to
read the settings and the "Settings" class is part of the
clickhouse_common library which is linked by clickhouse-keeper already.

To resolve this mess, this PR

- creates source file IO/ConnectionTimeouts.cpp and moves all
  ConnectionTimeouts definitions into it, including getHTTPTimeouts().

- breaks the wrong dependency by passing "Settings" instead of "Context"
  into getHTTPTimeouts().

- resolves the previous hacks
2023-02-05 20:49:34 +00:00
Han Fei
532b341de9
Merge pull request #45975 from ucasfl/_part
use LowCardinality for _part and _partition_id virtual column
2023-02-05 18:00:46 +01:00
Nikita Mikhaylov
33877b5e00
Parallel replicas. Part [2] (#43772) 2023-02-03 14:34:18 +01:00
flynn
2d1dd694c6 make _table LowCardinality 2023-02-02 16:33:31 +00:00