Commit Graph

194 Commits

Author SHA1 Message Date
Azat Khuzhin
b725e1d131 DistributedBlockOutputStream: Remove superfluous brackets for string construction 2021-01-17 12:48:51 +03:00
Azat Khuzhin
2955e25e83 Fix inserted blocks accounting for insert_distributed_one_random_shard=1
It is tricky due to block splitting

Refs: https://github.com/ClickHouse/ClickHouse/pull/18294
2021-01-17 12:45:42 +03:00
Azat Khuzhin
858f07c796 Update comment for query AST cloning during insert into multiple local shards
Refs: https://github.com/ClickHouse/ClickHouse/pull/18264#discussion_r558839456
2021-01-17 12:40:41 +03:00
alexey-milovidov
a5a19de878
Update DistributedBlockOutputStream.cpp 2021-01-16 13:22:25 +03:00
feng lv
9829c09720 fix 2021-01-15 15:54:35 +00:00
Azat Khuzhin
ecae6c1c60 Avoid reading the distributed batch just to read the block header
Before this patch, the batched mode of the DirectoryMonitor was 2x slower than
the non-batched mode; after it, the two should be more or less the same.
2021-01-14 22:38:46 +03:00
Azat Khuzhin
56475774d3 Fix readability-static-definition-in-anonymous-namespace in DirectoryMonitor 2021-01-10 23:57:40 +03:00
Azat Khuzhin
2565d2ac44 Verify compressed headers while sending distributed batches
Before this patch the DirectoryMonitor was checking the compressed file
by reading it one more time (since w/o this the receiver may get stuck on a
truncated file), but this is inefficient and we can just check the
checksums before sending.

But note that this may decrease the batch size that is used for sending over
the network.
2021-01-10 21:23:42 +03:00
Azat Khuzhin
819b9d7d56 Add more metadata into distributed .bin files to avoid doing the same on sending
Before this patch StorageDistributedDirectoryMonitor was reading .bin files
in batch mode just to calculate the number of bytes/rows, which is very
inefficient; let's just store them in the header (rows/bytes).
2021-01-10 18:17:15 +03:00
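
A minimal sketch of the idea behind this commit, written in Python rather than the C++ of the actual patch: if the row and byte counts are stored in a small header, a batching sender can total them without decoding the blocks. The header layout (two little-endian unsigned 64-bit integers) and the names `write_bin` and `read_counts` are hypothetical and do not describe the real .bin format.

```python
import io
import struct

# Hypothetical header layout: rows and bytes as two little-endian uint64 values.
HEADER = struct.Struct("<QQ")

def write_bin(stream, rows, payload):
    """Write the header (row count, payload size) followed by the payload."""
    stream.write(HEADER.pack(rows, len(payload)))
    stream.write(payload)

def read_counts(stream):
    """Read only the header; the payload itself is never touched."""
    rows, size = HEADER.unpack(stream.read(HEADER.size))
    return rows, size

buf = io.BytesIO()
write_bin(buf, rows=3, payload=b"abcdefgh")
buf.seek(0)
print(read_counts(buf))  # (3, 8)
```
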
Azat Khuzhin
fce8b6b5ef Refactoring distributed header parsing 2021-01-10 18:17:15 +03:00
Azat Khuzhin
676bc83c6d Check per-block checksum of the distributed batch on the sender before sending
This is already done for distributed_directory_monitor_batch_inserts=1,
so let's do the same for the non-batched mode, since otherwise, if
the file is truncated, the receiver will just get stuck (since it will
wait for the block, but the sender will not send it).
2021-01-10 18:17:14 +03:00
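
The sender-side check described above can be sketched roughly as follows. This is an illustration only: the framing (a checksum and a length in front of each block) and the use of `zlib.crc32` are assumptions made for the example, not the checksum or on-disk format used by the actual code.

```python
import struct
import zlib

FRAME = struct.Struct("<II")  # hypothetical framing: checksum, payload length

def frame(payload):
    """Prefix a block with its checksum and length."""
    return FRAME.pack(zlib.crc32(payload), len(payload)) + payload

def verify(data):
    """Return True only if every block is complete and its checksum matches."""
    pos = 0
    while pos < len(data):
        if pos + FRAME.size > len(data):
            return False  # truncated inside a frame header
        checksum, size = FRAME.unpack_from(data, pos)
        pos += FRAME.size
        payload = data[pos:pos + size]
        if len(payload) != size or zlib.crc32(payload) != checksum:
            return False  # truncated or corrupted block
        pos += size
    return True

good = frame(b"block one") + frame(b"block two")
print(verify(good))       # True
print(verify(good[:-3]))  # False: the last block is truncated
```

Verifying before sending means a truncated file is rejected locally, so the receiver is never left waiting for a block that will not arrive.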
Azat Khuzhin
471deab63a Rename fsync_tmp_directory to fsync_directories for Distributed engine 2021-01-09 17:51:30 +03:00
Azat Khuzhin
ae0b15455f Add fsync_tmp_directory support into DirectoryMonitor 2021-01-09 16:31:52 +03:00
Azat Khuzhin
2e55bd2285 Accept IDisk in DirectoryMonitor (for further fsync) 2021-01-09 16:31:42 +03:00
Azat Khuzhin
dd669cb2b6 Add fsync support for Distributed/DirectoryMonitor
Note that there is no fsync_tmp_directory support in DirectoryMonitor
since you cannot propagate the error to user anyway.
2021-01-09 15:26:25 +03:00
Azat Khuzhin
fbe5df809b Sync other temporary directories for Distributed fsync_tmp_directories 2021-01-09 11:36:04 +03:00
Azat Khuzhin
b5ace27014 Add fsync support for Distributed engine.
Two new settings (by analogy with MergeTree family) have been added:

- `fsync_after_insert` - Do fsync for every insert. Will decrease
  performance of inserts.

- `fsync_tmp_directory` - Do fsync for temporary directory (that is used
  for async INSERT only) after all part operations (writes, renames,
  etc.).

Refs: #17380 (p1)
2021-01-09 11:31:32 +03:00
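
What the two settings control can be illustrated with a small Python sketch (the actual change is in C++). The function `durable_write` and its parameters are hypothetical; the sketch assumes a POSIX system, where a directory can be opened and passed to `os.fsync`.

```python
import os
import tempfile

def durable_write(directory, name, data, fsync_after_insert=True, fsync_directory=True):
    """Write data to a temporary file, rename it into place, optionally fsync."""
    tmp_path = os.path.join(directory, name + ".tmp")
    final_path = os.path.join(directory, name)
    with open(tmp_path, "wb") as f:
        f.write(data)
        if fsync_after_insert:
            f.flush()
            os.fsync(f.fileno())  # flush the file contents to disk
    os.rename(tmp_path, final_path)
    if fsync_directory:
        # Sync the directory so the rename itself is durable (POSIX only).
        fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(fd)
        finally:
            os.close(fd)
    return final_path

workdir = tempfile.mkdtemp()
print(os.path.basename(durable_write(workdir, "1.bin", b"data")))  # 1.bin
```

Each fsync adds a round trip to the disk, which is why the commit message warns about insert performance.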
alexey-milovidov
417e685830
Merge pull request #18775 from ClickHouse/dont-insert-empty-blocks-distributed-sync
Do not insert empty blocks on sync Distributed INSERT
2021-01-06 20:06:35 +03:00
Alexey Milovidov
1572d2122b Respect network_compression_method in async INSERT into Distributed table 2021-01-06 03:41:34 +03:00
Alexey Milovidov
190402b7d5 Do not insert empty blocks on sync Distributed INSERT 2021-01-06 02:54:22 +03:00
Amos Bird
6fc225e676
Distributed insertion to one random shard (#18294)
* Distributed insertion to one random shard

* add some tests

* add some documentation

* Respect shards' weights

* fine locking

Co-authored-by: Ivan Lezhankin <ilezhankin@yandex-team.ru>
2020-12-23 19:04:05 +03:00
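
Picking one random shard while respecting shard weights can be sketched in Python as below. The function `pick_shard` is hypothetical, and the sketch only shows weighted random selection in general, not how the merged patch implements it.

```python
import random

def pick_shard(weights, rng=random):
    """Pick one shard index with probability proportional to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

rng = random.Random(0)  # seeded so the run is repeatable
counts = [0, 0, 0]
for _ in range(10_000):
    counts[pick_shard([1, 0, 3], rng)] += 1
print(counts[1])  # 0: a shard with zero weight is never chosen
```
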
Azat Khuzhin
5b3ab48861 More forward declaration for generic headers
The following headers are pretty generic, so use forward declaration as
much as possible:
- Context.h
- Settings.h
- ConnectionTimeouts.h
(Also this shows that some includes were missing -- this has been fixed)

And split ConnectionTimeouts.h into ConnectionTimeoutsContext.h (since
module part cannot be added for it, due to recursive build dependencies
that will be introduced)

Also remove Settings from the RemoteBlockInputStream/RemoteQueryExecutor
and just pass the context, since settings were passed only in specific
places, which can allow making a copy of Context (i.e. Copier).

Approx results (How many units will be recompiled after changing file X?):

- ConnectionTimeouts.h
  - mainline: 100

- Context.h:
  - mainline: ~800
  - patched:  415

- Settings.h:
  - mainline: 900-1K
  - patched:  440 (most of them because of the Context.h)
2020-12-12 17:43:10 +03:00
Alexander Tokmakov
bfbf150c67 fix segfault when 'not enough space' 2020-12-02 17:49:43 +03:00
Amos Bird
1d9d586e20
Make global_context consistent. 2020-11-20 18:23:14 +08:00
Alexander Tokmakov
b94cc5c4e5 remove more stringstreams 2020-11-10 21:22:26 +03:00
alexey-milovidov
f39457bc77
Merge pull request #16788 from azat/fix-use_compact_format_in_distributed_parts_names
Apply use_compact_format_in_distributed_parts_names for each INSERT (with internal_replication)
2020-11-08 23:23:10 +03:00
Azat Khuzhin
04db0834bf Apply use_compact_format_in_distributed_parts_names for each INSERT (with internal_replication)
Before this patch use_compact_format_in_distributed_parts_names was
applied only from default profile (at server start) for
internal_replication=1, and was ignored on INSERT.
2020-11-08 03:05:52 +03:00
Alexey Milovidov
fd84d16387 Fix "server failed to start" error 2020-11-07 03:14:53 +03:00
Azat Khuzhin
59cdc964a1 Do not store reference to BackgroundSchedulePool in DirectoryMonitor (useless) 2020-11-05 23:43:34 +03:00
Alexander Kuzmenkov
fb64cf210a straighten the protocol version 2020-09-17 17:37:29 +03:00
Azat Khuzhin
0159c74f21 Secure inter-cluster query execution (with initial_user as current query user) [v3]
Add inter-server cluster secret, it is used for Distributed queries
inside cluster, you can configure in the configuration file:

  <remote_servers>
      <logs>
          <shard>
              <secret>foobar</secret> <!-- empty -- works as before -->
              ...
          </shard>
      </logs>
  </remote_servers>

And this will allow clickhouse to make sure that the query was not
faked, and was issued from a node that knows the secret. And since
trust is established, it can use initial_user for query execution; this will
apply the correct *_for_user (since with inter-server secret enabled, the
query will be executed from the same user on the shards as on the initiator,
unlike the "default" user w/o it).

v2: Change user to the initial_user for Distributed queries if secret match
v3: Add Protocol::Cluster package
v4: Drop Protocol::Cluster and use plain Protocol::Hello + user marker
v5: Do not use user from Hello for cluster-secure (superfluous)
2020-09-15 01:36:28 +03:00
Azat Khuzhin
a588947fe2 Fix DistributedFilesToInsert metric (zeroed when it should not)
CurrentMetrics::Increment adds the amount for the specified metric only for the
lifetime of the object, but this is not the intention, since
DistributedFilesToInsert is a gauge and after #10263 it can exit from
the callback (and enter again later, for example after SYSTEM STOP
DISTRIBUTED SEND it will always exit from it, until SYSTEM START
DISTRIBUTED SEND).

So make Increment a member of the class (this will also fix possible issues
with subtracting the value on DROP TABLE).
2020-08-27 00:43:00 +03:00
Alexander Tokmakov
dd4b8b9663 fix lock order inversion when renaming distributed table 2020-08-20 16:36:22 +03:00
Alexey Milovidov
12f66fa82c Fix 99% of typos 2020-08-08 04:01:47 +03:00
Vitaly Baranov
56665a15f7 Rework and rename the template class SettingsCollection => BaseSettings. 2020-07-31 20:54:18 +03:00
Vladimir Chebotarev
faedb04722 Minor fixes. 2020-07-28 19:45:46 +03:00
Vladimir Chebotarev
1b3f5c99f5 Real fix of test. 2020-07-26 21:27:36 +03:00
Alexey Milovidov
73a5c38398 Fix potential overflow in integer division #12119 2020-07-05 03:29:03 +03:00
alesapin
ebb36bec8a Merge branch 'master' into atomic_metadata5 2020-06-18 11:57:16 +03:00
alesapin
1ddeb3d149 Buildable getSampleBlock in StorageInMemoryMetadata 2020-06-16 18:51:29 +03:00
Azat Khuzhin
c139a05370 Forward declaration in StorageDistributed 2020-06-14 01:09:21 +03:00
Azat Khuzhin
d2383f0f5d Fix async INSERT into Distributed for prefer_localhost_replica=0 and w/o internal_replication 2020-06-08 21:58:56 +03:00
Azat Khuzhin
86c5465bf8 Rewrite StorageSystemDistributionQueue interfaces 2020-06-04 03:04:32 +03:00
Azat Khuzhin
f0050adc51 Make system.distribution_queue metrics non racy 2020-06-04 02:36:16 +03:00
Azat Khuzhin
09c3ca9c6c Add last_exception into system.distribution_queue 2020-06-04 02:36:16 +03:00
Azat Khuzhin
389f78ceee Add system.distribution_queue
system.distribution_queue contains the following columns:
- database
- table
- data_path
- is_blocked
- error_count
- data_files
- data_compressed_bytes
2020-06-04 02:36:16 +03:00
alesapin
6253e9b97b Revert disabled tests 2020-06-02 21:41:29 +03:00
Alexey Milovidov
25f941020b Remove namespace pollution 2020-05-31 00:57:37 +03:00
Alexey Milovidov
146370934a Keep the value of DistributedFilesToInsert metric on exceptions 2020-05-27 13:07:38 +03:00
Alexey Milovidov
7e1813825b Return old names of macros 2020-05-24 01:24:01 +03:00
Alexey Milovidov
ce0619dabf Progress on task 2020-05-24 00:26:45 +03:00
Alexey Milovidov
2d7d5a1547 Apply all transformations again 2020-05-24 00:16:27 +03:00
Alexey Milovidov
bab24879e9 Progress on task 2020-05-24 00:16:05 +03:00
Alexey Milovidov
e1695feb7f Apply all transformations again 2020-05-23 23:40:32 +03:00
Alexey Milovidov
85f84550ba Progress on task 2020-05-23 23:37:37 +03:00
Alexey Milovidov
7e2fb9ad65 Apply all transformations again 2020-05-23 22:38:30 +03:00
Alexey Milovidov
eacff92d0e Progress on task 2020-05-23 22:35:08 +03:00
Alexey Milovidov
29762240de Remove duplicate whitespaces (preparation) 2020-05-23 22:31:54 +03:00
Alexey Milovidov
7fed65cbe2 Remove duplicate whitespaces (preparation) 2020-05-23 22:14:58 +03:00
Alexey Milovidov
ab0562a574 Make all LOG in single line (preparation) 2020-05-23 22:05:41 +03:00
Alexey Milovidov
a2ad11897f Remove duplicate whitespaces (preparation) 2020-05-23 21:53:58 +03:00
Alexey Milovidov
1f13515a65 Make all LOG in single line (preparation) 2020-05-23 21:31:37 +03:00
Alexey Milovidov
8042e5febe find {base,src,programs} -name '*.h' -or -name '*.cpp' | xargs grep -l -P 'LOG_\w+\([^,]+, "[^"]+" << [^<]+ << "[^"]+" << [^<]+\);' | xargs sed -i -r -e 's/(LOG_\w+)\(([^,]+), "([^"]+)" << ([^<]+) << "([^"]+)" << ([^<]+)\);/\1_FORMATTED(\2, "\3{}\5{}", \4, \6);/' 2020-05-23 19:58:15 +03:00
Alexey Milovidov
e391b77d81 find {base,src,programs} -name '*.h' -or -name '*.cpp' | xargs grep -l -P 'LOG_\w+\([^,]+, "[^"]+" << [^<]+ << "[^"]+"\);' | xargs sed -i -r -e 's/(LOG_\w+)\(([^,]+), "([^"]+)" << ([^<]+) << "([^"]+)"\);/\1_FORMATTED(\2, "\3{}\5", \4);/' 2020-05-23 19:56:05 +03:00
Alexey Milovidov
8d2e80a5e2 find {base,src,programs} -name '*.h' -or -name '*.cpp' | xargs grep -l -P 'LOG_\w+\([^,]+, "[^"]+"\)' | xargs sed -i -r -e 's/(LOG_\w+)\(([^,]+, "[^"]+")\)/\1_FORMATTED(\2)/' 2020-05-23 19:42:39 +03:00
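
The rewrite performed by the second of these three `sed` commands (the two-string, one-argument case) can be tried on a single line with Python's `re` module. The pattern is copied from the commit; the sample log line and the helper name `rewrite` are made up for illustration, and GNU `sed -r` and Python regular expressions are assumed to agree on this pattern.

```python
import re

# Same pattern as the sed expression: "text" << arg << "text"
PATTERN = re.compile(r'(LOG_\w+)\(([^,]+), "([^"]+)" << ([^<]+) << "([^"]+)"\);')
# \g<N> is Python's unambiguous spelling of sed's \N back-references.
REPLACEMENT = r'\g<1>_FORMATTED(\g<2>, "\g<3>{}\g<5>", \g<4>);'

def rewrite(line):
    """Turn a stream-style LOG call into a format-string call."""
    return PATTERN.sub(REPLACEMENT, line)

print(rewrite('LOG_DEBUG(log, "Sending " << count << " files");'))
# LOG_DEBUG_FORMATTED(log, "Sending {} files", count);
```
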
Azat Khuzhin
d93b9a57f6 Forward declaration for Context as much as possible.
Now after changing Context.h 488 modules will be recompiled instead of 582.
2020-05-21 01:53:18 +03:00
alexey-milovidov
7cf3538840
Merge pull request #10270 from ClickHouse/quota-key-in-client
Support quota_key for Native client
2020-05-17 14:09:40 +03:00
alexey-milovidov
7ee35f102d
Merge pull request #10867 from azat/dist-INSERT-load-balancing
Respect prefer_localhost_replica/load_balancing on INSERT into Distributed
2020-05-17 11:11:35 +03:00
Alexey Milovidov
397859ccb8 Fix error 2020-05-17 08:45:20 +03:00
Azat Khuzhin
2498041cc1 Avoid sending partially written files by the DistributedBlockOutputStream 2020-05-16 01:00:42 +03:00
Azat Khuzhin
52d73c7f45 Fix prefer_localhost_replica=0 and load_balancing for Distributed INSERT 2020-05-14 03:29:03 +03:00
Azat Khuzhin
e1d4837753 Fix list of possible nodes for Distributed INSERT for internal_replication=0 2020-05-14 03:28:08 +03:00
Azat Khuzhin
cdf3845e43 Respect load_balancing in DirectoryMonitor, to fix w/o internal_replication 2020-05-14 01:33:25 +03:00
Azat Khuzhin
085bafad05 Handle prefer_localhost_replica on INSERT into Distributed
Right now it will issue remote send even if finally the local replica
will be selected - not good I guess.

This should also fix load_balancing.
2020-05-13 01:38:03 +03:00
Azat Khuzhin
889f54b549 Fix ENOENT exception on current_batch.txt in DirectoryMonitor
current_batch.txt will not exist if there was no send; this is the case
when all batches that were pending have been marked as pending.
2020-05-13 01:23:18 +03:00
alexey-milovidov
ddc84163a7
Merge pull request #10486 from azat/dist-send-on-INSERT
Fix distributed send that are scheduled by INSERT query
2020-05-11 06:28:35 +03:00
Azat Khuzhin
5c89cdbe61 Fix distributed send retries on distributed_directory_monitor_{max_,}sleep_time_ms > 5min
In this case error_count can be decreased before checking it for
rescheduling send.

And actually this can be a problem not only when
distributed_directory_monitor_{max_,}sleep_time_ms > 5min, because all
threads can be occupied and the real timeout between sends will be > 5min.
2020-05-10 12:37:38 +03:00
Gleb Novikov
c637d99e07 Volumes and storages refactoring:
1. Moved Volume to separate file
  2. Created IVolume interface and implemented current behaviour in implementation of new interface — VolumeJBOD
  3. Replaced all old volume usages with new VolumeJBOD. Where it is unnecessary to have JBOD — left just IVolume.
  4. Removed old Volume completely
  5. Moved StoragePolicy to separated files
  6. Moved DiskSelector to separated files
  7. Removed DiskSpaceMonitor file
2020-05-04 23:15:38 +03:00
Azat Khuzhin
6ffdd53b6a Share auto-increment for first batch and tmp file in DistributedBlockOutputStream 2020-05-03 14:47:59 +03:00
Azat Khuzhin
53c470cab4 Fix directory monitor initialization from INSERT into Distributed
This also fixes hardlink code (when one file should be sent to multiple
servers, i.e. internal_replication == false) of writeToShard() with
distributed_storage_policy (i.e. when StorageDistributed::getPath() will
point to different filesystems).

Plus also cleanup DistributedBlockOutputStream::writeToShard() a little.
2020-05-03 14:47:51 +03:00
Azat Khuzhin
e97e1f06db Do not schedule distributed send if there was no error
Since in this case it will be scheduled from the
DistributedBlockOutputStream with the
distributed_directory_monitor_max_sleep_time_ms, and this will overwrite
timer that was set by the DistributedBlockOutputStream, not good.
2020-05-03 14:46:44 +03:00
Azat Khuzhin
947b3942dd Schedule distributed sends after the file has been written 2020-05-03 14:46:43 +03:00
Azat Khuzhin
0157fd5d93 Fix distributed send that are scheduled by INSERT query
Before this patch each INSERT query re-scheduled the distributed send, thus
resetting the timer each time, which is not the expected behaviour,
since with frequent INSERTs distributed sends will not be triggered at
all.

Fix this by not resetting the timer.
2020-05-03 14:46:42 +03:00
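
The effect of resetting the timer on every INSERT can be shown with a toy model. This is a sketch under simplifying assumptions (a single timer, integer timestamps, a fixed delay); `first_send_time` and its parameters are hypothetical and do not mirror the DirectoryMonitor code.

```python
def first_send_time(insert_times, delay, reset_on_insert):
    """Return when the first background send fires for a stream of INSERTs."""
    deadline = None
    for t in insert_times:
        if deadline is not None and t >= deadline:
            return deadline  # the send fires before this INSERT arrives
        if deadline is None or reset_on_insert:
            deadline = t + delay  # (re)arm the timer
    return deadline

inserts = range(10)  # one INSERT per second, for ten seconds
print(first_send_time(inserts, delay=5, reset_on_insert=True))   # 14
print(first_send_time(inserts, delay=5, reset_on_insert=False))  # 5
```

With resetting, the send is pushed out until the INSERTs stop; without it, the send fires one delay after the first INSERT.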
Azat Khuzhin
6bb39dafc3 Drop deprecated code (cond var and note for thread) in DirectoryMonitor 2020-05-03 14:46:41 +03:00
Azat Khuzhin
63d8ab8f03 Make createSelector() static (in storage) and const (in stream) 2020-05-01 11:31:05 +03:00
Azat Khuzhin
f22ba15b4a Reduce copy-paste of DistributedBlockOutputStream::createSelector
This will make it less error prone.
2020-05-01 02:59:40 +03:00
Alexey Milovidov
1e325a9fd9 Checkpoint 2020-04-22 09:22:14 +03:00
Azat Khuzhin
5d11118cc9 Use thread pool (background_distributed_schedule_pool_size) for distributed sends
After #8756 the problem with 1 thread for each (distributed table, disk)
for distributed sends became even worse (since there can be multiple
disks), so use a predefined thread pool for these tasks, which can be
controlled with the background_distributed_schedule_pool_size knob.
2020-04-19 12:01:56 +03:00
Azat Khuzhin
673ddc9d77 Drop superfluous locking for atomic in DirectoryMonitor 2020-04-19 00:22:48 +03:00
Alexey Milovidov
8ad04d4fec Remove useless code 2020-04-15 00:05:45 +03:00
Azat Khuzhin
6d85207bfb Convert blocks if structure does not match on INSERT into Distributed()
Follow-up for: #10105
2020-04-08 23:46:01 +03:00
Azat Khuzhin
b2fa9d8750 Fix SIGSEGV on INSERT into Distributed on different struct with underlying 2020-04-08 02:35:31 +03:00
Ivan Lezhankin
06446b4f08 dbms/ → src/ 2020-04-03 18:14:31 +03:00
Azat Khuzhin
f53c9a6b25 Fix "Block structure mismatch" for INSERT into Distributed
Add missing conversion (via ConvertingBlockInputStream) for INSERT into
remote nodes (for sync insert, async insert and async batch insert),
like for local nodes (in DistributedBlockOutputStream::writeBlockConverted).

This is required when the structure of the Distributed table differs
from the structure of the local table.

And also add a warning message, to highlight this in logs (since this
works slower).

Fixes: #19888
2021-02-02 10:16:41 +03:00