176 Commits

Author SHA1 Message Date
Alexey Milovidov
397859ccb8 Fix error 2020-05-17 08:45:20 +03:00
Azat Khuzhin
2498041cc1 Avoid sending partially written files by the DistributedBlockOutputStream 2020-05-16 01:00:42 +03:00
Azat Khuzhin
52d73c7f45 Fix prefer_localhost_replica=0 and load_balancing for Distributed INSERT 2020-05-14 03:29:03 +03:00
Azat Khuzhin
e1d4837753 Fix list of possible nodes for Distributed INSERT for internal_replication=0 2020-05-14 03:28:08 +03:00
Azat Khuzhin
cdf3845e43 Respect load_balancing in DirectoryMonitor, to fix w/o internal_replication 2020-05-14 01:33:25 +03:00
Azat Khuzhin
085bafad05 Handle prefer_localhost_replica on INSERT into Distributed
Right now it issues a remote send even if the local replica is
eventually selected, which is undesirable.

This should also fix load_balancing.
2020-05-13 01:38:03 +03:00
Azat Khuzhin
889f54b549 Fix ENOENT exception on current_batch.txt in DirectoryMonitor
current_batch.txt will not exist if there was no send. This is the case
when all batches that were pending have been marked as pending.
2020-05-13 01:23:18 +03:00
alexey-milovidov
ddc84163a7
Merge pull request #10486 from azat/dist-send-on-INSERT
Fix distributed sends that are scheduled by an INSERT query
2020-05-11 06:28:35 +03:00
Azat Khuzhin
5c89cdbe61 Fix distributed send retries on distributed_directory_monitor_{max_,}sleep_time_ms > 5min
In this case error_count can be decreased before checking it for
rescheduling send.

And actually this can be a problem not only when
distributed_directory_monitor_{max_,}sleep_time_ms > 5min, because all
threads can be occupied and the real timeout between sends will be > 5min.
2020-05-10 12:37:38 +03:00
Gleb Novikov
c637d99e07 Volumes and storages refactoring:
  1. Moved Volume to a separate file
  2. Created the IVolume interface and moved the current behaviour into a new implementation of it, VolumeJBOD
  3. Replaced all old Volume usages with the new VolumeJBOD; where JBOD is unnecessary, left just IVolume
  4. Removed the old Volume completely
  5. Moved StoragePolicy to separate files
  6. Moved DiskSelector to separate files
  7. Removed the DiskSpaceMonitor file
2020-05-04 23:15:38 +03:00
Azat Khuzhin
6ffdd53b6a Share auto-increment for first batch and tmp file in DistributedBlockOutputStream 2020-05-03 14:47:59 +03:00
Azat Khuzhin
53c470cab4 Fix directory monitor initialization from INSERT into Distributed
This also fixes the hardlink code of writeToShard() (used when one file
should be sent to multiple servers, i.e. internal_replication == false)
with distributed_storage_policy (i.e. when StorageDistributed::getPath()
will point to different filesystems).

Also clean up DistributedBlockOutputStream::writeToShard() a little.
2020-05-03 14:47:51 +03:00
Azat Khuzhin
e97e1f06db Do not schedule distributed send if there was no error
In this case the send will be scheduled from the
DistributedBlockOutputStream, and rescheduling it here with
distributed_directory_monitor_max_sleep_time_ms would overwrite the
timer that was set by the DistributedBlockOutputStream.
2020-05-03 14:46:44 +03:00
Azat Khuzhin
947b3942dd Schedule distributed sends after the file has been written 2020-05-03 14:46:43 +03:00
Azat Khuzhin
0157fd5d93 Fix distributed sends that are scheduled by an INSERT query
Before this patch each INSERT query rescheduled the distributed send and
thereby reset the timer each time. This is not the expected behaviour,
since with frequent INSERTs distributed sends would not be triggered at
all.

Fix this by not resetting the timer.
2020-05-03 14:46:42 +03:00
Azat Khuzhin
6bb39dafc3 Drop deprecated code (cond var and note for thread) in DirectoryMonitor 2020-05-03 14:46:41 +03:00
Azat Khuzhin
63d8ab8f03 Make createSelector() static (in storage) and const (in stream) 2020-05-01 11:31:05 +03:00
Azat Khuzhin
f22ba15b4a Reduce copy-paste of DistributedBlockOutputStream::createSelector
This will make it less error prone.
2020-05-01 02:59:40 +03:00
Alexey Milovidov
1e325a9fd9 Checkpoint 2020-04-22 09:22:14 +03:00
Azat Khuzhin
5d11118cc9 Use thread pool (background_distributed_schedule_pool_size) for distributed sends
After #8756 the problem with 1 thread for each (distributed table, disk)
for distributed sends became even worse (since there can be multiple
disks), so use a predefined thread pool for these tasks, which can be
controlled with the background_distributed_schedule_pool_size knob.
2020-04-19 12:01:56 +03:00
Azat Khuzhin
673ddc9d77 Drop superfluous locking for atomic in DirectoryMonitor 2020-04-19 00:22:48 +03:00
Alexey Milovidov
8ad04d4fec Remove useless code 2020-04-15 00:05:45 +03:00
Azat Khuzhin
6d85207bfb Convert blocks if structure does not match on INSERT into Distributed()
Follow-up for: #10105
2020-04-08 23:46:01 +03:00
Azat Khuzhin
b2fa9d8750 Fix SIGSEGV on INSERT into Distributed on different struct with underlying 2020-04-08 02:35:31 +03:00
Ivan Lezhankin
06446b4f08 dbms/ → src/ 2020-04-03 18:14:31 +03:00
Azat Khuzhin
f53c9a6b25 Fix "Block structure mismatch" for INSERT into Distributed
Add missing conversion (via ConvertingBlockInputStream) for INSERT into
remote nodes (for sync insert, async insert and async batch insert),
like for local nodes (in DistributedBlockOutputStream::writeBlockConverted).

This is required when the structure of the Distributed table differs
from the structure of the local table.

And also add a warning message, to highlight this in logs (since this
works slower).

Fixes: #19888
2021-02-02 10:16:41 +03:00
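
Note on 0157fd5d93 (timer reset on INSERT). The starvation described in that commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions, not ClickHouse code: the function name simulate, its parameters, and the single-timer model are hypothetical and exist purely to show why re-arming a timer on every INSERT can postpone the send indefinitely.

```python
def simulate(insert_times, delay, reset_on_insert):
    """Return the times (ms) at which a background send would fire.

    Toy model (hypothetical, not the ClickHouse implementation):
      insert_times    -- sorted timestamps (ms) of INSERT queries
      delay           -- time (ms) between arming the timer and the send
      reset_on_insert -- True:  every INSERT re-arms the timer
                                (the behaviour the commit message describes
                                as existing before the patch)
                         False: an INSERT arms the timer only if no send
                                is already pending
    """
    fired = []
    deadline = None  # when the pending send will fire, if one is pending

    for t in insert_times:
        # A pending send fires if its deadline was reached before this INSERT.
        if deadline is not None and deadline <= t:
            fired.append(deadline)
            deadline = None

        # Arm (or re-arm) the timer according to the chosen policy.
        if reset_on_insert or deadline is None:
            deadline = t + delay

    # After the last INSERT, whatever is still pending eventually fires.
    if deadline is not None:
        fired.append(deadline)
    return fired


if __name__ == "__main__":
    inserts = list(range(0, 500, 50))  # one INSERT every 50 ms
    print("reset on every INSERT:", simulate(inserts, 100, True))
    print("no reset while pending:", simulate(inserts, 100, False))
```

With one INSERT every 50 ms and a 100 ms delay, the resetting policy fires only once, after the INSERTs stop (at 550 ms), while the non-resetting policy fires every 100 ms (at 100, 200, 300, 400 and 500 ms).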