Before this patch StorageDistributedDirectoryMonitor reading .bin files
in batch mode, just to calculate number of bytes/rows, this is very
ineffective, let's just store them in the header (rows/bytes).
Two new settings (by analogy with MergeTree family) has been added:
- `fsync_after_insert` - Do fsync for every inserted. Will decreases
performance of inserts.
- `fsync_tmp_directory` - Do fsync for temporary directory (that is used
for async INSERT only) after all part operations (writes, renames,
etc.).
Refs: #17380 (p1)
* Distributed insertion to one random shard
* add some tests
* add some documentation
* Respect shards' weights
* fine locking
Co-authored-by: Ivan Lezhankin <ilezhankin@yandex-team.ru>
The following headers are pretty generic, so use forward declaration as
much as possible:
- Context.h
- Settings.h
- ConnectionTimeouts.h
(Also this shows that some missing some includes -- this has been fixed)
And split ConnectionTimeouts.h into ConnectionTimeoutsContext.h (since
module part cannot be added for it, due to recursive build dependencies
that will be introduced)
Also remove Settings from the RemoteBlockInputStream/RemoteQueryExecutor
and just pass the context, since settings was passed only in speicifc
places, that can allow making a copy of Context (i.e. Copier).
Approx results (How much units will be recompiled after changing file X?):
- ConnectionTimeouts.h
- mainline: 100
- Context.h:
- mainline: ~800
- patched: 415
- Settings.h:
- mainline: 900-1K
- patched: 440 (most of them because of the Context.h)
Before this patch use_compact_format_in_distributed_parts_names was
applied only from default profile (at server start) for
internal_replication=1, and was ignored on INSERT.
1. Moved Volume to separate file
2. Created IVolume interface and implemented current behaviour in implementation of new interface — VolumeJBOD
3. Replaced all old volume usages with new VolumeJBOD. Where it is unnecessary to have JBOD — left just IVolume.
4. Removed old Volume completely
5. Moved StoragePolicy to separated files
6. Moved DiskSelector to separated files
7. Removed DiskSpaceMonitor file
This also fixes hardlink code (when one file should be sent to multiple
servers, i.e. internal_replication == false) of writeToShard() with
distributed_storage_policy (i.e. when StorageDistributed::getPath() will
path to different filesystems).
Plus also cleanup DistributedBlockOutputStream::writeToShard() a little.
After #8756 the problem with 1 thread for each (distributed table, disk)
for distributed sends became even worse (since there can be multiple
disks), so use predefined thread pool for this tasks, that can be
controlled with background_distributed_schedule_pool_size knob.