Commit Graph

22 Commits

Author SHA1 Message Date
Nikolai Kochetov
179ec05a72 Remove some streams. 2021-07-20 21:18:43 +03:00
Azat Khuzhin
a0209178cc Add ability to split distributed batch on failures
Add distributed_directory_monitor_split_batch_on_failure setting (OFF by
default), that will split the batch and send files one by one in case of
retriable errors.

v2: more error codes
2021-06-23 02:48:47 +03:00
Azat Khuzhin
9c6e8e1462 Add BrokenDistributedFilesToInsert new metric
Number of files for asynchronous insertion into Distributed tables that
has been marked as broken. This metric will starts from 0 on start.
Number of files for every shard is summed.
2021-05-04 22:48:07 +03:00
Azat Khuzhin
74269882f7 Add broken_data_files/broken_data_compressed_bytes into distribution_queue 2021-05-04 22:48:07 +03:00
Azat Khuzhin
79ed35876e DirectoryMonitor: Remove const qualifier and lots of mutable qualifiers 2021-03-03 23:30:24 +03:00
Azat Khuzhin
fcf49a4914 Distributed: Calculate counters for async INSERT at INSERT time
Previous patch fixes the inaccuracy, but it's done using iterating over
directory on each request (to system.distribution_queue or to check
bytes_to_throw_insert), and like previous patch alredy stated, it may
have pretty huge overhead (especially when you have lots of distributed
files pending).

This patch remove that recalculation (but it will still be done, and
if there is different, there will be a log message), and replace it with
proper account at INSERT time (and after file has been sent, or marked
as broken).
2021-03-03 23:30:03 +03:00
Azat Khuzhin
b43046ba06 Distributed: More accurate distribution_queue counters
So now system.distribution_queue will show accurate statistics, so tests
does not requires sleep anymore.

But note that with too much distributed pending this will iterate over
all directories.
2021-03-03 23:30:03 +03:00
Azat Khuzhin
8a00816396 Do not mark file for distributed send as broken on EOF
- the sender will got ATTEMPT_TO_READ_AFTER_EOF (added in
  946c275dfb) when the client just go
  away, i.e. server had been restarted, and this is incorrect to mark the
  file as broken in this case.

- since #18853 the file will be checked on the sender locally, and
  in case the file was truncated CANNOT_READ_ALL_DATA will be thrown.
  But before #18853 the sender will not receive
  ATTEMPT_TO_READ_AFTER_EOF from the client in case of file was truncated
  on the sender, since the client will just wait for more data, IOW just hang.

- and I don't see how ATTEMPT_TO_READ_AFTER_EOF can be received while
  reading local file.
2021-01-20 01:10:17 +03:00
Azat Khuzhin
fce8b6b5ef Refactoring distributed header parsing 2021-01-10 18:17:15 +03:00
Azat Khuzhin
ae0b15455f Add fsync_tmp_directory support into DirectoryMonitor 2021-01-09 16:31:52 +03:00
Azat Khuzhin
2e55bd2285 Accept IDisk in DirectoryMonitor (for further fsync) 2021-01-09 16:31:42 +03:00
Azat Khuzhin
59cdc964a1 Do not store reference to BackgroundSchedulePool in DirectoryMonitor (useless) 2020-11-05 23:43:34 +03:00
Azat Khuzhin
a588947fe2 Fix DistributedFilesToInsert metric (zeroed when it should not)
CurrentMetrics::Increment add amount for specified metric only for the
lifetime of the object, but this is not the intention, since
DistributedFilesToInsert is a gauge and after #10263 it can exit from
the callback (and enter again later, for example after SYSTEM STOP
DISTRIBUTED SEND it will always exit from it, until SYSTEM START
DISTRIBUTED SEND).

So make Increment member of a class (this will also fix possible issues
with substructing value on DROP TABLE).
2020-08-27 00:43:00 +03:00
Azat Khuzhin
86c5465bf8 Rewrite StorageSystemDistributionQueue interfaces 2020-06-04 03:04:32 +03:00
Azat Khuzhin
f0050adc51 Make system.distribution_queue metrics non racy 2020-06-04 02:36:16 +03:00
Azat Khuzhin
09c3ca9c6c Add last_exception into system.distribution_queue 2020-06-04 02:36:16 +03:00
Azat Khuzhin
389f78ceee Add system.distribution_queue
system.distribution_queue contains the following columns:
- database
- table
- data_path
- is_blocked
- error_count
- data_files
- data_compressed_bytes
2020-06-04 02:36:16 +03:00
Alexey Milovidov
25f941020b Remove namespace pollution 2020-05-31 00:57:37 +03:00
Alexey Milovidov
146370934a Keep the value of DistributedFilesToInsert metric on exceptions 2020-05-27 13:07:38 +03:00
Azat Khuzhin
6bb39dafc3 Drop decreated code (cond var and note for thread) in DirectoryMonitor 2020-05-03 14:46:41 +03:00
Azat Khuzhin
5d11118cc9 Use thread pool (background_distributed_schedule_pool_size) for distributed sends
After #8756 the problem with 1 thread for each (distributed table, disk)
for distributed sends became even worse (since there can be multiple
disks), so use predefined thread pool for this tasks, that can be
controlled with background_distributed_schedule_pool_size knob.
2020-04-19 12:01:56 +03:00
Ivan Lezhankin
06446b4f08 dbms/ → src/ 2020-04-03 18:14:31 +03:00