93 Commits

Author SHA1 Message Date
Nikolai Kochetov
78e1db209f
Remove more data streams (#29491)
* Remove more streams.

* Fixing build.

* Fixing build.

* Rename files.

* Fix fast test.

* Fix StorageKafka.

* Try fix kafka test.

* Move createBuffer to KafkaSource ctor.

* Revert "Move createBuffer to KafkaSource ctor."

This reverts commit 81fa94d27e.

* Revert "Try fix kafka test."

This reverts commit 2107e54969.

* Comment some rows in test.

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-10-07 11:26:08 +03:00
Azat Khuzhin
ae5ee2dd28 Move macros for distributed engine into separate header 2021-10-03 14:34:03 +03:00
Alexey Milovidov
fe6b7c77c7 Rename "common" to "base" 2021-10-02 10:13:14 +03:00
Nikolai Kochetov
341553febd Fix build. 2021-09-16 20:40:42 +03:00
Nikolai Kochetov
66a76ab70f Rewrite PushingToViewsBlockOutputStream part 6 2021-09-03 20:29:36 +03:00
Nikolai Kochetov
3ed3f7a9f7 Fix integration tests. 2021-07-22 13:38:22 +03:00
Nikolai Kochetov
65d3e713d6 Fix another one test. 2021-07-21 15:16:13 +03:00
Nikolai Kochetov
179ec05a72 Remove some streams. 2021-07-20 21:18:43 +03:00
alexey-milovidov
b16e01507f
Merge pull request #26464 from azat/ubsan-dir-mon-fix
Fix undefined-behavior in DirectoryMonitor (for exponential back off)
2021-07-17 18:18:42 +03:00
alexey-milovidov
ca37548888
Merge pull request #26430 from azat/fix-dist-msg
Fix "While sending batch" (on Distributed async send)
2021-07-17 13:01:36 +03:00
Azat Khuzhin
d2967ffa0b Fix undefined-behavior in DirectoryMonitor (for exponential back off)
UBsan reports [1]:

    ../src/Storages/Distributed/DirectoryMonitor.cpp:435:54: runtime error: 2.30584e+19 is outside the range of representable values of type 'unsigned long'

  [1]: https://clickhouse-test-reports.s3.yandex.net/0/10f3500b3be73c9498d994d189784c7d44ed6793/stress_test_(undefined).html#fail1
2021-07-17 12:18:05 +03:00
Azat Khuzhin
80e614318c Fix "While sending batch" (on Distributed async send) 2021-07-16 22:27:46 +03:00
Azat Khuzhin
a3653bd665 Fix overflow in exponential sleep in DirectoryMonitor
UBsan reports:

    SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../src/Storages/Distributed/DirectoryMonitor.cpp:435:53 in
    ../src/Storages/Distributed/DirectoryMonitor.cpp:435: runtime error: 1.15292e+19 is outside the range of representable values of type 'long'
        0 0x1df0c286 in DB::StorageDistributedDirectoryMonitor::run() obj-x86_64-linux-gnu/../src/Storages/Distributed/DirectoryMonitor.cpp:435:53

It is pretty easy to reproduce by limiting max_server_memory_usage
before starting the test.
2021-07-16 04:10:47 +03:00
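
The overflow in the UBsan report above comes from converting an out-of-range floating-point value to an integer. A minimal sketch of a safe exponential back-off follows, assuming the delay is computed as base * 2^error_count. The function name, parameters, and clamping strategy are illustrative and not taken from DirectoryMonitor.cpp.

```cpp
#include <chrono>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not the actual ClickHouse code).
// Assumes base_sleep and max_sleep are non-negative.
std::chrono::milliseconds backoffSleepTime(
    std::chrono::milliseconds base_sleep,
    std::chrono::milliseconds max_sleep,
    std::uint64_t error_count)
{
    // Scale in floating point, where a huge exponent yields +inf instead of overflowing.
    const double scaled
        = static_cast<double>(base_sleep.count()) * std::exp2(static_cast<double>(error_count));
    const double limit = static_cast<double>(max_sleep.count());

    // Decide while still in double: converting an out-of-range double to an
    // integer type is undefined behavior. The negated comparison also
    // catches NaN (for example 0 * inf).
    if (!(scaled < limit))
        return max_sleep;

    return std::chrono::milliseconds(static_cast<std::chrono::milliseconds::rep>(scaled));
}
```

The key design choice is to clamp before the integer conversion rather than after it, so the cast only ever sees values below max_sleep.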
Azat Khuzhin
f3d3ec44a6 Add ability to set Distributed directory monitor settings via CREATE TABLE 2021-07-16 04:10:47 +03:00
alexey-milovidov
0a26687115
Merge pull request #23864 from azat/dist-split-batch-and-retry
Add ability to split distributed batch on failures (i.e. due to memory limits)
2021-06-27 19:28:26 +03:00
Azat Khuzhin
3bd53c68f9 Try to split the batch in case of broken batch too
Broken batches may result from an abnormal server shutdown (and a lack
of fsync), and ignoring the whole batch is not great in this case, so
apply the same split logic here too.

v2: rename exception
v3: catch missing exception
v4: fix marking the file as broken multiple times (fixes
test_insert_distributed_async_send with setting enabled)
2021-06-23 02:48:47 +03:00
Azat Khuzhin
a0209178cc Add ability to split distributed batch on failures
Add distributed_directory_monitor_split_batch_on_failure setting (OFF by
default), that will split the batch and send files one by one in case of
retriable errors.

v2: more error codes
2021-06-23 02:48:47 +03:00
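
The split-on-failure behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the ClickHouse implementation: `RetriableError`, `SendFn`, and `sendBatchWithSplit` are hypothetical names, and the real code decides retriability by error code.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical exception standing in for the retriable error codes.
struct RetriableError : std::runtime_error
{
    using std::runtime_error::runtime_error;
};

// Hypothetical sender: sends all given files as one batch, throws on failure.
using SendFn = std::function<void(const std::vector<std::string> &)>;

// Tries the whole batch first. On a retriable failure (and only if the
// setting is enabled) falls back to sending files one by one.
// Returns the files that still failed when sent individually.
std::vector<std::string> sendBatchWithSplit(
    const std::vector<std::string> & files, const SendFn & send, bool split_batch_on_failure)
{
    try
    {
        send(files);
        return {};
    }
    catch (const RetriableError &)
    {
        if (!split_batch_on_failure || files.size() <= 1)
            throw; // Keep the old behavior: the whole batch fails.
    }

    std::vector<std::string> failed;
    for (const auto & file : files)
    {
        try
        {
            send({file});
        }
        catch (const RetriableError &)
        {
            failed.push_back(file);
        }
    }
    return failed;
}
```

With the flag off, the function simply rethrows, which matches the "OFF by default" wording of the commit.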
Azat Khuzhin
e148ef739d Drop replicas from dirname for internal_replication=true
Under use_compact_format_in_distributed_parts_names=1 and
internal_replication=true the server encodes all replicas into the
directory name for async INSERT into Distributed, and the directory name
looks like:

    shard1_replica1,shard1_replica2,shard3_replica3

This is required for creating connections (to specific replicas only),
but in case of internal_replication=true, this can be avoided, since
this path will always include all replicas.

This patch replaces all replicas with "_all_replicas" marker.

Note that the initial problem was that this path may overflow NAME_MAX
if you have more than 15 replicas, and the server will then fail to
create the directory.

Also note that the changed directory name should not be a problem, since:
- empty directories are removed since #16729
- and replicas encoded in the directory name are still supported anyway.
2021-06-23 02:47:38 +03:00
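
A small sketch of the naming scheme this commit describes, assuming the compact form is `shard<N>_replica<M>` joined by commas and that the marker is appended to the shard prefix. The helper name and its parameters are hypothetical, and the exact strings produced by the server may differ.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper illustrating why the "_all_replicas" marker keeps the
// directory name short regardless of how many replicas a shard has.
std::string makeDirectoryName(
    std::size_t shard_index, const std::vector<std::size_t> & replica_indexes, bool internal_replication)
{
    const std::string shard = "shard" + std::to_string(shard_index);

    // With internal replication the path covers every replica anyway,
    // so there is no need to list them.
    if (internal_replication)
        return shard + "_all_replicas";

    // Otherwise every replica is encoded, and the name grows linearly.
    std::string name;
    for (std::size_t replica : replica_indexes)
    {
        if (!name.empty())
            name += ',';
        name += shard + "_replica" + std::to_string(replica);
    }
    return name;
}
```

The name length is constant in the first branch and proportional to the replica count in the second, which is where the NAME_MAX overflow came from.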
kssenii
ab1a05a1f4 Poco::Path to fs::path, less concatenation 2021-05-09 14:59:49 +03:00
kssenii
02288359c5 Less manual concatenation of paths 2021-05-08 13:59:55 +03:00
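
As an illustration of what "less manual concatenation" looks like with std::filesystem, here is a minimal sketch. The helper name and the `broken` subdirectory layout are assumptions made for the example, not a quote of the changed code.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: builds a path with operator/ instead of gluing
// strings and separators together by hand.
fs::path brokenFilePath(const fs::path & shard_dir, const std::string & file_name)
{
    // operator/ inserts a separator only where one is needed, so a trailing
    // slash on shard_dir does not produce a doubled separator.
    return shard_dir / "broken" / file_name;
}
```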
kssenii
2dabdd0f73 Merge branch 'master' of github.com:ClickHouse/ClickHouse into poco-file-to-std-fs 2021-05-05 18:42:40 +03:00
Azat Khuzhin
9c6e8e1462 Add BrokenDistributedFilesToInsert new metric
Number of files for asynchronous insertion into Distributed tables that
have been marked as broken. This metric starts from 0 on start.
The number of files for every shard is summed.
2021-05-04 22:48:07 +03:00
Azat Khuzhin
74269882f7 Add broken_data_files/broken_data_compressed_bytes into distribution_queue 2021-05-04 22:48:07 +03:00
Azat Khuzhin
5e33604c4d Add file paths into logs on failed distributed async sends 2021-05-03 08:55:38 +03:00
kssenii
deb4903af8 Merge branch 'master' of github.com:ClickHouse/ClickHouse into poco-file-to-std-fs 2021-04-28 20:57:13 +03:00
kssenii
1e4a61ce63 Fix build 2021-04-27 20:22:39 +03:00
kssenii
eeb71672a0 Change in Storages/* 2021-04-27 16:49:37 +03:00
Alexey Milovidov
77e64b3ebd Merge branch 'master' into protocol-compression-auto 2021-04-17 16:46:51 +03:00
Azat Khuzhin
d2cf03ea41 Change logging from trace to debug for messages with rows/bytes 2021-04-15 21:00:16 +03:00
Alexey Milovidov
6f56c3280f Uncompress data in Distributed sends if needed 2021-04-14 00:53:39 +03:00
Ivan
495c6e03aa
Replace all Context references with std::weak_ptr (#22297)
* Replace all Context references with std::weak_ptr

* Fix shared context captured by value

* Fix build

* Fix Context with named sessions

* Fix copy context

* Fix gcc build

* Merge with master and fix build

* Fix gcc-9 build
2021-04-11 02:33:54 +03:00
Azat Khuzhin
c27b931f6a Slightly improve logging messages for Distributed async sends
- add took time (in ms)
- add rows/bytes
2021-04-08 08:10:39 +03:00
Azat Khuzhin
27d4fbd13b Compare Block itself for distributed async INSERT batches
INSERT into Distributed with insert_distributed_sync=0 stores the
distributed batches on disk for sending in the background.

But types may be slightly different for the Distributed table and its
underlying table, so the initiator needs to know whether conversion is
required or not.

Before this patch those on-disk distributed batches contained a header
that included dumpStructure() for the block in that batch. However,
dumpStructure() covers more than just names and types, and besides, it
is a debug method.

So instead of storing a string representation of the block header we
should store an empty block in the file header (note that we cannot
store the empty block outside the header, since this would require
reading all blocks from the file, due to some trickery in the readers'
interface).

Note, that this patch also contains tiny refactoring:
- s/header/distributed_header/

v1: dumpNamesAndTypes()
v2: dump empty block into the batch itself
v3: move empty block into the header
2021-04-06 10:05:21 +03:00
Azat Khuzhin
79ed35876e DirectoryMonitor: Remove const qualifier and lots of mutable qualifiers 2021-03-03 23:30:24 +03:00
Azat Khuzhin
15f7459cae Distributed/DirectoryMonitor: protect metric_pending_files with metrics_lock
There is a local value that is not atomic, and we already have a lock
for metrics anyway, so it is fine.
2021-03-03 23:30:03 +03:00
Azat Khuzhin
70049db143 CurrentMetrics/Increment: Introduce add() 2021-03-03 23:30:03 +03:00
Azat Khuzhin
017c054a35 Distributed/DirectoryMonitor: Use std::lock_guard over std::unique_lock
It is more natural, since we do not need lazy locking.
2021-03-03 23:30:03 +03:00
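
To illustrate the difference the commit above refers to, here is a minimal sketch with a hypothetical `PendingFilesMetric` class (the class and its members are illustrative, not the DirectoryMonitor code). `std::lock_guard` locks immediately and for the whole scope, while `std::unique_lock` is only needed when locking must be deferred or released early.

```cpp
#include <cstddef>
#include <mutex>

// Hypothetical counter guarded by a mutex.
class PendingFilesMetric
{
public:
    // Plain scoped locking: std::lock_guard is enough.
    void add(std::size_t files)
    {
        std::lock_guard<std::mutex> lock(metrics_lock);
        pending_files += files;
    }

    // Lazy locking: the case where std::unique_lock is actually required.
    void addDeferred(std::size_t files)
    {
        std::unique_lock<std::mutex> lock(metrics_lock, std::defer_lock);
        // ... work that does not need the lock would go here ...
        lock.lock();
        pending_files += files;
    }

    std::size_t get()
    {
        std::lock_guard<std::mutex> lock(metrics_lock);
        return pending_files;
    }

private:
    std::mutex metrics_lock;
    std::size_t pending_files = 0;
};
```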
Azat Khuzhin
fcf49a4914 Distributed: Calculate counters for async INSERT at INSERT time
The previous patch fixes the inaccuracy, but it does so by iterating over
the directory on each request (to system.distribution_queue or to check
bytes_to_throw_insert), and as the previous patch already stated, this may
have pretty huge overhead (especially when you have lots of distributed
files pending).

This patch removes that recalculation (though it will still be done, and
if the result differs, there will be a log message), and replaces it with
proper accounting at INSERT time (and after the file has been sent or
marked as broken).
2021-03-03 23:30:03 +03:00
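
The accounting approach described above can be sketched as follows. All names (`QueueStatus`, `QueueCounters`, `reconcile`) are hypothetical, and the sketch assumes that every finished file was previously queued with the same byte size.

```cpp
#include <cstddef>
#include <mutex>

// Hypothetical snapshot of the pending queue.
struct QueueStatus
{
    std::size_t files_count = 0;
    std::size_t bytes_count = 0;
};

// Counters are updated when a file is queued or finished, so reading them
// does not require iterating over the directory.
class QueueCounters
{
public:
    void onFileQueued(std::size_t bytes)
    {
        std::lock_guard<std::mutex> lock(mutex);
        status_.files_count += 1;
        status_.bytes_count += bytes;
    }

    // Called after the file has been sent or marked as broken.
    void onFileFinished(std::size_t bytes)
    {
        std::lock_guard<std::mutex> lock(mutex);
        status_.files_count -= 1;
        status_.bytes_count -= bytes;
    }

    QueueStatus status()
    {
        std::lock_guard<std::mutex> lock(mutex);
        return status_;
    }

    // Compares against a full recount of the directory and adopts it on
    // mismatch. Returns false when a difference was found, which is where
    // a caller would write the log message.
    bool reconcile(const QueueStatus & recounted)
    {
        std::lock_guard<std::mutex> lock(mutex);
        const bool same = status_.files_count == recounted.files_count
            && status_.bytes_count == recounted.bytes_count;
        status_ = recounted;
        return same;
    }

private:
    std::mutex mutex;
    QueueStatus status_;
};
```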
Azat Khuzhin
b43046ba06 Distributed: More accurate distribution_queue counters
So now system.distribution_queue will show accurate statistics, and tests
do not require sleep anymore.

But note that this will iterate over all directories, which matters when
there are too many pending distributed files.
2021-03-03 23:30:03 +03:00
Azat Khuzhin
02198d091e Add proper checks while parsing directory names for async INSERT (fixes SIGSEGV) 2021-02-15 10:53:41 +03:00
Azat Khuzhin
f53c9a6b25 Fix "Block structure mismatch" for INSERT into Distributed
Add missing conversion (via ConvertingBlockInputStream) for INSERT into
remote nodes (for sync insert, async insert and async batch insert),
like for local nodes (in DistributedBlockOutputStream::writeBlockConverted).

This is required when the structure of the Distributed table differs
from the structure of the local table.

And also add a warning message, to highlight this in logs (since this
works slower).

Fixes: #19888
2021-02-02 10:16:41 +03:00
Anton Popov
c7070da85a better abstractions in disk interface 2021-01-26 17:49:35 +03:00
Azat Khuzhin
109dbe5df4 Check the stream before sending while handling async INSERTs into Distributed
It is possible to get corruption (even though it is very unlikely, and
initially it wasn't corruption) just before the data block goes in the
file on disk, and in case of batching, it will break the packets, since
it will write the packet type but will not write any data after.
2021-01-22 21:29:58 +03:00
Azat Khuzhin
8a00816396 Do not mark file for distributed send as broken on EOF
- the sender will get ATTEMPT_TO_READ_AFTER_EOF (added in
  946c275dfb) when the client just goes
  away, i.e. the server has been restarted, and it is incorrect to mark the
  file as broken in this case.

- since #18853 the file is checked on the sender locally, and
  in case the file was truncated CANNOT_READ_ALL_DATA will be thrown.
  But before #18853 the sender would not receive
  ATTEMPT_TO_READ_AFTER_EOF from the client in case the file was truncated
  on the sender, since the client would just wait for more data, IOW just hang.

- and I don't see how ATTEMPT_TO_READ_AFTER_EOF can be received while
  reading a local file.
2021-01-20 01:10:17 +03:00
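
The reasoning in the list above amounts to a classification of error codes. A minimal sketch follows, using a hypothetical enum whose names mirror the codes in the commit message. The real server checks more codes than these, and `NETWORK_ERROR` is included only as an example of some other transient failure.

```cpp
// Hypothetical subset of error codes, for illustration only.
enum class SendError
{
    ATTEMPT_TO_READ_AFTER_EOF, // the remote side went away, e.g. a restart
    CANNOT_READ_ALL_DATA,      // the local file is truncated
    NETWORK_ERROR              // some other transient failure
};

// Returns true only for errors that indicate the on-disk file itself is bad.
// Errors that merely mean "the peer disappeared" leave the file for a retry.
bool shouldMarkFileAsBroken(SendError error)
{
    switch (error)
    {
        case SendError::CANNOT_READ_ALL_DATA:
            return true;
        case SendError::ATTEMPT_TO_READ_AFTER_EOF:
        case SendError::NETWORK_ERROR:
            return false;
    }
    return false;
}
```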
Azat Khuzhin
ecae6c1c60 Avoid reading the distributed batch just to read the block header
Before this patch the batched mode of the DirectoryMonitor was 2x slower
than non-batched; after it, it should be more or less the same as non-batched.
2021-01-14 22:38:46 +03:00
Azat Khuzhin
56475774d3 Fix readability-static-definition-in-anonymous-namespace in DirectoryMonitor 2021-01-10 23:57:40 +03:00
Azat Khuzhin
2565d2ac44 Verify compressed headers while sending distributed batches
Before this patch the DirectoryMonitor was checking the compressed file
by reading it one more time (since without this the receiver may get
stuck on a truncated file). This is inefficient, and we can just check
the checksums before sending.

But note that this may decrease batch size that is used for sending over
network.
2021-01-10 21:23:42 +03:00
Azat Khuzhin
819b9d7d56 Add more metadata into distributed .bin files to avoid doing the same on sending
Before this patch StorageDistributedDirectoryMonitor was reading .bin files
in batch mode just to calculate the number of bytes/rows. This is very
inefficient, so let's just store them in the header (rows/bytes).
2021-01-10 18:17:15 +03:00
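
The idea of keeping rows/bytes in the file header can be sketched as below. The struct layout, field order, and host-endian raw encoding are assumptions made for the example and do not reflect the real .bin format.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical header: totals are written once at INSERT time so the sender
// can learn them without decoding the data that follows.
struct DistributedHeaderSketch
{
    std::uint64_t rows = 0;
    std::uint64_t bytes = 0;
};

void writeHeader(std::ostream & out, const DistributedHeaderSketch & header)
{
    out.write(reinterpret_cast<const char *>(&header.rows), sizeof(header.rows));
    out.write(reinterpret_cast<const char *>(&header.bytes), sizeof(header.bytes));
}

// Reads only the fixed-size header. Returns false if the stream is too short.
bool readHeader(std::istream & in, DistributedHeaderSketch & header)
{
    in.read(reinterpret_cast<char *>(&header.rows), sizeof(header.rows));
    in.read(reinterpret_cast<char *>(&header.bytes), sizeof(header.bytes));
    return static_cast<bool>(in);
}
```

Reading the totals costs 16 bytes of I/O here, regardless of how much data follows the header.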
Azat Khuzhin
fce8b6b5ef Refactoring distributed header parsing 2021-01-10 18:17:15 +03:00
Azat Khuzhin
676bc83c6d Check per-block checksum of the distributed batch on the sender before sending
This is already done for distributed_directory_monitor_batch_inserts=1,
so let's do the same for the non-batched mode, since otherwise, if
the file is truncated, the receiver will just get stuck (since it will
wait for the block, but the sender will not send it).
2021-01-10 18:17:14 +03:00
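
A rough sketch of checking per-block checksums on the sender before anything goes over the network. FNV-1a is used here only as a simple stand-in hash, not the checksum ClickHouse uses for compressed blocks, and `StoredBlock` and `allBlocksValid` are hypothetical names.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Stand-in checksum (64-bit FNV-1a), chosen only to keep the example short.
std::uint64_t fnv1a(const std::string & data)
{
    std::uint64_t hash = 14695981039346656037ULL;
    for (unsigned char c : data)
    {
        hash ^= c;
        hash *= 1099511628211ULL;
    }
    return hash;
}

// Hypothetical on-disk block: the checksum was stored when the block was written.
struct StoredBlock
{
    std::uint64_t checksum;
    std::string payload;
};

// Verifies every block locally. A truncated or corrupted block is detected
// here, so the receiver is never left waiting for data that will not arrive.
bool allBlocksValid(const std::vector<StoredBlock> & blocks)
{
    for (const auto & block : blocks)
        if (fnv1a(block.payload) != block.checksum)
            return false;
    return true;
}
```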