Commit Graph

15583 Commits

Author SHA1 Message Date
Michael Kolupaev
418423a304 Slightly more things 2023-12-27 20:24:55 +00:00
Michael Kolupaev
ef4cc5ec7f Things 2023-12-27 20:24:55 +00:00
Michael Kolupaev
a7c369e14f Overhaul timestamp arithmetic 2023-12-27 20:24:55 +00:00
Michael Kolupaev
01369a0a8a Overhaul dependencies 2023-12-27 20:24:54 +00:00
Michael Kolupaev
01345981e2 Overhaul RefreshTask 2023-12-27 20:24:54 +00:00
Michael Kolupaev
5dc04a13a7 Simple review comments 2023-12-27 20:24:54 +00:00
koloshmet
808cb0fa05 fix fix fix 2023-12-27 20:24:54 +00:00
koloshmet
f1161566b4 proper tmp table cleanup 2023-12-27 20:24:54 +00:00
koloshmet
f14114dafc proper tmp table cleanup 2023-12-27 20:24:54 +00:00
koloshmet
d1932763f3 fixed style 2023-12-27 20:24:54 +00:00
koloshmet
c762898adb refreshable materialized views 2023-12-27 20:24:54 +00:00
Alexey Milovidov
f00337e2ba
Merge pull request #57872 from CurtizJ/optimize-aggregation-consecutive-keys
Better optimization of consecutive keys in aggregation
2023-12-27 15:44:22 +01:00
Azat Khuzhin
ebad1bf4f3 Move StorageKafka::createConsumer() into KafkaConsumer
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
Azat Khuzhin
03218202d3 Fix data-race between StorageKafka::startup() and cleanConsumers()
The consumer object can now be created in the ctor; there is no need to do it in
startup(), since the consumer no longer connects to Kafka on construction (a sketch follows this entry).

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
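A minimal sketch of the idea above, assuming hypothetical class and member names (the real StorageKafka code is more involved): once the consumer wrappers are built in the constructor, startup() no longer touches the consumer list, so it cannot race with a concurrent cleanConsumers().

    // Hypothetical sketch, not the real StorageKafka implementation.
    #include <memory>
    #include <mutex>
    #include <vector>

    struct KafkaConsumerStub { /* does not connect to Kafka on construction */ };

    class StorageKafkaSketch
    {
    public:
        explicit StorageKafkaSketch(size_t num_consumers)
        {
            /// Safe without locking: no other threads see this object yet.
            for (size_t i = 0; i < num_consumers; ++i)
                consumers.push_back(std::make_shared<KafkaConsumerStub>());
        }

        void startup()
        {
            /// Nothing to create here any more, so a background
            /// cleanConsumers() cannot race with startup().
        }

        void cleanConsumers()
        {
            std::lock_guard lock(mutex);
            /// ... drop expired consumers from `consumers` ...
        }

    private:
        std::mutex mutex;
        std::vector<std::shared_ptr<KafkaConsumerStub>> consumers;
    };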
Azat Khuzhin
1f03a21033 Update comment for statistics.interval.ms librdkafka option
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
Azat Khuzhin
06a9e9a9ca Use separate thread for kafka consumers cleanup
Tasks in the shared background pool may exceed its number of threads, while this
cleanup must always run to avoid leaking memory, so it gets a dedicated thread (a sketch follows this entry).

One more thread should not be a problem, since librdkafka already spawns multiple
threads (5!) per consumer anyway.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
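A minimal sketch of such a dedicated cleanup thread, with hypothetical names; it only illustrates the "always running, independent of the background pool" idea, not the actual ClickHouse code.

    #include <chrono>
    #include <condition_variable>
    #include <mutex>
    #include <thread>

    class ConsumerCleanupThread
    {
    public:
        ConsumerCleanupThread() : thread([this] { run(); }) {}

        ~ConsumerCleanupThread()
        {
            {
                std::lock_guard lock(mutex);
                stop = true;
            }
            cv.notify_one();
            thread.join();
        }

    private:
        void run()
        {
            std::unique_lock lock(mutex);
            while (!stop)
            {
                /// Wake up periodically regardless of how busy the shared
                /// background pool is, so expired consumers are always freed.
                cv.wait_for(lock, std::chrono::seconds(10));
                if (!stop)
                    cleanExpiredConsumers();
            }
        }

        void cleanExpiredConsumers() { /* drop consumers unused for longer than the TTL */ }

        std::mutex mutex;
        std::condition_variable cv;
        bool stop = false;
        std::thread thread;  /// declared last so it starts after the other members
    };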
Azat Khuzhin
b19b70b8fc Add ability to configure TTL for kafka consumers
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
Azat Khuzhin
2ff0bfb0a1 Preserve KafkaConsumer objects
This makes system.kafka_consumers more useful: prior to this patch the consumer
object was removed once the TTL expired, whereas now all of its information is
preserved (a sketch follows this entry).

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
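One way to read "preserve" here, sketched with hypothetical names (the real KafkaConsumer differs): when the TTL expires, only the underlying librdkafka handle is released, while the wrapper object and the statistics it carries stay visible to system.kafka_consumers.

    #include <memory>
    #include <string>

    struct RdKafkaHandle { /* wraps rd_kafka_t * in the real code */ };

    class KafkaConsumerInfo
    {
    public:
        /// Called when the TTL expires: free the expensive librdkafka handle,
        /// but keep this wrapper so system.kafka_consumers can still show it.
        void releaseHandle() { handle.reset(); }

        bool hasHandle() const { return handle != nullptr; }

        /// Last known statistics survive releaseHandle().
        std::string last_rdkafka_stat;

    private:
        std::unique_ptr<RdKafkaHandle> handle;
    };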
Azat Khuzhin
db74549940 Enable stats for system.kafka_consumers by default again
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
Azat Khuzhin
e7592c140e Create consumers for Kafka tables on the fly (but keep them for 1 min after last use)
The pool of consumers was a problem for librdkafka's internal statistics: its
statistics queue must always be read, while in ClickHouse consumers were created
regardless of whether there were any readers at all (attached materialized
views or direct SELECTs).

Without readers, these statistics messages get queued and never released,
which:
- creates a live memory leak
- makes destruction very slow, due to librdkafka internals (it moves entries
  from this queue into another linked list, with sorting, which is incredibly
  slow for linked lists)

So the idea is simple: create consumers only when they are required, and destroy
them after some timeout (currently 60 seconds) if nobody uses them; that way the
problem should be gone (a sketch follows this entry).

This should also reduce the number of internal librdkafka threads when nobody
reads from Kafka tables.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
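A minimal sketch of the acquire/release pattern described above, with hypothetical names and none of the real locking details: a consumer is created only when a reader asks for one, its last-use time is stamped on release, and a periodic cleanup drops anything idle longer than the TTL (60 seconds in the commit above).

    #include <chrono>
    #include <list>
    #include <memory>
    #include <mutex>

    using Clock = std::chrono::steady_clock;

    struct Consumer { Clock::time_point last_used = Clock::now(); };

    class OnDemandConsumerPool
    {
    public:
        explicit OnDemandConsumerPool(std::chrono::seconds ttl_) : ttl(ttl_) {}

        std::shared_ptr<Consumer> acquire()
        {
            std::lock_guard lock(mutex);
            if (idle.empty())
                return std::make_shared<Consumer>();  /// created only on demand
            auto consumer = idle.front();
            idle.pop_front();
            return consumer;
        }

        void release(std::shared_ptr<Consumer> consumer)
        {
            std::lock_guard lock(mutex);
            consumer->last_used = Clock::now();
            idle.push_back(std::move(consumer));
        }

        /// Called periodically by the cleanup thread.
        void dropExpired()
        {
            std::lock_guard lock(mutex);
            const auto now = Clock::now();
            idle.remove_if([&](const auto & consumer) { return now - consumer->last_used > ttl; });
        }

    private:
        std::chrono::seconds ttl;
        std::mutex mutex;
        std::list<std::shared_ptr<Consumer>> idle;
    };

A reader (an attached materialized view or a direct SELECT) would call acquire() before polling and release() afterwards; with no readers, dropExpired() eventually destroys every consumer along with its librdkafka threads.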
Azat Khuzhin
51d4f583e6 Properly set shutdown_called in StorageKafka::shutdown()
Fixes: https://github.com/ClickHouse/ClickHouse/pull/42777
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
Azat Khuzhin
123d63e824 Remove StorageKafka::num_created_consumers (in favor of all_consumers.size())
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
Alexey Milovidov
a0fccb0498
Merge pull request #58224 from amosbird/part_offset_pk
Primary key analysis for _part_offset
2023-12-26 14:51:57 +01:00
Alexey Milovidov
31a081bd83
Merge pull request #58226 from Algunenano/cleanup_known_short
Cleanup some known short messages
2023-12-26 14:40:58 +01:00
Raúl Marín
e87b9751bd Cleanup some known short messages 2023-12-26 12:58:50 +01:00
Amos Bird
66660ee4e2
Add comment 2023-12-26 17:04:00 +08:00
Amos Bird
bfcccf9fa3
Primary key analysis for _part_offset 2023-12-26 17:03:59 +08:00
santrancisco
a59d874bf9
fix syntax 2023-12-26 16:56:58 +11:00
Azat Khuzhin
3be3b0a280 Fix incorrect Exceptions
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-24 21:26:32 +01:00
Alexey Milovidov
ae51334ba5 Merge branch 'master' into fix-error-in-archive-reader 2023-12-24 05:53:22 +01:00
Alexey Milovidov
e98c49a58f Fix a benign error in archive reader 2023-12-24 05:44:24 +01:00
Alexey Milovidov
3f4c8e4ae8
Merge pull request #58167 from jrdi/part-log-uncompressed-bytes
Add bytes_uncompressed to system.part_log
2023-12-24 04:11:35 +01:00
Alexey Milovidov
b4bf1d1c4c
Merge pull request #58136 from azat/system.stack_trace-rt_tgsigqueueinfo-v2
Fix system.stack_trace for threads with blocked SIGRTMIN (resubmit)
2023-12-24 03:51:13 +01:00
Alexey Milovidov
4f3f69521d
Merge pull request #58173 from ClickHouse/parallel-replicas-used-count
Profile event 'ParallelReplicasUsedCount'
2023-12-24 03:46:09 +01:00
Alexey Milovidov
00fa9085b1
Merge pull request #58178 from chhetripradeep/add-base-backup-name-to-system-tables
Add base backup name to system.backups and system.backup_log tables
2023-12-24 03:38:20 +01:00
Azat Khuzhin
2f6c0487ad Ignore ENOENT for SigBlk check for system.stack_trace
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-23 14:35:38 +01:00
Azat Khuzhin
ac542199c5 Add some comments about racy code for system.stack_trace
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-23 13:42:26 +01:00
Igor Nikonov
d644a208bf Merge remote-tracking branch 'origin/master' into parallel-replicas-used-count 2023-12-23 11:02:28 +00:00
Igor Nikonov
3a485a8bbf Fix: moved-from request object was used 2023-12-23 11:02:24 +00:00
Alexey Milovidov
dc4b9a1013 Obfuscator: keep settings and timezones 2023-12-23 04:55:55 +01:00
Pradeep Chhetri
b5c8c4050b Add base backup name to system.backups and system.backup_log tables 2023-12-23 11:08:50 +08:00
Jordi Villar
bff0b9c790 Fix mutations new part uncompressed bytes 2023-12-22 22:33:58 +01:00
Igor Nikonov
1deafa1a00 Profile event 'ParallelReplicasUsedCount' 2023-12-22 20:54:52 +00:00
Jordi Villar
b4c3969d3a Add bytes_uncompressed to system.part_log 2023-12-22 18:35:33 +01:00
Alexey Milovidov
08ff37f64e
Merge pull request #57682 from azat/system.stack_trace/analyzer
Add support for system.stack_trace filtering optimizations for analyzer
2023-12-22 16:28:28 +01:00
Azat Khuzhin
d29762f19f Do not send signals to threads that block SIGRTMIN for system.stack_trace
That way we avoid needless timeouts while reading from system.stack_trace,
waiting for threads that will never reply (a sketch of the check follows this entry).

Two known cases of such threads are:
- rdk: -- librdkafka threads
- iou-wrk -- io_uring threads

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-22 12:41:20 +01:00
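A sketch of how such threads can be detected on Linux; the helper name is hypothetical and the real implementation may differ. The SigBlk field of /proc/<pid>/task/<tid>/status is a hexadecimal mask in which bit (sig - 1) is set when the thread blocks signal number sig.

    #include <cstdint>
    #include <fstream>
    #include <string>
    #include <sys/types.h>

    /// Returns true if the given thread blocks the given signal,
    /// according to the SigBlk mask in its /proc status file.
    bool threadBlocksSignal(pid_t pid, pid_t tid, int sig)
    {
        std::ifstream status("/proc/" + std::to_string(pid) + "/task/" + std::to_string(tid) + "/status");
        std::string line;
        while (std::getline(status, line))
        {
            if (line.rfind("SigBlk:", 0) == 0)
            {
                const std::uint64_t mask = std::stoull(line.substr(7), nullptr, 16);
                return (mask >> (sig - 1)) & 1;  /// bit N-1 corresponds to signal N
            }
        }
        return false;  /// file gone (thread already exited) or no SigBlk line
    }

Threads for which this returns true (e.g. librdkafka "rdk:" or io_uring "iou-wrk" workers) are simply skipped instead of being signalled and waited for.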
Azat Khuzhin
aa5a6449f0 Fix system.stack_trace for threads with blocked SIGRTMIN
Some third-party libraries (e.g. librdkafka) may block it, and in that case
system.stack_trace returns the stack trace of the main process (usually; in fact
it could be any thread that has the signal unblocked).

By replacing sigqueue() with the more precise rt_tgsigqueueinfo(), only the
targeted thread receives the signal, so other threads cannot respond in its place (a sketch follows this entry).

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
(cherry picked from commit 106042cf41)
2023-12-21 19:41:56 +01:00
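A minimal sketch of the syscall swap (Linux only, error handling omitted): unlike sigqueue(), which targets a whole process, rt_tgsigqueueinfo() addresses a single thread of a thread group. There is no glibc wrapper, so it is invoked through syscall(). The function name and signature below are illustrative.

    #include <csignal>
    #include <cstring>
    #include <sys/syscall.h>
    #include <sys/types.h>
    #include <unistd.h>

    /// Queue `sig` with an integer payload to exactly one thread (tid) of a process (tgid).
    long queueSignalToThread(pid_t tgid, pid_t tid, int sig, int payload)
    {
        siginfo_t info;
        std::memset(&info, 0, sizeof(info));
        info.si_signo = sig;
        info.si_code = SI_QUEUE;  /// must be SI_QUEUE (negative) when sent from user space
        info.si_pid = getpid();
        info.si_uid = getuid();
        info.si_value.sival_int = payload;

        return syscall(SYS_rt_tgsigqueueinfo, tgid, tid, sig, &info);
    }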
Raúl Marín
2e522b9405 Remove requestUnconditionalRetry
It's confusing: only I used it, and I used it wrong
2023-12-21 19:19:50 +01:00
Raúl Marín
ceed935b30 Remove debug comment and fix unconditional_retry logic 2023-12-21 17:11:34 +01:00
Raúl Marín
6d9da8edd5 Merge remote-tracking branch 'blessed/master' into zk_retries_quorum 2023-12-21 17:03:29 +01:00