Commit Graph

131464 Commits

Author SHA1 Message Date
Alexey Milovidov
f00337e2ba
Merge pull request #57872 from CurtizJ/optimize-aggregation-consecutive-keys
Better optimization of consecutive keys in aggregation
2023-12-27 15:44:22 +01:00
Alexey Milovidov
27bcdbe14a
Update src/Core/Settings.h
Co-authored-by: Antonio Andelic <antonio2368@users.noreply.github.com>
2023-12-27 17:41:30 +03:00
Alexey Milovidov
1a93fd7f7d
Merge pull request #57829 from azat/kafka-fix-stat-leak
Create consumers for Kafka tables on the fly (but keep them for some period since last use)
2023-12-27 15:35:05 +01:00
Alexey Milovidov
ad42115994
Merge pull request #58175 from Avogar/fix-array-subcolumns-read
Fix possible PARAMETER_OUT_OF_BOUND error during subcolumns reading from wide part in MergeTree
2023-12-27 15:30:02 +01:00
robot-clickhouse-ci-2
bd43660255
Merge pull request #58245 from Algunenano/fix_perf_readme
Fix perf test README
2023-12-27 13:33:50 +01:00
Alexey Milovidov
5f6318e51b
Merge pull request #58151 from ClickHouse/fix-legend
Fix dashboard legend sorting and rows number
2023-12-27 13:33:00 +01:00
Alexey Milovidov
d0bc4fafdb
Merge pull request #58220 from wangtZJU/fix_AddDefaultDatabaseVisitor_bad_performance
fix CREATE VIEW hang
2023-12-27 13:30:37 +01:00
Alexey Milovidov
77ec4a0422
Merge pull request #58218 from kitaisreal/merge-tree-automatically-derive-do-not-merge-across-partitions-select-final-setting
MergeTree derives the do_not_merge_across_partitions_select_final setting
2023-12-27 13:26:17 +01:00
Raúl Marín
f8d9a850c7 Fix perf test README 2023-12-27 12:16:17 +00:00
Azat Khuzhin
ebad1bf4f3 Move StorageKafka::createConsumer() into KafkaConsumer
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
Azat Khuzhin
03218202d3 Fix data-race between StorageKafka::startup() and cleanConsumers()
Now we can actually create the consumer object in the ctor; there is no
need to do this in startup(), since the consumer no longer connects to
Kafka on creation.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
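
A minimal sketch of the idea (hypothetical names, not the actual StorageKafka code): once constructing a consumer no longer performs any network I/O, it can happen in the storage constructor, before startup() or any cleanup thread exists, so there is nothing left to race on.

```cpp
// Illustrative sketch only (hypothetical StorageKafkaStub / KafkaConsumerStub names).
#include <memory>
#include <vector>

struct KafkaConsumerStub
{
    // No network I/O here: the real connection is established lazily on first poll.
    KafkaConsumerStub() = default;
};

class StorageKafkaStub
{
public:
    explicit StorageKafkaStub(size_t num_consumers)
    {
        // Created in the ctor: no background cleanup thread exists yet,
        // so there is no concurrent access to `consumers`.
        for (size_t i = 0; i < num_consumers; ++i)
            consumers.push_back(std::make_shared<KafkaConsumerStub>());
    }

    void startup()
    {
        // Only starts background work; no longer mutates `consumers`.
    }

private:
    std::vector<std::shared_ptr<KafkaConsumerStub>> consumers;
};
```
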
Azat Khuzhin
1f03a21033 Update comment for statistics.interval.ms librdkafka option
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
Azat Khuzhin
06a9e9a9ca Use separate thread for kafka consumers cleanup
The pool may run out of free threads, while this cleanup thread needs
to run at all times to avoid leaking memory.

And this should not be a problem since librdkafka has multiple threads
for each consumer (5!) anyway.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
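
A minimal sketch of such a dedicated cleanup thread (hypothetical names, not the actual implementation): the reaper owns its own std::thread rather than borrowing one from a shared pool, so it is always available regardless of pool load.

```cpp
// Sketch of a dedicated reaper thread that periodically drops idle consumers.
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

class KafkaConsumerReaper
{
public:
    KafkaConsumerReaper() { worker = std::thread([this] { run(); }); }

    ~KafkaConsumerReaper()
    {
        {
            std::lock_guard lock(mutex);
            stop = true;
        }
        cv.notify_one();
        worker.join();
    }

private:
    void run()
    {
        std::unique_lock lock(mutex);
        while (!stop)
        {
            // Wake up periodically and drop consumers idle longer than the TTL.
            cv.wait_for(lock, std::chrono::seconds(10), [this] { return stop; });
            if (!stop)
                cleanIdleConsumers();
        }
    }

    void cleanIdleConsumers() { /* destroy consumers unused for longer than the TTL */ }

    std::mutex mutex;
    std::condition_variable cv;
    bool stop = false;
    std::thread worker;
};
```
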
Azat Khuzhin
a7453f7f14 Allow setThreadName() to truncate the thread name instead of throwing an error
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
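
For context: on Linux, pthread_setname_np() accepts at most 15 characters plus the terminating NUL and fails with ERANGE for longer names. A minimal sketch of the truncating behaviour (not the actual ClickHouse helper):

```cpp
// Sketch only: truncate over-long names instead of failing.
// pthread_setname_np is a GNU extension; compile with -D_GNU_SOURCE.
#include <pthread.h>
#include <cstring>
#include <string>

void setThreadNameTruncating(const std::string & name)
{
    char buf[16] = {};                                 // 15 chars + NUL is the kernel limit
    std::strncpy(buf, name.c_str(), sizeof(buf) - 1);  // silently truncate long names
    pthread_setname_np(pthread_self(), buf);           // now cannot fail with ERANGE
}
```
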
Azat Khuzhin
b19b70b8fc Add ability to configure TTL for kafka consumers
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
Azat Khuzhin
2ff0bfb0a1 Preserve KafkaConsumer objects
This will make system.kafka_consumers more useful: prior to this patch
the consumer object was removed after the TTL expired, but now all of
its information will be preserved.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
Azat Khuzhin
db74549940 Re-enable stats for system.kafka_consumers by default
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
Azat Khuzhin
e7592c140e Create consumers for Kafka tables on the fly (but keep them for 1 min since last use)
The pool of consumers created a problem for librdkafka's internal
statistics: the statistics queue must always be read, while in
ClickHouse consumers were created regardless of whether there were any
readers (attached materialized views or direct SELECTs).

Otherwise, these statistics messages get queued and never released,
which:
- creates a live memory leak
- and also makes destruction very slow, due to librdkafka internals (it
  moves entries from this queue into another linked list, but with
  sorting, which is incredibly slow for linked lists)

So the idea is simple: let's create consumers only when they are
required, and destroy them after some timeout (right now it is 60
seconds) if nobody uses them; that way this problem should be gone.

This should also reduce the number of internal librdkafka threads when
nobody reads from the Kafka tables.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
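
A minimal sketch of the acquire/release-with-TTL pattern described above (hypothetical names; the real StorageKafka code differs): a consumer is created only when a reader asks for one, released with a last-used timestamp, and destroyed once it has been idle longer than the TTL.

```cpp
// Illustrative consumer pool with on-demand creation and TTL-based reaping.
#include <algorithm>
#include <chrono>
#include <memory>
#include <mutex>
#include <vector>

using Clock = std::chrono::steady_clock;

struct Consumer { Clock::time_point last_used = Clock::now(); };

class ConsumerPool
{
public:
    std::shared_ptr<Consumer> acquire()
    {
        std::lock_guard lock(mutex);
        if (!idle.empty())
        {
            auto consumer = idle.back();
            idle.pop_back();
            return consumer;
        }
        return std::make_shared<Consumer>();   // created only when a reader needs it
    }

    void release(std::shared_ptr<Consumer> consumer)
    {
        std::lock_guard lock(mutex);
        consumer->last_used = Clock::now();
        idle.push_back(std::move(consumer));
    }

    void reapIdle(std::chrono::seconds ttl = std::chrono::seconds(60))
    {
        std::lock_guard lock(mutex);
        idle.erase(
            std::remove_if(idle.begin(), idle.end(), [&](const auto & consumer)
            {
                // Destroy consumers nobody has used for longer than the TTL.
                return Clock::now() - consumer->last_used > ttl;
            }),
            idle.end());
    }

private:
    std::mutex mutex;
    std::vector<std::shared_ptr<Consumer>> idle;
};
```
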
Azat Khuzhin
51d4f583e6 Properly set shutdown_called in StorageKafka::shutdown()
Fixes: https://github.com/ClickHouse/ClickHouse/pull/42777
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
Azat Khuzhin
123d63e824 Remove StorageKafka::num_created_consumers (in favor of all_consumers.size())
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 09:49:07 +01:00
Alexey Milovidov
a0fccb0498
Merge pull request #58224 from amosbird/part_offset_pk
Primary key analysis for _part_offset
2023-12-26 14:51:57 +01:00
Alexey Milovidov
4fbe41b47e
Update 02950_part_offset_as_primary_key.sql 2023-12-26 16:51:39 +03:00
Alexey Milovidov
31a081bd83
Merge pull request #58226 from Algunenano/cleanup_known_short
Cleanup some known short messages
2023-12-26 14:40:58 +01:00
Alexey Milovidov
ee1d7f25de
Merge pull request #58221 from ClickHouse/fix_syntax_and_doc
Fix syntax and doc
2023-12-26 14:38:54 +01:00
Yarik Briukhovetskyi
41c275274b
Merge pull request #58148 from yariks5s/fix_s3_regions
S3-links region independency
2023-12-26 12:59:00 +01:00
Raúl Marín
e87b9751bd Cleanup some known short messages 2023-12-26 12:58:50 +01:00
Maksim Kita
71921086ae Fixed tests 2023-12-26 12:41:21 +03:00
Maksim Kita
cbf9304d1f MergeTree automatically derives the do_not_merge_across_partitions_select_final setting 2023-12-26 12:41:21 +03:00
Amos Bird
66660ee4e2
Add comment 2023-12-26 17:04:00 +08:00
Amos Bird
bfcccf9fa3
Primary key analysis for _part_offset 2023-12-26 17:03:59 +08:00
santrancisco
91835256aa
fix doc 2023-12-26 16:57:18 +11:00
santrancisco
a59d874bf9
fix syntax 2023-12-26 16:56:58 +11:00
wangtao.2077
7ad278eefb fix AddDefaultDatabaseVisitor bad performance 2023-12-26 13:31:45 +08:00
Nikolay Degterinsky
85b149395a
Merge pull request #57796 from evillique/replicated-database-forbid-create-as-select
Forbid CREATE AS SELECT for database Replicated
2023-12-25 19:43:28 +01:00
Alexey Milovidov
005023a16d
Merge pull request #58210 from chhetripradeep/pchhetri/param-query-clickhouse-local
Add support for specifying query parameters in the command line in clickhouse-local
2023-12-25 13:34:04 +01:00
Alexey Milovidov
35e27ab1a3
Merge pull request #58213 from azat/tests/processes-cleanup-v2
Fix leftover processes/hangs in tests (resubmit)
2023-12-25 05:07:43 +01:00
Alexey Milovidov
9196c2b994 Follow-up 2023-12-25 04:54:54 +01:00
Alexey Milovidov
c2f93ecf4d
Merge pull request #58211 from ClickHouse/binary-viewer
Binary (symbols) viewer
2023-12-25 04:52:57 +01:00
Nikolay Degterinsky
98a6d67ae3 Disable tests with CREATE AS SELECT for database Replicated 2023-12-24 23:49:26 +00:00
Nikolay Degterinsky
d524951416
Merge pull request #58198 from azat/exception-fmt
Fix all Exception with missing arguments
2023-12-25 00:40:18 +01:00
Azat Khuzhin
3be3b0a280 Fix incorrect Exceptions
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-24 21:26:32 +01:00
Azat Khuzhin
435e1de7b0 Remove Exception's ctor to create it from a simple string-like object
This may cause trouble, like forgetting to pass arguments, and there
are a few such places in the code (see the upcoming patch).

I doubt that this will make any performance difference, since the check
should happen at compile time.

And anyway, Exception is an exceptional situation which should be rare
(there is no such single-argument form for logging, while logging is
more common).

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-24 21:26:31 +01:00
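
A toy illustration of the effect (not the actual DB::Exception): with the string-only constructor deleted, every throw site must pass an error code and format arguments explicitly, and a format string with unfilled placeholders fails to compile instead of producing a broken message.

```cpp
// Minimal sketch of the "delete the string-like constructor" idea.
#include <format>
#include <stdexcept>
#include <string>
#include <utility>

class Exception : public std::runtime_error
{
public:
    Exception(const std::string &) = delete;    // the removed "string-like" constructor

    template <typename... Args>
    Exception(int code, std::format_string<Args...> fmt, Args &&... args)
        : std::runtime_error(std::format(fmt, std::forward<Args>(args)...)
                             + " (code " + std::to_string(code) + ")")
    {}
};

// throw Exception("Cannot open file {}");                 // no longer compiles
// throw Exception(107, "Cannot open file {}", file_name); // placeholders checked at compile time
```
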
Nikolay Degterinsky
033fb14a2a Merge remote-tracking branch 'upstream/master' into replicated-database-forbid-create-as-select 2023-12-24 20:07:12 +00:00
Azat Khuzhin
c5dbde8407 Replace timeout --foreground with one workaround
The problem with the --foreground option is that it sends the signal
only to the process that had been spawned by timeout(1), while that
process can create lots of children; when you kill the parent you close
the pipes, and the children will get EPIPE, like in [1].

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/0/069f8bbb2f48541cc736903e1da5459fa2c27da0/stateless_tests__debug__%5B2_5%5D.html

Another problem is that now the child process will finish correctly,
which may also print some errors like QUERY_WAS_CANCELLED (see [2]).

  [2]: https://s3.amazonaws.com/clickhouse-test-reports/0/ef66714bf20042ba9cb5d59b7839befe26110b93/stateless_tests__release__analyzer_.html

In general this is not actually required, since all timeout invocations
use a timeout value less than the default test limit (10 min). But it
may leave some processes behind when this limit is overridden, e.g.
`clickhouse-test --timeout 1`.

So to work around this at least somehow, let's send SIGTERM and, only
after some timeout (here I use 0.1 s), SIGKILL. This will give at least
some ability to terminate all children that had been spawned by
timeout(1).

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-24 13:23:09 +01:00
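
A minimal POSIX sketch of the SIGTERM-then-SIGKILL escalation described above (assuming the target runs in its own process group so its children are reached too; this is not the actual clickhouse-test code):

```cpp
// Sketch only: terminate a process group gracefully, then force-kill after 0.1 s.
#include <signal.h>
#include <sys/types.h>
#include <time.h>

void terminateProcessGroup(pid_t pgid)
{
    kill(-pgid, SIGTERM);              // ask the whole group to exit cleanly

    timespec grace{0, 100'000'000};    // 0.1 s grace period, as in the commit message
    nanosleep(&grace, nullptr);

    kill(-pgid, SIGKILL);              // then force-kill whatever is left
}
```
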
Pradeep Chhetri
1ed19fc8b1 Add test 2023-12-24 19:59:14 +08:00
Alexey Milovidov
325ccacc45
Merge pull request #58208 from ClickHouse/dbase
Fix DWARFBlockInputFormat using wrong base address sometimes
2023-12-24 12:33:21 +01:00
Alexey Milovidov
6eaa17d5a4
Merge pull request #58206 from ClickHouse/fix-error-in-archive-reader
Fix error in archive reader
2023-12-24 12:32:49 +01:00
Alexey Milovidov
658336f674 Add documentation 2023-12-24 12:22:31 +01:00
Alexey Milovidov
7c18530a8c Add a test 2023-12-24 12:19:31 +01:00
Alexey Milovidov
0e89d01b94 Binary (symbols) viewer 2023-12-24 12:14:45 +01:00