zhenjial
0f788d98f5
new implementation
2022-09-06 20:39:54 +08:00
zhenjial
18db90dcfc
Record errors while reading text formats (CSV, TSV).
2022-09-06 17:19:15 +08:00
Duc Canh Le
6950016b8a
fix grouping set with group_by_use_nulls
2022-09-06 09:39:27 +08:00
zhongyuankai
0bf76fe642
Merge branch 'compress-marks' into compress_marks_and_primary_key
2022-09-06 08:01:43 +08:00
Igor Nikonov
7a8b8e7a39
Optimizer setting: read in window order
...
optimization's setting is checked before applying it, not inside the optimization code
2022-09-05 20:47:11 +00:00
Igor Nikonov
30860290de
Continue
2022-09-05 20:12:00 +00:00
kssenii
83514fa2ef
Refactor
2022-09-05 20:08:22 +02:00
Alexey Milovidov
d7127e4b2d
Make it slightly more sane
2022-09-05 07:26:58 +02:00
Igor Nikonov
8fece1e2d2
Merge branch 'master' into sort_mode_rename
2022-09-04 21:44:33 +02:00
Alexey Milovidov
193cd1b3b2
Merge pull request #39138 from nickitat/control_block_size_in_aggregator
...
Control block size in aggregator
2022-09-04 04:51:00 +03:00
Igor Nikonov
70f779b81d
Just to save
2022-09-02 21:24:33 +00:00
Igor Nikonov
5d7fa55f36
Merge branch 'master' into sort_mode_rename
2022-09-02 23:19:04 +02:00
Kruglov Pavel
77071381e4
fix build
2022-09-02 16:37:33 +02:00
Vladimir C
963c0111bf
Merge pull request #39418 from vdimir/join_and_sets
...
Filter joined streams for `full_sorting_join` by each other before sorting
2022-09-02 13:57:06 +02:00
Antonio Andelic
e64436fef3
Fix typos with new codespell
2022-09-02 08:54:48 +00:00
Robert Schulze
319d8b00a7
Merge pull request #39010 from FrankChen021/tracing_context_propagation
...
Improve the opentelemetry tracing context propagation across threads
2022-09-02 07:56:43 +02:00
Robert Schulze
c7c00f9002
Merge pull request #40739 from ClickHouse/clang-tidy-for-headers
...
Enable clang-tidy for headers
2022-09-02 07:54:50 +02:00
avogar
afc34dca41
Add new JSON formats, add improvements and refactoring
2022-09-01 19:00:24 +00:00
Kruglov Pavel
7a4a65bc36
Make better exception message in schema inference
2022-09-01 20:36:08 +02:00
Kruglov Pavel
f53aa86a20
Merge pull request #40485 from arthurpassos/fix-parquet-chunked-array-deserialization
...
Add support for extended (chunked) arrays for Parquet format
2022-09-01 19:40:40 +02:00
Dmitry Novik
ddadb362cf
Merge pull request #39762 from quickhouse/betterorderbyoptimization
...
Fixed `Unknown identifier (aggregate-function)` exception which appears when a user tries to calculate WINDOW ORDER BY/PARTITION BY expressions over aggregate functions
2022-09-01 18:08:06 +02:00
Frank Chen
9d63cbe811
Merge 'origin/master' into tracing_context_propagation to resolve conflicts
2022-09-01 23:18:59 +08:00
Vladimir C
12e6fc4182
Merge branch 'master' into join_and_sets
2022-09-01 14:56:14 +02:00
Kseniia Sumarokova
c6c67a248d
Merge pull request #40792 from canhld94/ch_canh_intersect_distinct
...
Implement intersect + except distinct
2022-09-01 14:35:26 +02:00
Anton Popov
f7bdf07adc
Merge pull request #38715 from CurtizJ/fix-read-in-order-fixed-prefix
...
Better support of `optimize_read_in_order` in case of fixed prefix of sorting key
2022-09-01 12:59:18 +02:00
Robert Schulze
de64c6b103
Merge branch 'master' into clang-tidy-for-headers
2022-09-01 10:24:56 +02:00
Kruglov Pavel
86516d3bb4
Merge pull request #40740 from amosbird/row-policy-index-fix-1
...
Use index when row_policy_filter is always false
2022-08-31 18:46:14 +02:00
Robert Schulze
cedf75ed5e
Enable clang-tidy for headers
...
clang-tidy now also checks code in header files. Because the analyzer
finds tons of issues, activate the check only for directory "base/" (see
file ".clang-tidy"). All other directories, in particular "src/" are
left to future work.
While many findings were fixed, some were not (and suppressed instead).
Reasons for this include: a) the file is 1:1 copypaste of a 3rd-party
lib (e.g. pcg_extras.h) and fixing stuff would make upgrades/fixes more
difficult b) a fix would have broken lots of using code
2022-08-31 10:48:15 +00:00
Anton Popov
3504781529
Merge branch 'master' into fix-read-in-order-fixed-prefix
2022-08-30 23:32:43 +02:00
Dmitry Novik
0a8378d9cd
Merge branch 'master' into betterorderbyoptimization
2022-08-30 14:23:22 +02:00
vdimir
0f6f3c73b0
Minor fix
2022-08-30 11:57:28 +00:00
Duc Canh Le
8590cc46c4
implement intersect + except distinct
2022-08-30 18:09:01 +08:00
Frank Chen
f17d56b528
Merge branch 'master' into tracing_context_propagation
2022-08-30 14:24:36 +08:00
vdimir
24f62e8486
Throw an error in CreatingSetsOnTheFlyTransform in case of input for finished
2022-08-29 11:27:08 +00:00
vdimir
b0e2616aa9
Style fixes in CreateSetAndFilterOnTheFlyTransform and related
2022-08-29 11:26:21 +00:00
Anton Popov
2a3e012931
Merge branch 'master' into fix-read-in-order-fixed-prefix
2022-08-29 13:17:26 +02:00
vdimir
7915b6948f
Fix build after rebase
2022-08-29 09:49:16 +00:00
vdimir
afb6b7d9cf
Test plan and pipeline for filtering step for join
2022-08-29 09:49:15 +00:00
vdimir
afeff512b5
Aux port for ReadHeadBalancedProcessor is empty Block
2022-08-29 09:49:14 +00:00
vdimir
95f87dc34e
fix sanitizer assert in CreateSetAndFilterOnTheFlyStep
2022-08-29 09:49:12 +00:00
vdimir
c67ab33d90
small fix CreateSetAndFilterOnTheFlyStep
2022-08-29 09:49:11 +00:00
vdimir
51e02d09f6
set preserves_sorting = true for CreateSetAndFilterOnTheFlyStep
2022-08-29 09:49:10 +00:00
vdimir
714c53ab24
fix typos
2022-08-29 09:49:09 +00:00
vdimir
8e1632f824
Create sets for joins: better code
2022-08-29 09:49:08 +00:00
vdimir
7228091ff1
rename CreateSetAndFilterOnTheFlyTransform
2022-08-29 09:49:07 +00:00
vdimir
67a9acc8db
rename CreatingSetOnTheFlyStep -> CreateSetAndFilterOnTheFlyStep
2022-08-29 09:49:07 +00:00
vdimir
d82a75ae75
cleanup PingPongProcessor
2022-08-29 09:49:06 +00:00
vdimir
e472e13c70
move PingPongProcessor/ReadHeadBalancedProceesor into separate file
2022-08-29 09:49:05 +00:00
vdimir
51a51694d6
Create sets for joins: better code
2022-08-29 09:49:01 +00:00
vdimir
c778bba13f
Create sets for joins: wip
2022-08-29 09:47:00 +00:00
vdimir
31a167848d
Fix set finish condition in CreatingSetsOnTheFlyTransform
2022-08-29 09:46:59 +00:00
vdimir
71708d595f
Create sets for joins: wip
2022-08-29 09:46:59 +00:00
vdimir
8f06430ebd
Create sets for joins: upd
2022-08-29 09:46:58 +00:00
vdimir
3292566603
Format bytes in CreatingSetsOnTheFlyTransform logs
2022-08-29 09:46:57 +00:00
vdimir
031aaf3a45
Add Creating/FilterBySetsOnTheFlyStep for full sorting join
2022-08-29 09:46:57 +00:00
vdimir
c5bc7b0a0c
Resize pipeline after full sort join
2022-08-29 09:46:56 +00:00
Azat Khuzhin
f9812d9917
Fix memory leak while pushing to MVs w/o query context (from Kafka/...)
...
While pushign to MVs, there is a low-level code that create
ThreadGroupStatus/ThreadStatus, it is required to gather some metrics
for system.query_views_log.
But, one should not use ThreadGroupStatus of the MainThreadStatus, since
this structure can hold some state, that may not be cleaned, plus this
may be racy, instead it is better to create new ThreadGroupStatus and
attach it instead.
Also this place misses detachQuery(), and because of this it leaks
ThreadGroupStatus::finished_threads_counters_memory. But it is only the
problem pushing to MVs is done w/o query context (i.e. from Kafka/...),
since when it has query context detachQuery() will be called eventually.
Before this patch series, when I've tried the reproducer with
500 MVs attached to Kafka engine (that @den-crane suggested), jemalloc
report looks like this:
$ ../jeprof --text ~/ch/tmp/upstream/clickhouse-binary --base jeprof.44384.0.i0.heap jeprof.44384.167.i167.heap
Using local file /home/azat/ch/tmp/upstream/clickhouse-binary.
Using local file jeprof.44384.167.i167.heap.
Total: 915.6 MB
910.7 99.5% 99.5% 910.7 99.5% Snapshot (inline)
9.5 1.0% 100.5% 9.5 1.0% std::__1::__libcpp_operator_new (inline)
0.5 0.1% 100.6% 0.5 0.1% DB::TasksStatsCounters::create
And with focus to this place:
$ ../jeprof --focus Snapshot --text ~/ch/tmp/upstream/clickhouse-binary --base jeprof.44384.0.i0.heap jeprof.44384.167.i167.heap
Using local file /home/azat/ch/tmp/upstream/clickhouse-binary.
Using local file jeprof.44384.167.i167.heap.
Total: 915.6 MB
910.7 100.0% 100.0% 910.7 100.0% Snapshot (inline)
0.0 0.0% 100.0% 910.7 100.0% DB::QueryPipeline::reset
0.0 0.0% 100.0% 910.7 100.0% DB::StorageKafka::streamToViews
0.0 0.0% 100.0% 910.7 100.0% DB::StorageKafka::threadFunc
0.0 0.0% 100.0% 910.7 100.0% ProfileEvents::Counters::getPartiallyAtomicSnapshot
0.0 0.0% 100.0% 910.7 100.0% ~ThreadStatus
0.0 0.0% 100.0% 910.7 100.0% ~ViewRuntimeData
0.0 0.0% 100.0% 910.7 100.0% ~ViewRuntimeStats (inline)
Actually this report does not looks great (you understand it because I
stripped it), because --text does not that smart, but if you will use
--pdf for the report you will see the stacktrace (will attach pdf to the
pull request).
But after this patch series the process RSS does not goes beyond
~700MiB.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-08-29 11:36:33 +02:00
Amos Bird
15a69bce84
Use index when row_policy_filter is always false
2022-08-29 16:44:32 +08:00
Alexey Milovidov
365a600fdb
Merge branch 'force-documentation-3' of github.com:ClickHouse/ClickHouse into force-documentation-3
2022-08-27 22:28:54 +02:00
Alexey Milovidov
6b2e227c8b
Fix integration test
2022-08-27 22:28:38 +02:00
Vladimir C
e067629e0d
Merge pull request #40239 from vdimir/vdimir/tmp-file-metrics
...
More metrics for on-disk temporary data
2022-08-26 11:28:01 +02:00
Alexander Gololobov
6a69e08799
Merge pull request #40559 from ClickHouse/lwd_vertical_merge_fix
...
Fix vertical merge of parts with lightweight deleted rows
2022-08-25 20:47:44 +02:00
Frank Chen
bb00dcc19b
Remove using namespace from header
...
Signed-off-by: Frank Chen <frank.chen021@outlook.com>
2022-08-25 20:20:13 +08:00
Frank Chen
99c37ce6c6
Merge branch 'master' into tracing_context_propagation
2022-08-25 10:07:16 +08:00
Nikita Taranov
ac34a17551
Merge branch 'master' into control_block_size_in_aggregator
2022-08-24 20:25:28 +02:00
vdimir
91788f29e8
Upd TemporaryFileOnDisk
2022-08-24 16:15:54 +00:00
vdimir
7194df1184
Move back TemporaryFile -> TemporaryFileOnDisk
2022-08-24 16:14:11 +00:00
vdimir
0349c85017
Use getCompressedBytes in BufferingToFileTransform and TemporaryFileStream
2022-08-24 16:14:10 +00:00
vdimir
51c44424cc
More metrics for temp files
2022-08-24 16:14:09 +00:00
vdimir
1321ac87b5
Minor fixes
2022-08-24 16:14:07 +00:00
vdimir
7e0c9062c7
Add ProfileEvents::ExternalSort(Un)CompressedBytes
2022-08-24 16:14:07 +00:00
Kruglov Pavel
e6e7f5db93
Merge pull request #40491 from mini4/fix-settings-input_format_tsv_skip_first_lines
...
Fix bug in settings input_format_tsv_skip_first_lines of format TSV
2022-08-24 15:57:45 +02:00
Alexander Gololobov
1c2dd50ca5
Fix vertical merge of parts with lightweight deleted rows
2022-08-24 15:18:33 +02:00
Kruglov Pavel
0781e8b4f7
Merge pull request #40534 from Avogar/nested-in-avro
...
Support reading Array(Record) into flatten nested table in Avro
2022-08-24 13:33:12 +02:00
Frank Chen
cd19366b44
Move classes into DB::OpenTelemetry namespace
2022-08-24 16:41:40 +08:00
kgurjev
f62c2c3221
Fix bug in settings input_format_tsv_skip_first_lines of format TSV
2022-08-24 10:02:57 +03:00
Kruglov Pavel
72f02bd6eb
Merge pull request #40414 from Avogar/improve-schema-inference-cache
...
Improve schema inference cache, respect format settings that can change the schema
2022-08-23 17:04:58 +02:00
avogar
29a887578b
Fix
2022-08-23 11:42:57 +00:00
avogar
581e569d04
Support reading Array(Record) into flatten nested table in Avro
2022-08-23 11:05:02 +00:00
Arthur Passos
f8e2ab0a20
Use FileReader::GetRecordBatchReader instead of FileReader::ReadRowGroup to parse Parquet
2022-08-22 08:21:32 -03:00
Alexey Milovidov
ab91c99495
Merge branch 'master' into control_block_size_in_aggregator
2022-08-20 21:28:27 +03:00
Alexey Milovidov
74e1f4dc61
Fix clang-tidy
2022-08-20 17:09:20 +02:00
avogar
612ffaffde
Make schema inference cache better, respect format settings that can change the schema
2022-08-19 16:39:13 +00:00
Nikita Taranov
1b6e7b9ca2
Merge branch 'master' into sort_mode_rename
2022-08-19 12:31:59 +02:00
Kruglov Pavel
b67cb9e378
Merge pull request #40173 from Avogar/arrow-dict
...
Improve and fix dictionaries in Arrow format
2022-08-18 20:54:55 +02:00
Kruglov Pavel
09a2ff8843
Merge pull request #40293 from joshuataylor/feature/arrow-large-binary-string
...
Add support for LARGE_BINARY/LARGE_STRING with Arrow
2022-08-18 14:01:58 +02:00
avogar
a6318cecd5
Fix hive test
2022-08-18 11:32:42 +00:00
Nikolai Kochetov
5a85531ef7
Merge pull request #38286 from Avogar/schema-inference-cache
...
Add schema inference cache for s3/hdfs/file/url
2022-08-18 13:07:50 +02:00
Kruglov Pavel
d7056376eb
Merge pull request #40068 from Avogar/schema-inference-hints
...
Allow to specify structure hints in schema inference
2022-08-18 12:19:45 +02:00
Igor Nikonov
6fe8b61345
Merge branch 'master' into sort_mode_rename
2022-08-17 19:19:29 +02:00
Yakov Olkhovskiy
40fd6e189a
call readColumnWithStringData
2022-08-17 09:54:01 -04:00
Kruglov Pavel
19af748737
Fix typo
2022-08-17 14:29:09 +02:00
Kruglov Pavel
00d04456ff
Try reduce code duplication
2022-08-17 14:28:15 +02:00
Vladimir C
b876cc17c9
Merge pull request #39593 from quickhouse/fixexponentialdecaywindowfunctions
...
Fixed point of origin for exponential decay window functions to the last value in window
2022-08-17 14:19:59 +02:00
Igor Nikonov
5ceaeb9e12
Sorting mode renaming
...
+ sort mode -> sort scope
+ Stream -> Global
+ Port -> Stream
2022-08-17 12:19:36 +00:00
avogar
8dd54c043d
Merge branch 'master' of github.com:ClickHouse/ClickHouse into schema-inference-cache
2022-08-17 11:47:40 +00:00
Igor Nikonov
46ed4f6cdf
Merge pull request #38719 from ClickHouse/skipping_sorting_step
...
SortingStep: deduce way to sort based on input stream sort description
2022-08-17 12:58:11 +02:00
Josh Taylor
628d2bbff5
Add support for LARGE_BINARY/LARGE_STRING with Arrow
2022-08-17 10:25:06 +08:00
Nikita Taranov
6bdbaccc37
use max_block_size from settings
2022-08-16 18:56:22 +02:00
Nikita Taranov
63bc894a42
more parallelism
2022-08-16 18:56:22 +02:00
Nikita Taranov
f650b23ee3
generate many blocks
2022-08-16 18:56:22 +02:00
Nikita Taranov
db0110fd7a
more accurate crutch
2022-08-16 18:56:22 +02:00
Nikita Taranov
e5e0a24ab3
return chunks from prepareBlockAndFillWithoutKey
2022-08-16 18:56:22 +02:00
Igor Nikonov
d4367de7bb
Rename setting to optimize_sorting_by_input_stream_properties
2022-08-16 16:27:41 +00:00
Vladimir Chebotaryov
3cc03b141e
Fixed tests on Debug build type.
2022-08-16 15:43:37 +02:00
Vladimir Chebotaryov
66f9bfca61
Fixed point of origin for exponential decay window functions to the last value in window.
2022-08-16 15:43:37 +02:00
avogar
99d8727335
Fix tests
2022-08-16 12:56:51 +00:00
avogar
936c457734
Remove unnended field
2022-08-16 09:51:52 +00:00
avogar
e1ff996ec3
Allow to specify structure hints in schema inference
2022-08-16 09:46:57 +00:00
Maksim Kita
110470809b
Merge pull request #40121 from amosbird/profile-processor-1
...
Extend processors_profile_log with more info
2022-08-16 09:49:12 +02:00
Igor Nikonov
aba00952f5
Fix: don't set sort mode in ReadFromMergeTree if sort description empty
2022-08-15 20:58:20 +00:00
Kruglov Pavel
2c5c0d6d47
Fix typo
2022-08-15 19:55:28 +02:00
avogar
ca0d883c0f
Fix possible segfault in CapnProto input format
2022-08-15 15:36:18 +00:00
Igor Nikonov
ea10fd65b8
Sorting properties in EXPLAIN PLAN
...
~ change formatting for sorting
~ rename sortmode option -> sorting
2022-08-15 15:14:59 +00:00
avogar
c160033837
Fix
2022-08-15 11:38:28 +00:00
Igor Nikonov
d83bea626c
Merge remote-tracking branch 'origin/master' into skipping_sorting_step
2022-08-13 21:46:34 +00:00
Igor Nikonov
f33a0d8c85
More simple way to check if sorting order is preserved
...
- there is a case where it's done wrong
2022-08-12 23:42:37 +00:00
avogar
78e197063c
Better example
2022-08-12 19:08:36 +00:00
avogar
763f84b623
Remove bad comment
2022-08-12 19:05:57 +00:00
avogar
9addded80e
Remove logging
2022-08-12 19:01:02 +00:00
avogar
000336622a
Remove logging
2022-08-12 18:59:52 +00:00
avogar
398576e9c9
Improve and fix dictionaries in Arrow format
2022-08-12 18:56:21 +00:00
Kseniia Sumarokova
a6cfc7bc3b
Merge pull request #34651 from alexX512/master
...
New caching strategies
2022-08-12 17:23:37 +02:00
Anton Popov
4bd50bb06c
Merge branch 'master' into distinct_sorted_simplify
2022-08-12 17:11:18 +02:00
Kruglov Pavel
4c7222d938
Merge pull request #40020 from canhld94/ch_canh_fix_hash
...
fix HashMethodOneNumber with const column
2022-08-12 14:40:24 +02:00
Amos Bird
99a38e41aa
processor profile
2022-08-11 21:03:34 +08:00
Igor Nikonov
75f6fcfa70
Merge remote-tracking branch 'origin/master' into skipping_sorting_step
2022-08-11 12:35:55 +00:00
Amos Bird
fa8fab2e8f
Fix KeyCondition with other filters
2022-08-11 19:20:44 +08:00
Maksim Kita
6bec0f5854
Merge pull request #38956 from vdimir/dict-join-refactoring
...
Join with dictionary refactoring
2022-08-11 11:54:11 +02:00
Vladimir C
2d44e6c458
Merge pull request #39343 from vdimir/refactor-prepared-sets
...
Refactor PreparedSets/SubqueryForSet
2022-08-11 11:19:18 +02:00
Vladimir Chebotaryov
748979a9c0
Merge branch 'master' into betterorderbyoptimization
2022-08-11 11:09:52 +03:00
Duc Canh Le
84cd867aa8
materialize column instead of handling column in hash method
2022-08-11 10:46:06 +08:00
Anton Popov
3fdf428834
Merge pull request #39186 from Avogar/numbers-schema-inference
...
Add new features in schema inference
2022-08-11 00:53:54 +02:00
vdimir
ad91c16ba0
Rename join_common -> JoinUtils
2022-08-10 14:20:28 +00:00
vdimir
b7c5c54181
Fix build
2022-08-10 13:43:55 +00:00
vdimir
5eb4cd39e0
Merge branch 'master' into refactor-prepared-sets
2022-08-10 11:47:49 +00:00
Maksim Kita
aff8149f5c
Merge pull request #39998 from kitaisreal/actions-dag-refactoring
...
ActionsDAG rename index to outputs
2022-08-10 11:44:18 +02:00
Igor Nikonov
754a9fb096
Merge remote-tracking branch 'origin/master' into skipping_sorting_step
2022-08-09 22:20:17 +00:00
Arthur Passos
c4d8ad2222
Add docs
2022-08-09 15:58:46 -03:00
Arthur Passos
e724e7bef6
Update arrow dict to lc comment
2022-08-09 15:52:37 -03:00
Arthur Passos
6eb89fd780
Fix both arrow dict de-serialization and dict of nullable de-serialization
2022-08-09 15:06:22 -03:00
Arthur Passos
be1e32c3f1
Merge branch 'ClickHouse:master' into fix_arrow_column_dictionary_to_ch_lc
2022-08-09 15:04:06 -03:00
Maksim Kita
acbfcf440b
Merge branch 'master' into actions-dag-refactoring
2022-08-09 18:52:08 +02:00
Igor Nikonov
70b52f7cb9
Fix test, review comments
2022-08-09 16:29:56 +00:00
Maksim Kita
a576a55375
Fixed build
2022-08-09 15:03:59 +02:00
Kruglov Pavel
088e8cf9bd
Merge branch 'master' into numbers-schema-inference
2022-08-09 14:00:36 +02:00
Kruglov Pavel
99b9e85a8f
Merge pull request #39646 from Avogar/more-formats
...
Add more Pretty formats
2022-08-09 13:59:47 +02:00
Igor Nikonov
366ead3828
Consider aliases when checking if sorting order is preserved by
...
expression
2022-08-09 11:27:17 +00:00
Igor Nikonov
1439664df6
EXPLAIN tests
2022-08-08 20:46:43 +00:00
Maksim Kita
c030fd05e7
ActionsDAG rename index to outputs
2022-08-08 18:01:32 +02:00