824 Commits

Author SHA1 Message Date
avogar
7fcdb08ec6 Detect header in CSV/TSV/CustomSeparated files automatically 2023-01-05 22:57:25 +00:00
Yakov Olkhovskiy
7a5a36cbed
Merge branch 'master' into refactoring-ip-types 2023-01-04 11:11:06 -05:00
avogar
1f3d75cbf2 Better 2023-01-04 14:58:17 +00:00
Kruglov Pavel
7062054d60
Merge branch 'master' into schema-inference-uint 2023-01-04 14:50:01 +01:00
Nikolai Kochetov
da26f62a9b Fix right offset for reading LowCardinality dictionary from remote fs in case if right mark was in the middle of compressed block. 2023-01-03 18:19:51 +00:00
Alexey Milovidov
e855d3519a
Merge branch 'master' into refactoring-ip-types 2023-01-02 21:58:53 +03:00
avogar
73fecae5ff Fix comments 2023-01-02 15:31:07 +00:00
Kruglov Pavel
0a43976977
Merge branch 'master' into validate-types 2023-01-02 16:10:14 +01:00
Kruglov Pavel
69b9842bc6
Merge branch 'master' into schema-inference-uint 2022-12-30 18:16:00 +01:00
Kruglov Pavel
4982d132fb
Merge branch 'master' into validate-types 2022-12-30 17:52:13 +01:00
Kruglov Pavel
894726bd8f
Merge branch 'master' into improve-streaming-engines 2022-12-29 22:59:45 +01:00
Kruglov Pavel
150a699dda
Merge pull request #44546 from Avogar/better-object-as-string-inference
Improve json object as string inference
2022-12-29 21:58:46 +01:00
avogar
1ce69371fb Infer UInt64 in case of Int64 overflow 2022-12-28 21:46:08 +00:00
Raúl Marín
5de11979ce
Unify query elapsed time measurements (#43455)
* Unify query elapsed time reporting

* add-test: Make shell tests executable

* Add some tests around query elapsed time

* Style and ubsan
2022-12-28 21:01:41 +01:00
avogar
411f98306a Merge branch 'master' of github.com:ClickHouse/ClickHouse into validate-types 2022-12-27 19:24:15 +00:00
Kruglov Pavel
819e7a3008
Merge pull request #44550 from Avogar/better-json-tuples-to-arrays-inference
Improve inferring arrays with nulls in JSON formats
2022-12-27 18:22:13 +01:00
Kruglov Pavel
ac162a2c49
Merge pull request #44522 from Avogar/zero-numbers
Infer numbers starting from zero as strings in TSV
2022-12-27 17:00:10 +01:00
avogar
798c3111ed Improve inferring arrays with nulls in JSON formats 2022-12-24 00:21:48 +00:00
avogar
331f4bfee1 Fix 2022-12-23 19:58:50 +00:00
avogar
f15bf1839a Add missed settings into additional cache info 2022-12-23 19:52:54 +00:00
avogar
8dfe90a6c1 Improve json object as string inference 2022-12-23 19:44:13 +00:00
avogar
123392c996 Fix tests 2022-12-23 14:42:38 +00:00
Vladimir C
7482ea54ab
Merge pull request #43972 from ClickHouse/vdimir/tmp-data-in-fs-cache-2 2022-12-23 11:59:27 +01:00
avogar
f555048ae5 Infer numbers starting from zero as strings in TSV 2022-12-22 21:55:39 +00:00
Dmitry Novik
cff882d506 Merge remote-tracking branch 'origin/master' into refector-function-node 2022-12-22 21:34:29 +00:00
Kruglov Pavel
6a017a6586
Merge pull request #43379 from Avogar/better-capn-proto
Add small improvements in CapnProto format
2022-12-22 14:50:10 +01:00
vdimir
d30d25dbbe
Temporary files evict fs cache 2022-12-22 10:22:49 +00:00
Yakov Olkhovskiy
a8cb29da4b
Merge branch 'master' into refactoring-ip-types 2022-12-21 23:56:24 -05:00
avogar
4ab3e90382 Validate types in table function arguments/CAST function arguments/JSONAsObject schema inference 2022-12-21 21:21:30 +00:00
Kruglov Pavel
5e01a3d74e
Merge branch 'master' into improve-streaming-engines 2022-12-21 10:51:50 +01:00
Kruglov Pavel
09ab5832b1
Merge pull request #44382 from Avogar/fix-bson-object-id
Fix reading ObjectId in BSON schema inference
2022-12-21 10:48:50 +01:00
Dmitry Novik
4793412887
Merge branch 'master' into refector-function-node 2022-12-20 18:26:19 +01:00
Kruglov Pavel
c0b17ca0af
Merge branch 'master' into fix-bson-object-id 2022-12-20 17:18:10 +01:00
avogar
21cdf6e6ae Fix reading ObjectId in BSON schema inference 2022-12-19 14:13:42 +00:00
Dmitry Novik
875a24a650 Merge remote-tracking branch 'origin/master' into refector-function-node 2022-12-16 16:07:30 +00:00
avogar
4a51bdce86 Fix comments 2022-12-16 13:58:54 +00:00
Kruglov Pavel
3fad5c7f1f
Merge branch 'master' into refactor-schema-inference 2022-12-16 14:24:51 +01:00
avogar
cfcb444699 Merge branch 'master' of github.com:ClickHouse/ClickHouse into better-capn-proto 2022-12-15 20:04:43 +00:00
Kruglov Pavel
25f199dd89
Merge pull request #43332 from Avogar/csv-custom-delimiter
Improve reading CSV field in CustomSeparated/Template format
2022-12-15 21:03:29 +01:00
Kruglov Pavel
c5b2e4cc23
Merge branch 'master' into improve-streaming-engines 2022-12-15 18:44:35 +01:00
avogar
f19afbc03e Fix fasttest 2022-12-13 12:59:27 +00:00
avogar
739ad23b1f Make better, fix bugs, improve error messages 2022-12-12 22:00:45 +00:00
Dmitry Novik
3d2fccab87
Merge branch 'master' into refector-function-node 2022-12-12 21:36:39 +01:00
avogar
f3e37c2c9b Merge branch 'refactor-schema-inference' of github.com:Avogar/ClickHouse into refactor-schema-inference 2022-12-12 14:47:04 +00:00
Kruglov Pavel
a03549df28
Apply suggestions from code review
Co-authored-by: Antonio Andelic <antonio2368@users.noreply.github.com>
2022-12-12 15:46:03 +01:00
avogar
cd4fa00d2c Merge branch 'master' of github.com:ClickHouse/ClickHouse into refactor-schema-inference 2022-12-09 14:45:10 +00:00
avogar
1ec5f8451b Merge branch 'master' of github.com:ClickHouse/ClickHouse into csv-custom-delimiter 2022-12-08 19:17:42 +00:00
avogar
d0f9bb2ec2 Allow to parse JSON objects into Strings 2022-12-08 18:58:18 +00:00
Yakov Olkhovskiy
0641066183
Merge branch 'master' into refactoring-ip-types 2022-12-08 11:12:05 -05:00
Kruglov Pavel
26ed850b2d
Fix typo 2022-12-07 23:00:11 +01:00
Yakov Olkhovskiy
bf9194f405 review suggestions 2022-12-07 21:29:17 +00:00
avogar
7375a7d429 Refactor and improve schema inference for text formats 2022-12-07 21:19:27 +00:00
Dmitry Novik
15b8c48ca9 Cleanup code 2022-12-02 19:15:26 +00:00
Dmitry Novik
2c70dbc76a Refactor FunctionNode 2022-12-02 19:15:26 +00:00
Vladimir C
7d6950d397
Revert "Temporary files evict fs cache" 2022-12-02 14:50:56 +01:00
Kruglov Pavel
c35b2a6495
Add a limit for string size in RowBinary format (#43842) 2022-12-02 13:57:11 +01:00
vdimir
816af3dc16
wip: temporary files evict fs cache 2022-12-01 11:49:25 +00:00
vdimir
98fe3c6c02
Temporary files evict fs cache 2022-12-01 11:49:17 +00:00
Anton Popov
fe5fff0347
Merge pull request #43329 from xiedeyantu/support_nested_column
s3 table function can support select nested column using {column_name}.{subcolumn_name}
2022-11-29 22:27:19 +01:00
Yakov Olkhovskiy
770b520ded
Merge branch 'master' into refactoring-ip-types 2022-11-28 08:50:19 -05:00
xiedeyantu
304b6ebf3a s3 table function can support select nested column using {column_name}.{subcolumn_name} 2022-11-23 23:36:12 +08:00
Kruglov Pavel
98d6b96c82
Merge pull request #42033 from mark-polokhov/BSONEachRow
Add BSONEachRow input/output format
2022-11-22 14:45:21 +01:00
Kruglov Pavel
49eed2a07c
Merge branch 'master' into better-capn-proto 2022-11-22 14:11:53 +01:00
avogar
db8126f9c5 Merge branch 'master' of github.com:ClickHouse/ClickHouse into csv-custom-delimiter 2022-11-21 13:49:14 +00:00
avogar
37e14dc091 Fix tests 2022-11-21 13:46:15 +00:00
avogar
ecdeff622b Add small improvements in CapnProto format 2022-11-18 20:13:00 +00:00
avogar
fcfdd73d17 Improve reading CSV field in CustomSeparated/Template format 2022-11-17 15:36:56 +00:00
Vitaly Baranov
ce81166c7e Fix style. 2022-11-16 01:35:11 +01:00
Yakov Olkhovskiy
813cb7fb0d merge master 2022-11-15 22:46:05 +00:00
avogar
4d993e653a Fix build and style 2022-11-15 13:06:24 +00:00
avogar
842d25c358 Minor improvements, better docs 2022-11-14 20:05:01 +00:00
Vitaly Baranov
8e99f5fea3 Move maskSensitiveInfoInQueryForLogging() to src/Parsers/ 2022-11-14 18:55:19 +01:00
Yakov Olkhovskiy
9aeebf3bdf
Merge branch 'master' into refactoring-ip-types 2022-11-14 09:21:54 -05:00
Yakov Olkhovskiy
4d144be39c replace domain IP types (IPv4, IPv6) with native 2022-11-14 14:17:17 +00:00
avogar
564d83bbc7 Better handle uint64 2022-11-11 13:24:12 +00:00
avogar
88636b0f5b Fix style 2022-11-11 12:41:16 +00:00
avogar
4f05045726 Fix build 2022-11-11 11:41:14 +00:00
avogar
cd36caf013 Fix style 2022-11-10 20:37:24 +00:00
avogar
9e89af28c6 Refactor BSONEachRow format, fix bugs, support more data types, support parallel parsing and schema inference 2022-11-10 20:15:14 +00:00
Kruglov Pavel
b124875257
Merge branch 'master' into improve-streaming-engines 2022-11-03 13:22:06 +01:00
avogar
7cc87679e4 Merge branch 'master' of github.com:ClickHouse/ClickHouse into BSONEachRow 2022-11-02 19:47:42 +00:00
Mark Polokhov
2fff4887ac Add BSON input/output format 2022-11-02 19:39:14 +00:00
avogar
774a86021f Fix datetime schema inference in case of empty string 2022-11-02 19:18:34 +00:00
Kruglov Pavel
38124b6533
Merge pull request #42780 from Avogar/parallel-parsing
Support parallel parsing for LineAsString input format
2022-11-02 13:21:53 +01:00
Anton Popov
876dca48da
Merge pull request #36969 from CurtizJ/dynamic-columns-14
Support `Object` type inside other types
2022-11-01 15:20:02 +01:00
Anton Popov
2ae3cfa9e0
Merge branch 'master' into dynamic-columns-14 2022-10-31 16:15:19 +01:00
avogar
fe0aea2e3a Support parallel parsing for LineAsString input format 2022-10-28 21:56:09 +00:00
Kruglov Pavel
781a27edb3
Remove write callback defenition 2022-10-28 19:46:52 +02:00
avogar
8e13d1f1ec Improve and refactor Kafka/StorageMQ/NATS and data formats 2022-10-28 16:41:10 +00:00
Kruglov Pavel
e099817449
Merge branch 'master' into Avogar-patch-3 2022-10-27 12:46:18 +02:00
Azat Khuzhin
56bc85746f Merge remote-tracking branch 'upstream/master' into build/shorten-64-to-32
Conflicts:
- src/Interpreters/ProcessList.cpp
2022-10-22 16:49:08 +02:00
Azat Khuzhin
4e76629aaf Fixes for -Wshorten-64-to-32
- lots of static_cast
- add safe_cast
- types adjustments
  - config
  - IStorage::read/watch
  - ...
- some TODO's (to convert types in future)

P.S. That was quite a journey...

v2: fixes after rebase
v3: fix conflicts after #42308 merged
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-21 13:25:19 +02:00
Alexey Milovidov
ff26251477 Merge branch 'master' into fix-race-condition-finish-cancel 2022-10-21 04:14:21 +02:00
Kruglov Pavel
867bcdbb1c
Fix typo in setting name that led to bad usage of schema inference cache 2022-10-20 16:46:25 +02:00
Alexander Tokmakov
68c18abfbb
Merge pull request #42406 from ClickHouse/template_format_better_error
Better error message for unsupported delimiters in custom formats
2022-10-20 15:52:08 +03:00
Alexey Milovidov
dfa202a15d Merge branch 'master' into fix-race-condition-finish-cancel 2022-10-19 02:35:42 +02:00
Kruglov Pavel
25e13bdd2f
Merge pull request #41107 from Avogar/improve-combinators
Support all combinators combination in WindowTransform/arratReduce*/initializeAggregation/aggregate functions versioning
2022-10-18 15:24:49 +02:00
Kruglov Pavel
8af95a6fc2
Merge pull request #41912 from Avogar/better-datetime-inference
Improve DateTime type inference for text formats
2022-10-18 15:23:59 +02:00
Alexander Tokmakov
fffecbb9ad better error message for unsupported delimiters in custom formats 2022-10-17 18:08:52 +02:00
Alexey Milovidov
f88ed8195b Fix trash 2022-10-17 04:21:08 +02:00
Kruglov Pavel
6fc12dd922
Merge pull request #41703 from Avogar/json-object-each-row
Add setting to obtain object name as column value in JSONObjectEachRow format
2022-10-14 20:11:04 +02:00
avogar
52427e6028 Remove code duplication 2022-10-14 18:07:02 +00:00
Kruglov Pavel
ff11904850
Merge branch 'master' into improve-combinators 2022-10-14 17:19:31 +02:00
Alexander Tokmakov
4175f8cde6 abort instead of __builtin_unreachable in debug builds 2022-10-07 21:49:08 +02:00
Anton Popov
6e61cf92f5 Merge remote-tracking branch 'upstream/master' into HEAD 2022-10-03 13:16:57 +00:00
Robert Schulze
db5ef7b3cb
Merge branch 'master' into generated-file-cleanup 2022-10-02 23:13:18 +02:00
Vitaly Baranov
f65d3ff95a Fix parallel parsing: segmentator now checks max_block_size. 2022-09-30 22:34:03 +02:00
Robert Schulze
cc92a2d174
Merge branch 'master' into generated-file-cleanup 2022-09-30 09:56:31 +02:00
vdimir
0f1a7c252d
better TemporaryDataOnDisk 2022-09-29 09:51:46 +00:00
vdimir
efe0f99658
Fix reading block info in NativeReader with header in ctor 2022-09-29 09:51:44 +00:00
vdimir
ac39bbb3f1
[wip] Common interface for temporary data on disk 2022-09-29 09:51:40 +00:00
avogar
e16cfd361b Improve DateTime type inference for text formats 2022-09-28 16:55:42 +00:00
Robert Schulze
09c62f6728
Consolidate config_formats.h into config.h
Less duplication, less confusion ...
2022-09-28 12:59:05 +00:00
Kruglov Pavel
6340369c2a
Merge branch 'master' into improve-combinators 2022-09-28 14:55:30 +02:00
Robert Schulze
78fc36ca49
Generate config.h into ${CONFIG_INCLUDE_PATH}
This makes the target location consistent with other auto-generated
files like config_formats.h, config_core.h, and config_functions.h and
simplifies the build of clickhouse_common.
2022-09-28 12:48:26 +00:00
Kruglov Pavel
f1ac2d66be
Merge branch 'master' into json-object-each-row 2022-09-28 14:15:02 +02:00
Robert Schulze
06507c40de
${ConfigIncludePath} --> ${CONFIG_INCLUDE_PATH} 2022-09-28 08:28:47 +00:00
Robert Schulze
1885bb0524
Make comment consistent accross generated files 2022-09-28 08:11:09 +00:00
avogar
76be0d2ee1 Infer Object type only when allow_experimental_object_type is enabled 2022-09-27 23:07:36 +00:00
Kruglov Pavel
3dc54272ed
Merge branch 'master' into improve-combinators 2022-09-26 13:03:32 +02:00
avogar
d3d06251a3 Add setting to obtain object name as column value in JSONObjectEachRow format 2022-09-22 16:48:54 +00:00
Alexey Milovidov
45afacdae4
Merge pull request #41186 from ClickHouse/fix-three-fourth-of-trash
Fix more than half of the trash
2022-09-22 07:28:26 +03:00
Kruglov Pavel
22e11aef2d
Merge pull request #40910 from Avogar/new-json-formats
Add new JSON formats, add improvements and refactoring
2022-09-21 14:19:08 +02:00
avogar
868ce8bc16 Fix comments, make better naming, add docs, add setting output_format_json_quote_64bit_floats 2022-09-20 13:49:17 +00:00
Kruglov Pavel
47f6f09ce0
Merge branch 'master' into improve-combinators 2022-09-19 14:31:12 +02:00
Alexey Milovidov
84f42e0874 Fix 3/4 of trash 2022-09-19 08:50:53 +02:00
Alexey Milovidov
2f0684b97c Fix trash in schema inference 2022-09-17 23:11:33 +02:00
Alexey Milovidov
47167494d9 Fix trash in schema inference 2022-09-17 22:53:41 +02:00
avogar
0101cc2e56 Support complex combinators in window transform, arrayReduce*, initializeAggregation and Aggregate functons versionning 2022-09-16 19:07:36 +00:00
Alexey Milovidov
da01982652
Merge pull request #41046 from azat/build/llvm-15
Switch to llvm/clang 15
2022-09-16 07:31:06 +03:00
Azat Khuzhin
e8d7403a38 Suppress warning in FormatFactory::getFormatFromFileDescriptor() for FreeBSD
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-09-10 21:38:35 +02:00
zhenjial
bd9fabc3f7 code optimization, add test 2022-09-09 23:27:42 +08:00
avogar
ad68b7be0f Better 2022-09-09 15:01:45 +00:00
avogar
46a0318a36 Support JSONColumnsWithMetadata input format 2022-09-08 17:58:44 +00:00
zhenjial
469ceaa156 code optimization 2022-09-09 00:47:43 +08:00
avogar
c380decbbb Make better, add new settings 2022-09-08 16:07:20 +00:00
Anton Popov
86b29b7f1a fix serilization of Object inside other types 2022-09-08 15:16:39 +00:00
zhenjial
0f788d98f5 new implementation 2022-09-06 20:39:54 +08:00
avogar
b94e896c1c Remove logs 2022-09-01 19:01:27 +00:00
avogar
afc34dca41 Add new JSON formats, add improvements and refactoring 2022-09-01 19:00:24 +00:00
avogar
acf87c1d10 Fix nested JSON Objects schema inference 2022-08-31 14:10:29 +00:00
vdimir
0349c85017
Use getCompressedBytes in BufferingToFileTransform and TemporaryFileStream 2022-08-24 16:14:10 +00:00
vdimir
51c44424cc
More metrics for temp files 2022-08-24 16:14:09 +00:00
avogar
29a887578b Fix 2022-08-23 11:42:57 +00:00
avogar
5ab87f1da4 Small refactoring 2022-08-19 16:42:23 +00:00
avogar
612ffaffde Make schema inference cache better, respect format settings that can change the schema 2022-08-19 16:39:13 +00:00
Nikolai Kochetov
5a85531ef7
Merge pull request #38286 from Avogar/schema-inference-cache
Add schema inference cache for s3/hdfs/file/url
2022-08-18 13:07:50 +02:00
avogar
8dd54c043d Merge branch 'master' of github.com:ClickHouse/ClickHouse into schema-inference-cache 2022-08-17 11:47:40 +00:00
avogar
e1ff996ec3 Allow to specify structure hints in schema inference 2022-08-16 09:46:57 +00:00
Kruglov Pavel
088e8cf9bd
Merge branch 'master' into numbers-schema-inference 2022-08-09 14:00:36 +02:00