90 Commits

Author SHA1 Message Date
Raúl Marín
de855ca917 Reduce header dependencies 2024-03-19 17:04:29 +01:00
HowePa
dbd8d35f01 use lower case in dict 2024-02-27 00:48:34 +08:00
HowePa
0b72f7b182 Make all format names case insensitive. 2024-02-26 22:46:51 +08:00
avogar
617cc514b7 Try to detect file format automatically during schema inference if it's unknown 2024-01-23 18:59:39 +00:00
kevinyhzou
3adc8fdf78 Fix ci 2023-11-21 11:22:12 +08:00
Michael Kolupaev
ce7eca0615 DWARF input format (#55450)
* Add ReadBufferFromFileBase::isRegularLocalFile()

* DWARF input format

* Review comments

* Changed things around ENABLE_EMBEDDED_COMPILER build setting

* Added 'ranges' column

* no-msan no-ubsan
2023-10-16 17:00:07 -07:00
avogar
b5cccc5f8d Remove unused field 2023-09-11 14:58:02 +00:00
avogar
2d8f33bfa2 Fix parsing error in WithNames formats while reading subset of columns with disabled input_format_with_names_use_header 2023-09-11 14:55:37 +00:00
Michael Kolupaev
2f4d433e69 Parquet filter pushdown 2023-08-21 14:15:52 -07:00
avogar
98aa6b317f Support reading subcolumns from file/s3/hdfs/url/azureBlobStorage table functions 2023-07-04 21:17:26 +00:00
Michael Kolupaev
2498170253 Fix use-after-free in StorageURL when switching URLs 2023-06-22 16:24:12 +00:00
Michael Kolupaev
b51064a508 Get rid of SeekableReadBufferFactory, add SeekableReadBuffer::readBigAt() instead 2023-06-01 18:48:30 -07:00
Michael Kolupaev
3bd1489f18 Propagate input_format_parquet_preserve_order to parallelizeOutputAfterReading() 2023-05-05 04:20:27 +00:00
Michael Kolupaev
eb3b774ad0 Better control over Parquet row group size 2023-05-04 14:59:55 -07:00
Michael Kolupaev
87be78e6de Better 2023-04-17 04:58:32 +00:00
Michael Kolupaev
e133633359 Parallel decoding with one row group per thread 2023-04-17 04:58:32 +00:00
Michael Kolupaev
683077890f Highly questionable refactoring (getInputMultistream() nonsense) 2023-04-17 04:58:32 +00:00
Michael Kolupaev
2d4fe85513 Something 2023-04-17 04:58:32 +00:00
Alexey Milovidov
f33b651686 Add fuzzer for data formats 2023-03-13 04:51:50 +01:00
Kruglov Pavel
c5b2e4cc23 Merge branch 'master' into improve-streaming-engines 2022-12-15 18:44:35 +01:00
xiedeyantu
304b6ebf3a s3 table function can support select nested column using {column_name}.{subcolumn_name} 2022-11-23 23:36:12 +08:00
Kruglov Pavel
781a27edb3 Remove write callback defenition 2022-10-28 19:46:52 +02:00
avogar
8e13d1f1ec Improve and refactor Kafka/StorageMQ/NATS and data formats 2022-10-28 16:41:10 +00:00
Vitaly Baranov
f65d3ff95a Fix parallel parsing: segmentator now checks max_block_size. 2022-09-30 22:34:03 +02:00
avogar
5155262a16 Add some additional information to cache keys 2022-06-27 12:43:24 +00:00
avogar
f782fa31c6 Merge branch 'master' of github.com:ClickHouse/ClickHouse into check-format-on-storage-creation 2022-05-25 08:42:54 +00:00
avogar
37b66c8a9e Check format name on storage creation 2022-05-23 12:48:48 +00:00
avogar
a4cf07708c Fix comments 2022-05-20 14:57:27 +00:00
avogar
68bb07d166 Better naming 2022-05-13 18:39:19 +00:00
avogar
b17fec659a Improve performance and memory usage for select of subset of columns for some formats 2022-05-13 13:51:28 +00:00
Anton Popov
e911900054 remove last mentions of data streams 2022-05-09 19:15:24 +00:00
avogar
1c065f8c7a Some refactoring around schema inference with globs 2022-04-13 17:02:48 +00:00
avogar
557edbd172 Add some improvements and fixes in schema inference 2022-03-24 12:54:12 +00:00
Kruglov Pavel
7873b4475f Merge branch 'master' into autodetect-format 2022-01-25 10:56:52 +03:00
avogar
a6740d2f9a Detect format and schema for stdin in clickhouse-local 2022-01-25 10:25:37 +03:00
avogar
1f49acc164 Better naming 2022-01-24 16:28:36 +03:00
Kruglov Pavel
a7df9cd53a Merge branch 'master' into formats-with-suffixes 2022-01-14 21:03:49 +03:00
avogar
89a181bd19 Make better 2022-01-14 18:16:18 +03:00
Kruglov Pavel
5a908e8edd Merge branch 'master' into formats-with-suffixes 2022-01-14 16:45:20 +03:00
avogar
2d7b1bfa5e Detect format in S3/HDFS/URL table engines 2022-01-13 16:14:18 +03:00
zhongyuankai
878e44eb97 auto format by file extension 2022-01-08 21:47:14 +08:00
avogar
97788b9c21 Allow to create new files on insert for File/S3/HDFS engines 2021-12-29 21:19:13 +03:00
avogar
8112a71233 Implement schema inference for most input formats 2021-12-29 12:18:56 +03:00
frank chen
898db5b468 Resolve review comments
Signed-off-by: frank chen <frank.chen021@outlook.com>
2021-12-03 19:47:05 +08:00
cgp
18504f545a move InputCreatorFunc to InputCreator 2021-11-12 00:34:59 +08:00
avogar
872cca550a Make better 2021-10-20 15:47:20 +03:00
mergify[bot]
0a4360c43e Merge branch 'master' into tsv-csv 2021-10-20 11:57:06 +00:00
avogar
7007286088 Fix WithNamesAndTypes parallel parsing, add new tests, small refactoring 2021-10-20 14:48:54 +03:00
Nikolai Kochetov
a92dc0a826 Update obsolete comments. 2021-10-19 12:58:10 +03:00
Azat Khuzhin
50231460af Use forward declaration for Buffer<> in generic headers
- changes in ReadHelpers.h -- recompiles 1000 modules
- changes in FormatFactor.h -- recompiles 100 modules
2021-10-16 12:03:24 +03:00