360 Commits

Author SHA1 Message Date
avogar
e2d7c543eb Don't try to infer numbers from strings in JSON formats by default to avoid parsing errors, add docs for setting input_format_json_try_infer_numbers_from_strings 2023-09-28 22:15:26 +00:00
Kruglov Pavel
b6863a9f52 Fix comments 2023-09-26 14:13:34 +02:00
avogar
42ca897f2d Better schema inference for JSON formats 2023-09-25 15:42:59 +00:00
Kruglov Pavel
c68456a20a Merge pull request #52692 from Avogar/variable-number-of-volumns-more-formats
Allow variable number of columns in more formats, make it work with schema inference 2023-08-21 13:28:35 +02:00
Michael Kolupaev
a1522e22ea Merge pull request #53281 from Avogar/batch-small-parquet-row-groups
Optimize reading small row groups by batching them together in Parquet 2023-08-18 17:15:42 -07:00
avogar
bca91548ad Add setting input_format_parquet_local_file_min_bytes_for_seek 2023-08-17 12:28:01 +00:00
avogar
7e863a2726 Address comments 2023-08-11 13:17:49 +00:00
avogar
3ad7e57059 Optimize reading small row groups by batching them together in Parquet 2023-08-11 13:17:45 +00:00
Kruglov Pavel
33a39900ad Merge branch 'master' into variable-number-of-volumns-more-formats 2023-08-09 19:51:17 +02:00
avogar
98435657cb Clean up 2023-08-09 11:28:09 +00:00
avogar
01a7c7560f Add input format One 2023-08-09 11:25:32 +00:00
Anton Popov
ff137773e7 Merge branch 'master' into formats-with-subcolumns 2023-08-02 15:24:56 +02:00
Kruglov Pavel
3e1c409e60 Merge branch 'master' into structure-to-schema 2023-07-28 11:32:16 +02:00
avogar
b9c9933cc9 Fix typo 2023-07-27 18:56:23 +00:00
avogar
67b0993bdf Add documentation 2023-07-27 18:54:41 +00:00
avogar
6d77d52dfe Allow variable number of columns in TSV/CuatomSeprarated/JSONCompactEachRow, make schema inference work with variable number of columns 2023-07-27 18:02:29 +00:00
Kruglov Pavel
0d34e97dbe Merge branch 'master' into formats-with-subcolumns 2023-07-26 13:30:35 +02:00
Kruglov Pavel
342400d0b3 Merge branch 'master' into revert-52322-revert-51716-bug_fix_csv_field_type_not_match 2023-07-20 12:39:38 +02:00
Kruglov Pavel
f0026af189 Revert "Revert "Improve CSVInputFormat to check and set default value to column if deserialize failed"" 2023-07-19 14:51:11 +02:00
Kruglov Pavel
7b3564f96a Revert "Improve CSVInputFormat to check and set default value to column if deserialize failed" 2023-07-19 14:44:59 +02:00
kevinyhzou
94796f28ad ci fix 2023-07-19 19:24:16 +08:00
kevinyhzou
95424177d5 review fix 2023-07-19 18:26:54 +08:00
Kruglov Pavel
1e616e17ab Merge branch 'master' into row-binary-with-defaults 2023-07-17 19:13:57 +02:00
Kruglov Pavel
1dd05319b5 Merge branch 'master' into formats-with-subcolumns 2023-07-17 19:13:42 +02:00
kevinyhzou
355faa4251 ci fix 2023-07-17 20:08:32 +08:00
kevinyhzou
b2665031dc review fix 2023-07-13 20:27:14 +08:00
kevinyhzou
ba57c84db3 bug fix csv input field type mismatch 2023-07-13 20:24:10 +08:00
Dmitry Kardymon
32f5a78302 Fix setting name 2023-07-06 07:32:46 +00:00
Dmitry Kardymon
24b5c9c204 Use one setting input_format_csv_allow_variable_number_of_colums and code in RowInput 2023-07-06 06:05:43 +00:00
Kruglov Pavel
a2805f8f44 Merge branch 'master' into formats-with-subcolumns 2023-07-04 23:27:03 +02:00
avogar
3dc4ff1760 Remove obsolete settings 2023-07-04 21:21:22 +00:00
Dmitry Kardymon
ab4142eb8f Merge remote-tracking branch 'clickhouse/master' into ADQM-870 2023-07-04 08:23:31 +03:00
avogar
34bf0284ad Add RowBinaryWithDefaults format 2023-06-30 16:18:30 +00:00
Nikifor Seriakov
5a39960e03 Update docs/en/interfaces/formats.md
Fixed RawBLOB comparison lists formatting. 2023-06-27 21:32:39 +04:00
Dmitry Kardymon
dbced8a30c Merge remote-tracking branch 'origin/master' into ADQM-870 2023-06-22 19:49:06 +00:00
Dan Roscigno
c856c4a7df Merge branch 'master' into Docs/ip_addresses 2023-06-21 17:26:27 -04:00
Dmitry Kardymon
dd43a186ad Minor edit docs / add int256 test 2023-06-19 09:51:29 +00:00
Dmitry Kardymon
30bea857fd Merge remote-tracking branch 'origin/master' into ADQM-870 2023-06-19 07:19:07 +00:00
Kruglov Pavel
38ed92c8f4 Update Avro format docs 2023-06-16 15:53:29 +02:00
Dmitry Kardymon
806176d88e Add input_format_csv_missing_as_default setting and tests 2023-06-15 11:23:08 +00:00
KevinyhZou
953f40aa3b Merge branch 'master' into bug_fix_csv_parse_by_tab_delimiter 2023-06-15 10:25:19 +08:00
Denny Crane
fd01cb7bec Merge branch 'master' into Docs/ip_addresses 2023-06-14 17:38:48 -03:00
Dmitry Kardymon
a91fc3ddb3 Add docs/ add more cases in test 2023-06-14 16:44:31 +00:00
kevinyhzou
f3b99156ac review fix 2023-06-14 10:48:21 +08:00
kevinyhzou
911f8ad8dc use whitespace or tab as field delimiter 2023-06-12 11:57:52 +08:00
Kruglov Pavel
1baa6404e6 Merge branch 'master' into skip-trailing-empty-lines 2023-06-06 19:39:34 +02:00
avogar
df50833b70 Allow to skip trailing empty lines in CSV/TSV/CustomeSeparated formats 2023-06-06 17:33:05 +00:00
Dan Roscigno
f691fe787b Merge branch 'master' into Docs/ip_addresses 2023-06-06 09:12:31 -04:00
Denny Crane
2cc457141e clean documentation of ip4 ip6 from domains 2023-06-04 15:32:54 -03:00
Alexey Gerasimchuck
75791d7a63 Added input_format_csv_trim_whitespaces parameter 2023-05-25 07:51:32 +00:00