Commit Graph

1253 Commits

Author SHA1 Message Date
Blargian
37e03ef320 Modify pretty formats to display column names in the footer when row count is large 2024-06-12 07:52:50 +02:00
Blargian
5aa9389f85 Add failing test, setting and documentation 2024-06-11 15:13:36 +02:00
LiuNeng
0ca96559c2
Merge branch 'master' into adapting-parquet-block-size 2024-06-06 11:14:01 +08:00
Robert Schulze
ec3b82ba63
Merge pull request #64606 from rschu1ze/map-stuff
Double-checking #59318 and docs for `Map`
2024-06-05 07:56:29 +00:00
Azat Khuzhin
918d3849e1 Simplify logic for input_format_try_infer_integers
Now, when we can be sure that it is a float, parse it as a float, and
fall back to int/uint afterwards.

But note, that this would break something if tryReadFloat() !=
tryReadIntText() + parsing of '.'/'e', but for now, it is true.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-06-03 12:09:47 +02:00
Amos Bird
b2d6610d5f
Support empty tuple. 2024-06-03 16:05:42 +08:00
Azat Khuzhin
5246c56a2a Fix type inference for float (in case of small buffer)
In case of a small buffer (e.g. --max_read_buffer_size 1), pos() will
always point to this one byte, so comparing pos() will always
evaluate to true.

And we cannot use count() as well, since in case of big buffer it will
be the same, plus, in case of reading extra byte for checking for '.'
the count() will be different, but it does not mean that the byte had
been interpreted (and allowing 1 byte of difference will not work almost
always, since it will read max_read_buffer_size bytes).

So instead, expose the has_fractional flag from the read helpers for
float, via two new methods:
- tryReadFloatTextExt
- tryReadFloatTextExtNoExponent

Where "ext" stands for "extended", which means expose extra information.

v2: consider number as float if it has '.' or 'e' (previously only if it
has some signs after those two it had been considered as float)

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-06-01 16:34:55 +02:00
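
The idea described in the two type-inference commits above can be illustrated with a minimal sketch. It is not the ClickHouse implementation: the Python function names, the regular expression, and the returned type names are assumptions chosen for illustration. It shows only the principle that the reader reports a has_fractional flag, so the caller does not have to compare buffer positions or byte counts, and that a token containing '.' or 'e' is treated as a float while everything else falls back to int/uint.

```python
import re

# Hypothetical token grammar: optional sign, digits with an optional '.',
# and an optional exponent. The real parser's grammar may differ.
_NUMBER = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?", re.ASCII)


def try_read_float_ext(text):
    """Return (value, has_fractional), or None if text is not a number.

    has_fractional is derived from the token itself ('.' or an exponent
    marker is present), not from how many bytes the reader consumed.
    """
    if not _NUMBER.fullmatch(text):
        return None
    has_fractional = any(c in text for c in ".eE")
    return float(text), has_fractional


def infer_number_type(text):
    """Pick a type name for a token: float first, then int, then uint."""
    parsed = try_read_float_ext(text)
    if parsed is None:
        return "String"
    _, has_fractional = parsed
    if has_fractional:
        return "Float64"
    value = int(text)
    if -2**63 <= value < 2**63:
        return "Int64"
    if 0 <= value < 2**64:
        return "UInt64"
    # Too large for 64-bit integers; this sketch keeps it as a float.
    return "Float64"
```

Under these assumptions, `infer_number_type("1.")` yields `"Float64"` because the token contains a '.', even though no digits follow it, which mirrors the v2 note in the commit message.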
Robert Schulze
8fc358f427
Merge remote-tracking branch 'rschu1ze/master' into map-stuff 2024-05-31 11:19:57 +00:00
Robert Schulze
b0c955e9c9
Various stuff 2024-05-29 20:51:48 +00:00
Robert Schulze
18d432f44c
Reapply "Remove some unnecessary UNREACHABLEs"
This reverts commit 5a868304c0.
2024-05-29 13:37:47 +00:00
liuneng
3bd3717d34 revert setting rename 2024-05-29 10:24:42 +08:00
liuneng
b4c2fa7e27 add test case 2024-05-28 15:17:08 +08:00
Alexander Tokmakov
5a868304c0
Revert "Remove some unnecessary UNREACHABLEs" 2024-05-27 11:38:22 +02:00
liuneng
b30d11f046 adapting parquet reader output block rows 2024-05-27 16:21:31 +08:00
Robert Schulze
7a552f5b06
Merge pull request #64035 from rschu1ze/unreachable-unreachable
Remove some unnecessary `UNREACHABLE`s
2024-05-26 20:37:17 +00:00
Michael Kolupaev
ee3e7f2fd0
Merge pull request #60361 from copperybean/gcmaster-parquet
A native parquet reader for primitive types
2024-05-24 04:50:12 +00:00
Kruglov Pavel
30dce7821c
Merge pull request #63058 from Avogar/dynamic-data-type
Implement Dynamic data type
2024-05-23 14:19:46 +00:00
Robert Schulze
f792a602da
Merge remote-tracking branch 'rschu1ze/master' into unreachable-unreachable 2024-05-22 21:08:27 +00:00
ZhiHong Zhang
3d7befef4f
Merge branch 'master' into gcmaster-parquet 2024-05-22 23:31:00 +08:00
Kruglov Pavel
fddedee9a9
Merge pull request #59747 from Blargian/56257_parse_crlf_with_TSV_files
Parse CRLF with TSV files
2024-05-22 13:45:07 +00:00
Kruglov Pavel
4989109e13
Merge pull request #63662 from v01dXYZ/63496-compression-from-file-descriptor
Compress STDOUT if redirected to file with a compression extension
2024-05-22 13:37:05 +00:00
Robert Schulze
0d3aeddc93
Merge remote-tracking branch 'rschu1ze/master' into unreachable-unreachable 2024-05-22 07:25:00 +00:00
ZhiHong Zhang
4b1c9adb3a
Merge branch 'ClickHouse:master' into gcmaster-parquet 2024-05-22 09:33:01 +08:00
avogar
6bba847b7d Merge branch 'master' of github.com:ClickHouse/ClickHouse into dynamic-data-type 2024-05-21 09:08:24 +00:00
Robert Schulze
2909e6451b
Move StringUtils.h/cpp back to Common/ 2024-05-19 09:39:36 +00:00
Robert Schulze
9969f9cf30
Merge remote-tracking branch 'rschu1ze/master' into unreachable-unreachable 2024-05-19 08:26:39 +00:00
Kruglov Pavel
c861ac4858
Merge branch 'master' into dynamic-data-type 2024-05-17 22:17:41 +02:00
Robert Schulze
53e992af4f
Remove some unnecessary UNREACHABLEs 2024-05-17 11:46:07 +00:00
Raúl Marín
7e429482fc Revert "Merge pull request #63479 from yariks5s/add_setting_from_multiline_strings"
This reverts commit 962d5e5bda, reversing
changes made to 8c4a5d3663.
2024-05-16 12:55:26 +02:00
Kruglov Pavel
4cfe2665de
Update src/Formats/FormatSettings.h 2024-05-15 20:28:17 +02:00
Kruglov Pavel
413be14c43
Merge branch 'master' into dynamic-data-type 2024-05-15 13:43:04 +02:00
Shaun Struwig
47ab2e2dc5
Merge branch 'ClickHouse:master' into 56257_parse_crlf_with_TSV_files 2024-05-15 05:28:18 +02:00
avogar
a7e87e22ad Merge branch 'master' of github.com:ClickHouse/ClickHouse into 56257_parse_crlf_with_TSV_files 2024-05-14 11:56:43 +00:00
豪肥肥
72fa329808
Merge branch 'ClickHouse:master' into output_format_npy 2024-05-14 11:07:46 +08:00
v01dxyz
8e63d2f795 Compress STDOUT if redirected to file with a compression extension
* Add a new member to ClientBase: default_output_compression_method
* Move the code to get file path from file descriptor to a separate
  Common function.

The stateless test is almost a copy-paste of 02001_compress_output_file.

Fixes https://github.com/ClickHouse/ClickHouse/issues/63496
2024-05-13 09:21:01 +02:00
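
The behaviour described in the commit above (resolve the file behind STDOUT, then pick a compression method from its extension) might be sketched as follows. This is an illustration under stated assumptions, not the client's code: the extension table, the function names, and the Linux-only /proc lookup are all hypothetical stand-ins.

```python
import os

# Hypothetical extension table; the real client supports its own set.
_METHOD_BY_EXTENSION = {
    ".gz": "gzip",
    ".zst": "zstd",
    ".xz": "xz",
    ".lz4": "lz4",
    ".bz2": "bz2",
}


def path_from_fd(fd):
    """Best-effort path behind a file descriptor (Linux-only /proc lookup)."""
    try:
        return os.readlink(f"/proc/self/fd/{fd}")
    except OSError:
        return None


def compression_method_for(path):
    """Map a file path to a compression method name, or None if unknown."""
    if path is None:
        return None
    _, ext = os.path.splitext(path)
    return _METHOD_BY_EXTENSION.get(ext.lower())


def default_output_compression_method(stdout_fd=1):
    """Compression to apply when STDOUT is redirected to a file."""
    return compression_method_for(path_from_fd(stdout_fd))
```

When STDOUT is a terminal or a pipe, the resolved target has no recognised extension, so this sketch returns `None` and the output would be left uncompressed.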
copperybean
dbdff6c038 support reading simple types by native parquet reader
Change-Id: I38b8368b022263d9a71cb3f3e9fdad5d6ca26753
2024-05-11 15:51:58 +08:00
Kruglov Pavel
6207f6f4a5
Merge branch 'master' into dynamic-data-type 2024-05-10 13:42:38 +02:00
Alexey Milovidov
dd58af7d4f Merge branch 'master' of github.com:ClickHouse/ClickHouse into clang-18-ci 2024-05-10 07:17:39 +02:00
Alexey Milovidov
a6543a1c48 Useless changes 2024-05-10 05:07:37 +02:00
Alexey Milovidov
426a51b624 Useless changes 2024-05-10 04:53:29 +02:00
Alexey Milovidov
42710158e4 Merge branch 'master' into clang-18-ci 2024-05-10 02:38:20 +02:00
Michael Kolupaev
1b43c58489
Merge pull request #62087 from ClickHouse/checkmate
Avoid crashing on column type mismatch in a few dozen places
2024-05-09 23:59:59 +00:00
Alexey Milovidov
95f12ef274 Useless changes 2024-05-09 01:08:33 +02:00
Yarik Briukhovetskyi
cd3a60bfb4
Merge branch 'master' into add_setting_from_multiline_strings 2024-05-08 12:21:08 +02:00
豪肥肥
d6e171e1a4
Merge branch 'master' into output_format_npy 2024-05-08 12:20:10 +08:00
Constantine Peresypkin
07472b3e95 Add setting to force NULL for omitted fields
Fixes #60884
2024-05-07 11:28:44 -04:00
yariks5s
5117422c7b init 2024-05-07 14:48:50 +00:00
Michael Kolupaev
d14fc62d4d Avoid crashing on column type mismatch in a few dozen places 2024-05-06 22:09:02 +00:00
Shaun Struwig
3316df88aa
Merge branch 'ClickHouse:master' into 59557_form_input_format 2024-05-05 05:30:40 +02:00
Shaun Struwig
2b4def9ebb
Merge branch 'ClickHouse:master' into 56257_parse_crlf_with_TSV_files 2024-05-05 05:30:26 +02:00
Alexey Milovidov
989a880230
Merge pull request #62404 from Avogar/trivial-insert-select-from-files
Improve trivial insert select from files, add separate max_parsing_threads setting
2024-04-30 01:57:56 +02:00
Shaun Struwig
a69658b1dd
Merge branch 'ClickHouse:master' into 59557_form_input_format 2024-04-27 21:40:58 +02:00
avogar
ff12caf2e9 Merge branch 'master' of github.com:ClickHouse/ClickHouse into dynamic-data-type 2024-04-26 11:08:04 +00:00
avogar
69a3aa7bcf Implement Dynamic data type 2024-04-26 11:02:33 +00:00
HowePa
5e8bc4402a unified NumpyDataTypes 2024-04-25 15:52:30 +08:00
kevinyhzou
7c9dbdbd9c Improve json read by ignore key case 2024-04-25 12:21:32 +08:00
Kruglov Pavel
52e3c3aa4e
Merge branch 'master' into 56257_parse_crlf_with_TSV_files 2024-04-24 16:20:19 +01:00
Raúl Marín
0d06d69377 Fix parsing of nested proto messages 2024-04-24 13:32:13 +02:00
lgbo-ustc
d4773ef1bb Merge remote-tracking branch 'origin/master' into json_format_early_skip 2024-04-17 08:56:54 +08:00
Robert Schulze
3c35f14804
Merge remote-tracking branch 'ClickHouse/master' into mkmkme/protobuf-25.1 2024-04-15 12:38:59 +00:00
Kruglov Pavel
ce7432424e
Merge branch 'master' into trivial-insert-select-from-files 2024-04-12 14:26:48 +02:00
Alexander Tokmakov
b5ff1c0a6e
Merge branch 'master' into cannot_allocate_thread 2024-04-12 13:35:14 +02:00
lgbo-ustc
e9635189d2 Merge remote-tracking branch 'origin/master' into json_format_early_skip 2024-04-12 08:53:38 +08:00
lgbo-ustc
31a3217355 update settings 2024-04-12 08:52:28 +08:00
lgbo-ustc
a87fb7dc84 Merge remote-tracking branch 'origin/master' into json_format_early_skip 2024-04-11 14:37:12 +08:00
lgbo-ustc
64e47cca9a add settings 2024-04-11 14:36:25 +08:00
Alexander Tokmakov
d8e97b51bf Merge branch 'master' into cannot_allocate_thread 2024-04-10 21:21:42 +02:00
Raúl Marín
d6260e984c Avoid crash when reading protobuf with recursive types 2024-04-10 19:46:52 +02:00
Kruglov Pavel
7a3bfb31e8
Merge pull request #62086 from KevinyhZou/improve_hive_text_read_by_replace_settings
Improve hive text read by allow variable number of fields
2024-04-10 12:49:59 +00:00
HowePa
c0174fa17e [feature] add npy output format 2024-04-09 14:30:14 +08:00
Shaun Struwig
971263c4e6
Merge branch 'ClickHouse:master' into 56257_parse_crlf_with_TSV_files 2024-04-08 21:52:47 +02:00
Shaun Struwig
dde99cbe8c
Merge branch 'ClickHouse:master' into 59557_form_input_format 2024-04-08 21:51:57 +02:00
Kruglov Pavel
c2d432be20
Merge branch 'master' into trivial-insert-select-from-files 2024-04-08 15:57:28 +02:00
avogar
ed6e4fbe16 Improve trivial insert select from files, add separate max_parsing_threads setting 2024-04-08 13:56:15 +00:00
Alexander Tokmakov
5db9fbed52 cancel tasks on exception 2024-04-04 22:32:57 +02:00
Shaun Struwig
05b2cfb563
Merge branch 'ClickHouse:master' into 59557_form_input_format 2024-04-04 14:26:42 +02:00
Raúl Marín
276246ee97 Introduce IAggregateFunction_fwd to reduce header dependencies 2024-04-04 12:29:54 +02:00
Kruglov Pavel
05db73f518
Merge branch 'master' into 56257_parse_crlf_with_TSV_files 2024-04-03 17:17:44 +02:00
Mikhail Koviazin
e7a664e9df
Merge remote-tracking branch 'upstream/master' into mkmkme/protobuf-25.1 2024-04-03 13:51:37 +03:00
Raúl Marín
c35a436435 Remove nested dependency on DateLutImpl 2024-04-02 14:45:48 +02:00
kevinyhzou
6018434f82 add config input_format_hive_text_allow_variable_number_of_columns 2024-04-02 19:37:23 +08:00
Kruglov Pavel
9b5b44dd5f
Merge pull request #61889 from Avogar/allow-to-save-bad-json-escape-sequences
Add a setting to allow saving bad escape sequences in JSON input formats
2024-03-28 14:34:02 +01:00
Yakov Olkhovskiy
257cdd83d4
Merge pull request #60994 from bigo-sg/csv-tuple
fix csv format not support tuple
2024-03-27 09:07:46 -04:00
Shaun Struwig
0e76731c6a
Merge branch 'ClickHouse:master' into 56257_parse_crlf_with_TSV_files 2024-03-27 03:06:51 +01:00
Kruglov Pavel
7220797637
Fix style 2024-03-26 15:26:42 +01:00
avogar
dc87c483dd Add a setting to allow saving bad escape sequences in JSON input formats 2024-03-25 21:58:53 +00:00
Alexey Milovidov
3e5ddddb35 Merge branch 'master' into dont-cut-single-value 2024-03-24 00:51:10 +01:00
Alexey Milovidov
4cbecd0bbd Add a setting 2024-03-23 04:20:52 +01:00
Alexey Milovidov
a2e89c8be7 Fix wrong cases of numbers pretty printing
Add a test

Revert changes from another branch

Add a test

Better test

Revert wrong changes
2024-03-23 03:33:03 +01:00
shuai-xu
9d5cabb26d fix csv format not support tuple 2024-03-22 16:51:58 +08:00
Raúl Marín
de855ca917 Reduce header dependencies 2024-03-19 17:04:29 +01:00
Shaun Struwig
01919f0bd3
Merge branch 'ClickHouse:master' into 56257_parse_crlf_with_TSV_files 2024-03-17 20:32:39 +01:00
Alexey Milovidov
01136bbc3b Limit backtracking in parser 2024-03-17 19:54:45 +01:00
Alexey Milovidov
0a3e42401c Fix fuzzers 2024-03-17 15:44:36 +01:00
avogar
feda83a7c8 Merge branch 'master' of github.com:ClickHouse/ClickHouse into 56257_parse_crlf_with_TSV_files 2024-03-14 17:44:38 +00:00
Shaun Struwig
f251a6d262
Merge branch 'ClickHouse:master' into 59557_form_input_format 2024-03-11 18:52:28 +01:00
Raúl Marín
9bada70f45 Remove a bunch of transitive dependencies 2024-03-11 14:52:32 +01:00
Mikhail Koviazin
490efd2efa
Fixes addressing review comments 2024-03-06 14:35:48 +02:00
Blargian
2ad8ab2a57 Fix linker errors 2024-03-05 19:13:20 +01:00
Shaun Struwig
beb0d08bdb
Merge branch 'ClickHouse:master' into 56257_parse_crlf_with_TSV_files 2024-03-05 14:09:01 +01:00