Alexey Milovidov | 9ae685975e | Merge branch 'master' into avro-fix | 2023-07-22 04:56:48 +03:00
Alexander Tokmakov | b45c2c939b | disable expression templates for time intervals (#52335) | 2023-07-21 15:17:07 +03:00
ltrk2 | 90a2c460c6 | Merge branch 'master' into feature/mergetree-checksum-big-endian-support | 2023-07-21 08:07:18 -04:00
Kruglov Pavel | 342400d0b3 | Merge branch 'master' into revert-52322-revert-51716-bug_fix_csv_field_type_not_match | 2023-07-20 12:39:38 +02:00
Nikolay Degterinsky | 209429d0e3 | Merge pull request #49664 from ilejn/test_for_basic_auth_registry; Basic auth to fetch Avro schema in Kafka | 2023-07-20 10:58:11 +02:00
李扬 | 68bf4c3590 | Merge branch 'master' into comment_improve_ch_to_arrow | 2023-07-20 10:10:47 +08:00
ltrk2 | a753c3c6ad | Merge branch 'master' into feature/mergetree-checksum-big-endian-support | 2023-07-19 16:22:58 -04:00
Kruglov Pavel | 0fca64ced4 | Merge pull request #51695 from Avogar/row-binary-with-defaults; Add RowBinaryWithDefaults format | 2023-07-19 22:10:30 +02:00
ltrk2 | ba4072f049 | Adapt changes around SipHash | 2023-07-19 10:01:58 -07:00
ltrk2 | 51e2c58a53 | Implement endianness-independent SipHash and MergeTree checksum serialization | 2023-07-19 10:01:55 -07:00
Kruglov Pavel | f0026af189 | Revert "Revert "Improve CSVInputFormat to check and set default value to column if deserialize failed"" | 2023-07-19 14:51:11 +02:00
Kruglov Pavel | 7b3564f96a | Revert "Improve CSVInputFormat to check and set default value to column if deserialize failed" | 2023-07-19 14:44:59 +02:00
robot-ch-test-poll4 | 63d0616a22 | Merge pull request #51716 from KevinyhZou/bug_fix_csv_field_type_not_match; Improve CSVInputFormat to check and set default value to column if deserialize failed | 2023-07-19 14:41:05 +02:00
kevinyhzou | dcf7ba2534 | remove unuseful code | 2023-07-19 19:36:19 +08:00
kevinyhzou | 95424177d5 | review fix | 2023-07-19 18:26:54 +08:00
Ilya Golshtein | c1c5ffa309 | test_for_basic_auth_registry - cpp code small improvement | 2023-07-19 08:32:45 +00:00
dheerajathrey | 8e1de7897a | indentation fix | 2023-07-19 08:32:44 +00:00
dheerajathrey | 1564eace38 | enable url-encoded basic auth to fetch avro schema in kafka | 2023-07-19 08:32:44 +00:00
Alexey Milovidov | 0789f388c3 | Update ArrowFieldIndexUtil.h | 2023-07-19 02:45:56 +03:00
Alexey Milovidov | 6d915042a2 | Fix ugly code | 2023-07-19 01:44:20 +02:00
avogar | 67f340b501 | Merge branch 'master' of github.com:ClickHouse/ClickHouse into structure-to-schema | 2023-07-18 13:52:15 +00:00
Kruglov Pavel | 64e88cde21 | Merge branch 'master' into better-progress-bar-2 | 2023-07-18 13:37:53 +02:00
Kruglov Pavel | 1e616e17ab | Merge branch 'master' into row-binary-with-defaults | 2023-07-17 19:13:57 +02:00
Kruglov Pavel | 1dd05319b5 | Merge branch 'master' into formats-with-subcolumns | 2023-07-17 19:13:42 +02:00
kevinyhzou | 355faa4251 | ci fix | 2023-07-17 20:08:32 +08:00
flynn | d6709ded53 | Merge branch 'master' into avro-fix | 2023-07-17 14:51:34 +08:00
taiyang-li | 8ea335aca7 | update style | 2023-07-17 10:43:13 +08:00
taiyang-li | 7716479a37 | add comments for https://github.com/ClickHouse/ClickHouse/pull/52112 | 2023-07-17 10:33:38 +08:00
flynn | 386adfad33 | Avro input format support Union with single type | 2023-07-15 16:21:58 +00:00
taiyang-li | 8ea3bf4ade | improve ch to arrow | 2023-07-14 16:09:22 +08:00
kevinyhzou | c6b8097090 | rebase main | 2023-07-14 11:24:38 +08:00
kevinyhzou | b2665031dc | review fix | 2023-07-13 20:27:14 +08:00
kevinyhzou | ba57c84db3 | bug fix csv input field type mismatch | 2023-07-13 20:24:10 +08:00
Dmitry Kardymon | 385a210fee | Merge remote-tracking branch 'origin/master' into ADQM-870 | 2023-07-10 13:19:21 +00:00
Alexey Milovidov | 3d4800995f | Merge pull request #49732 from nickitat/impr_prefetch; Improve reading with prefetch | 2023-07-09 06:10:58 +03:00
Kruglov Pavel | 06de25451a | Merge branch 'master' into formats-with-subcolumns | 2023-07-06 16:21:52 +02:00
avogar | 810d1ee069 | Fix tests | 2023-07-06 13:48:57 +00:00
Nikita Taranov | aec7205636 | rework pool usage | 2023-07-06 14:41:09 +02:00
Dmitry Kardymon | 86fc702236 | Add skipWhitespacesAndTabs(); Co-authored-by: Kruglov Pavel <48961922+Avogar@users.noreply.github.com> | 2023-07-06 15:14:18 +03:00
Dmitry Kardymon | 32f5a78302 | Fix setting name | 2023-07-06 07:32:46 +00:00
Dmitry Kardymon | 24b5c9c204 | Use one setting input_format_csv_allow_variable_number_of_colums and code in RowInput | 2023-07-06 06:05:43 +00:00
avogar | d11cd0dc30 | Fix tests | 2023-07-05 17:56:03 +00:00
Dmitry Kardymon | 86014a60a3 | Fixed case with spaces before delimiter | 2023-07-05 11:42:02 +00:00
avogar | 98aa6b317f | Support reading subcolumns from file/s3/hdfs/url/azureBlobStorage table functions | 2023-07-04 21:17:26 +00:00
Robert Schulze | fe49e98455 | Follow-up to re2 update 2023-06-02 (#50949) | 2023-07-03 08:28:25 +00:00
avogar | 34bf0284ad | Add RowBinaryWithDefaults format | 2023-06-30 16:18:30 +00:00
avogar | 03f820bc4a | Merge branch 'master' of github.com:ClickHouse/ClickHouse into structure-to-schema | 2023-06-22 18:46:01 +00:00
avogar | 4060beae49 | Structure to CapnProto/Protobuf schema take 1 | 2023-06-22 18:00:00 +00:00
Dmitry Kardymon | 19d0214ac1 | Merge remote-tracking branch 'origin/master' into ADQM-870 | 2023-06-22 13:02:31 +00:00
Dmitry Kardymon | a0fde6a55b | Style fix | 2023-06-22 10:50:14 +00:00
Dmitry Kardymon | 2c3a4cb90d | Style fix | 2023-06-22 10:47:07 +00:00
Sema Checherinda | 01de36f1fa | Merge pull request #50395 from CheSema/better-log; require `finalize()` call before d-tor for all writes buffers | 2023-06-21 21:12:02 +02:00
Dmitry Kardymon | fff0c8da92 | Merge remote-tracking branch 'origin/master' into ADQM-870 | 2023-06-21 10:56:50 +00:00
Kruglov Pavel | 8f8cd97fd8 | Merge pull request #51088 from Avogar/better-progress-bar; Improve progress bar for file/s3/hdfs/url table functions. Step 1 | 2023-06-21 12:42:25 +02:00
Sema Checherinda | 9b0c3359cf | Merge branch 'master' into better-log | 2023-06-20 20:37:36 +02:00
Sema Checherinda | fd292dc730 | work with comment on the PR | 2023-06-20 20:02:04 +02:00
Kruglov Pavel | 0edfbb45ad | Merge pull request #50873 from Avogar/parquet-big-integers; Fallback to parsing big integer from String instead of exception in Parquet format | 2023-06-20 16:10:46 +02:00
avogar | d492acbcd2 | Fix tests | 2023-06-19 13:36:29 +00:00
Dmitry Kardymon | f81401db99 | Add empty line test | 2023-06-19 10:48:38 +00:00
Dmitry Kardymon | dd43a186ad | Minor edit docs / add int256 test | 2023-06-19 09:51:29 +00:00
Dmitry Kardymon | 30bea857fd | Merge remote-tracking branch 'origin/master' into ADQM-870 | 2023-06-19 07:19:07 +00:00
avogar | 3209ebe34b | Improve progress bar for file/s3/hdfs/url table functions. Step 1 | 2023-06-16 15:51:18 +00:00
Sema Checherinda | 1cb02e2710 | do call finalize for all buffers | 2023-06-16 16:38:18 +02:00
Dmitry Kardymon | 0eeee11dc4 | Style fix, add comment | 2023-06-15 12:36:18 +00:00
Dmitry Kardymon | 806176d88e | Add input_format_csv_missing_as_default setting and tests | 2023-06-15 11:23:08 +00:00
KevinyhZou | 953f40aa3b | Merge branch 'master' into bug_fix_csv_parse_by_tab_delimiter | 2023-06-15 10:25:19 +08:00
Dmitry Kardymon | a91fc3ddb3 | Add docs/ add more cases in test | 2023-06-14 16:44:31 +00:00
Dmitry Kardymon | ed318d1035 | Add input_format_csv_ignore_extra_columns setting (prototype) | 2023-06-14 10:35:36 +00:00
kevinyhzou | f3b99156ac | review fix | 2023-06-14 10:48:21 +08:00
Kruglov Pavel | 607f337d67 | Merge pull request #50592 from Avogar/max-bytes-to-read-in-schema-inference; Add setting to limit the number of bytes to read in schema inference | 2023-06-13 16:47:57 +02:00
Kruglov Pavel | 8fdcd91c38 | Merge pull request #49752 from Avogar/better-capnproto-3; Refactor CapnProto format to improve input/output performance | 2023-06-13 16:20:38 +02:00
Kruglov Pavel | edd47a2281 | Merge branch 'master' into skip-trailing-empty-lines | 2023-06-12 13:57:15 +02:00
Kruglov Pavel | e03cd725b0 | Merge pull request #50602 from Avogar/null-as-default-schema-inference; Respect setting input_format_as_default in schema inference | 2023-06-12 13:45:52 +02:00
Kruglov Pavel | da68980b8d | Merge branch 'master' into max-bytes-to-read-in-schema-inference | 2023-06-12 13:45:31 +02:00
avogar | 5cec4c3161 | Fallback to parsing big integer from String instead of exception in Parquet format | 2023-06-12 11:34:40 +00:00
kevinyhzou | 911f8ad8dc | use whitespace or tab as field delimiter | 2023-06-12 11:57:52 +08:00
Hongbin Ma | 41c34aaf5e | optimize parquet write performance for parallel threads; fix CI; fix review comments and CI | 2023-06-09 19:09:58 -07:00
kevinyhzou | 48e1b21aab | Add feature to support read csv by space & tab delimiter | 2023-06-08 20:34:30 +08:00
avogar | cc036528fe | Merge branch 'master' of github.com:ClickHouse/ClickHouse into better-capnproto-3 | 2023-06-08 11:16:13 +00:00
Kruglov Pavel | a714c1662e | Merge branch 'master' into max-bytes-to-read-in-schema-inference | 2023-06-08 12:55:31 +02:00
Kruglov Pavel | 4727c85e1f | Merge branch 'master' into null-as-default-schema-inference | 2023-06-08 12:54:18 +02:00
Kruglov Pavel | dc24599525 | Merge branch 'master' into skip-trailing-empty-lines | 2023-06-08 12:39:23 +02:00
avogar | cf947e6e01 | Fix typo | 2023-06-07 12:50:16 +00:00
avogar | 87ac6b8b63 | Fix reading negative decimals in avro format | 2023-06-07 12:49:28 +00:00
Kruglov Pavel | 5af1819143 | Merge pull request #50586 from Avogar/better-avro-decimal; Better support for avro decimals | 2023-06-06 19:40:59 +02:00
Kruglov Pavel | 1baa6404e6 | Merge branch 'master' into skip-trailing-empty-lines | 2023-06-06 19:39:34 +02:00
avogar | df50833b70 | Allow to skip trailing empty lines in CSV/TSV/CustomeSeparated formats | 2023-06-06 17:33:05 +00:00
Kruglov Pavel | af880a6f3b | Merge branch 'master' into max-bytes-to-read-in-schema-inference | 2023-06-06 14:47:58 +02:00
Robert Schulze | 2e16b497f5 | Merge pull request #50519 from ClibMouse/feature/uuid-serialization; Implement endianness-independent serialization for UUID | 2023-06-06 09:18:19 +02:00
avogar | 67af505ed6 | Respect setting input_format_as_default in schema inference | 2023-06-05 17:04:55 +00:00
avogar | 33e51d4f3b | Add setting to limit the number of bytes to read in schema inference | 2023-06-05 15:22:04 +00:00
ltrk2 | 3938309374 | Implement review comments | 2023-06-05 08:18:03 -07:00
avogar | aa20935cb9 | Better | 2023-06-05 12:45:14 +00:00
avogar | 55345d5a25 | Fix exception message | 2023-06-05 12:43:38 +00:00
avogar | 4f0adf5f61 | Better support for avro decimals | 2023-06-05 12:40:54 +00:00
Alexey Gerasimchuk | 9958731c27 | Merge branch 'master' into ADQM-830 | 2023-06-05 07:46:47 +10:00
ltrk2 | 50654435dc | Implement endianness-independent serialization for UUID | 2023-06-02 19:36:37 +00:00
Michael Kolupaev | b51064a508 | Get rid of SeekableReadBufferFactory, add SeekableReadBuffer::readBigAt() instead | 2023-06-01 18:48:30 -07:00
Kruglov Pavel | 898d1f34db | Merge branch 'master' into better-capnproto-3 | 2023-05-31 21:44:00 +02:00
Alexey Gerasimchuk | 44ba35d2c1 | Merge branch 'master' into ADQM-830 | 2023-05-31 15:07:37 +10:00
Michael Kolupaev | 536c4a99c8 | Fix clickhouse-local crashing when writing empty Arrow or Parquet output | 2023-05-30 10:45:49 -07:00
Alexey Milovidov | 1875a93328 | Merge pull request #50224 from Avogar/fix-custom-separated-ignore-spaces; Fix skipping spaces at end of row in CustomSeparatedIgnoreSpaces format | 2023-05-29 02:42:38 +03:00
Alexey Gerasimchuck | 01f3a46cf0 | fixed wrong case in removeNullable | 2023-05-25 22:49:36 +00:00
Alexey Gerasimchuk | 613568423d | Update src/Processors/Formats/Impl/CSVRowInputFormat.cpp; Co-authored-by: Kruglov Pavel <48961922+Avogar@users.noreply.github.com> | 2023-05-26 07:49:45 +10:00
avogar | ce99825200 | Fix skipping spaces at end of row in CustomSeparatedIgnoreSpaces format | 2023-05-25 11:19:15 +00:00
Alexey Gerasimchuck | 75791d7a63 | Added input_format_csv_trim_whitespaces parameter | 2023-05-25 07:51:32 +00:00
Kruglov Pavel | 1347dc4ede | Fix style | 2023-05-24 17:19:04 +00:00
Kruglov Pavel | cc7cfa050f | Fix style | 2023-05-24 17:19:04 +00:00
avogar | e66f6272d1 | Refactor CapnProto format to improve input/output performance | 2023-05-24 17:19:04 +00:00
avogar | bf19765c9b | Fix possible use-of-uninitialized-value | 2023-05-22 19:34:19 +00:00
Michael Kolupaev | 6fd5d8e8ba | Add setting output_format_parquet_compliant_nested_types to produce more compatible Parquet files | 2023-05-19 18:39:50 +00:00
Kruglov Pavel | 558eda4146 | Merge pull request #49412 from azat/block-use-dense-hash-map; Switch Block::NameMap to google::dense_hash_map over HashMap | 2023-05-15 12:22:55 +02:00
Alexey Milovidov | f6144ee32b | Revert "Make Pretty formats even prettier." | 2023-05-13 02:45:07 +03:00
Azat Khuzhin | 2c40dd6a4c | Switch Block::NameMap to google::dense_hash_map over HashMap; Since HashMap creates 2^8 elements by default, while dense_hash_map should be good here. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> | 2023-05-12 05:52:57 +02:00
Robert Schulze | 9db78792d0 | Fix MsgPackRowInputFormat.cpp build | 2023-05-11 10:00:32 +00:00
Azat Khuzhin | d8dd50a9c6 | Fix misc-misplaced-const clang-tidy warning; Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> | 2023-05-09 21:27:21 +02:00
Azat Khuzhin | d03ae2abfa | Fix modernize-loop-convert clang-tidy warning; Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> | 2023-05-09 21:19:37 +02:00
Azat Khuzhin | 833652b7c9 | Revert "Suppress clang-analyzer-cplusplus.NewDelete in MsgPackRowInputFormat"; Let's try to revert this quirk during upgrading to clang 16. This reverts commit c1e70169d2. | 2023-05-09 20:36:39 +02:00
Alexey Milovidov | 90b0de5677 | Make Pretty prettier | 2023-05-05 06:36:53 +02:00
Alexey Milovidov | 179eddee01 | Remove garbage from Pretty format | 2023-05-05 04:44:47 +02:00
Michael Kolupaev | eb3b774ad0 | Better control over Parquet row group size | 2023-05-04 14:59:55 -07:00
Raúl Marín | f0e045bb3d | Merge remote-tracking branch 'blessed/master' into arenita | 2023-04-24 10:42:56 +02:00
Alexey Milovidov | b08f6b9dcc | Update LineAsStringRowInputFormat.cpp | 2023-04-23 08:32:58 +03:00
Alexey Milovidov | 54d10f87f2 | Consistency of the LineAsString format | 2023-04-23 05:50:46 +02:00
Alexander Gololobov | e6d34a9f8b | Merge pull request #48987 from Avogar/avoid-logical-error; Don't throw logical error when column is not found in Parquet/Arrow schema | 2023-04-21 09:46:16 +02:00
robot-ch-test-poll1 | f466c89621 | Merge pull request #48911 from Avogar/parquet-metadata-format; Add ParquetMetadata input format to read Parquet file metadata | 2023-04-21 03:46:26 +02:00
avogar | 8a3e813ecd | Don't throw logical error when column is not found in Parquet/Arrow schema | 2023-04-20 19:09:40 +00:00
Kruglov Pavel | 9bc95bed85 | Merge pull request #48898 from Avogar/pretty-json; Add PrettyJSONEachRow format to output pretty JSON | 2023-04-19 12:27:24 +02:00
Kruglov Pavel | 8053b18c05 | Merge pull request #48361 from Avogar/fix-arrow-dict-2; Fix serializing LowCardinality as Arrow dictionary | 2023-04-19 12:23:27 +02:00
Kruglov Pavel | 21dddf8c4c | Merge pull request #48864 from Avogar/fix-parquet-date32; Fix reading Date32 Parquet/Arrow column into not Date32 column | 2023-04-19 09:16:42 +02:00
avogar | 0878ab8443 | Fix build | 2023-04-18 19:51:53 +00:00
Kruglov Pavel | a5c52d3bc3 | Merge branch 'master' into parquet-metadata-format | 2023-04-18 21:51:14 +02:00
avogar | 7a67951f64 | Add more fields, fix style | 2023-04-18 17:59:01 +00:00
avogar | b0e5f7069e | Update exception message | 2023-04-18 17:15:16 +00:00
avogar | c5efa4dc01 | Add comment | 2023-04-18 17:10:37 +00:00
avogar | f7f609dfb9 | Better | 2023-04-18 16:57:55 +00:00
avogar | b277a5c943 | Add ParquetMetadata input format to read Parquet file metadata | 2023-04-18 16:46:26 +00:00
Kruglov Pavel | 8710c15c85 | Apply suggestion | 2023-04-18 18:25:54 +02:00
avogar | e356f92b77 | Add PrettyJSONEachRow format to output pretty JSON | 2023-04-18 13:28:59 +00:00
Kruglov Pavel | 3bbc347901 | Fix build | 2023-04-17 22:22:26 +02:00
Kruglov Pavel | be0b0e7921 | Fix build | 2023-04-17 20:58:19 +02:00
avogar | 527572e7bd | Fix reading Date32 Parquet/Arrow column into not Date32 column | 2023-04-17 16:51:22 +00:00
Kruglov Pavel | 5c9b404c6e | Update src/Processors/Formats/Impl/CHColumnToArrowColumn.cpp; Co-authored-by: Yakov Olkhovskiy <99031427+yakov-olkhovskiy@users.noreply.github.com> | 2023-04-17 14:02:07 +02:00
Raúl Marín | 39f8c43a60 | Merge remote-tracking branch 'blessed/master' into arenita | 2023-04-17 10:33:38 +02:00
Michael Kolupaev | 87be78e6de | Better | 2023-04-17 04:58:32 +00:00
Michael Kolupaev | e133633359 | Parallel decoding with one row group per thread | 2023-04-17 04:58:32 +00:00
Michael Kolupaev | 2d4fe85513 | Something | 2023-04-17 04:58:32 +00:00
Michael Kolupaev | dc6e34075e | Read less unnecessary data from Parquet files | 2023-04-17 04:58:32 +00:00
Dmitry Novik | 5cc9b46f78 | Merge remote-tracking branch 'origin/master' into optimize-compilation | 2023-04-13 16:04:09 +02:00
Raúl Marín | da9a539cf7 | Reduce the usage of Arena.h | 2023-04-13 10:31:32 +02:00