Commit Graph

63 Commits

Author SHA1 Message Date
Kruglov Pavel
f539fb835d Merge branch 'master' into formats-with-names 2022-05-23 12:14:20 +02:00
avogar
68bb07d166 Better naming 2022-05-13 18:39:19 +00:00
avogar
b17fec659a Improve performance and memory usage for select of subset of columns for some formats 2022-05-13 13:51:28 +00:00
avogar
62a7ba3f26 Add columnar JSON formats 2022-05-06 16:48:48 +00:00
avogar
42726639f3 Check ORC/Parquet/Arrow format magic bytes before loading file in memory 2022-04-13 19:27:38 +00:00
avogar
d2017a63b1 Merge branch 'master' of github.com:ClickHouse/ClickHouse into improve-schema-inference 2022-04-07 11:36:40 +00:00
taiyang-li
acb9f1632e suppoort skip splits in orc and parquet 2022-04-06 16:40:22 +08:00
Kruglov Pavel
d45143ffe0 Merge branch 'master' into improve-schema-inference 2022-03-25 12:05:40 +01:00
avogar
557edbd172 Add some improvements and fixes in schema inference 2022-03-24 12:54:12 +00:00
Antonio Andelic
052057f2ef Address PR comments 2022-03-23 15:42:46 +00:00
Antonio Andelic
d73c906e68 Format code 2022-03-21 07:50:17 +00:00
Antonio Andelic
f75b054255 Allow case insensitive column matching 2022-03-21 07:47:37 +00:00
Antonio Andelic
607f785e48 Revert "Merge pull request #35145 from bigo-sg/lower-column-name" (This reverts commit ebf72bf61d, reversing changes made to f1b812bdc1.) 2022-03-17 12:31:43 +00:00
shuchaome
7a3623d216 fix bug 2022-03-11 17:26:13 +08:00
shuchaome
46cb4483a6 Optimise by lowering schema on the beginning. Add a functional test. 2022-03-11 14:34:46 +08:00
shuchaome
b7cd85df6b remove unused column_names in ORCBlockInputFormat 2022-03-09 18:16:22 +08:00
shuchaome
bb50133424 Apply suggestions from code review (Co-authored-by: Antonio Andelic <antonio2368@users.noreply.github.com>) 2022-03-09 17:32:27 +08:00
shuchaome
56795b831d add setting to lower column case when reading parquet/orc file 2022-03-09 16:07:02 +08:00
Maksim Kita
b1a956c5f1 clang-tidy check performance-move-const-arg fix 2022-03-02 18:15:27 +00:00
taiyang-li
1e102bc1b2 merge master 2022-01-01 09:01:06 +08:00
avogar
26abf7aa62 Remove code duplication, use simdjson and rapidjson instead of Poco 2021-12-29 12:21:01 +03:00
avogar
8112a71233 Implement schema inference for most input formats 2021-12-29 12:18:56 +03:00
kssenii
1f6ca619b7 Allow some killing 2021-12-27 22:42:56 +03:00
taiyang-li
9036b18c2f merge master 2021-12-27 15:12:48 +08:00
taiyang-li
2597925724 merge master 2021-12-21 15:55:39 +08:00
Raúl Marín
b553e51969 Merge remote-tracking branch 'blessed/master' into libcxx13_take2 2021-12-20 17:47:57 +01:00
kreuzerkrieg
f06c37d206 Stop reading incomplete stripes and skip rows. 2021-12-19 18:41:32 +02:00
Raúl Marín
61d959df8f Fix arrow build 2021-12-13 16:49:22 +01:00
Raúl Marín
d9e4544239 Adapt to arrow 6 2021-12-13 16:49:21 +01:00
taiyang-li
e2d1ed1568 fix error 2021-12-02 20:51:19 +08:00
taiyang-li
9ec8272186 refactor hive text input format 2021-12-02 16:14:25 +08:00
taiyang-li
440fa9b69c implement getMissingValues for ORC/Parquet/Arrow 2021-11-30 15:44:59 +08:00
taiyang-li
cacf516e3e calculate column value by default expression & apply defaults_for_omitted_fields_ in ArrowColumnToCHColumn 2021-11-30 14:52:26 +08:00
kssenii
f4ffedd5f3 Better 2021-11-15 10:23:35 +03:00
kssenii
45ea820297 Reduce memory usage for some formats 2021-11-03 14:30:03 +03:00
Nikolai Kochetov
ec18340351 Remove streams from formats. 2021-10-11 19:11:50 +03:00
Anton Popov
e8ac8e3454 execute asynchronous inserts separatly for each client 2021-08-27 06:00:12 +03:00
Pavel Kruglov
e4c5d7e3b1 Support inserting nested as Array of structs, add some refactoring 2021-08-05 14:10:27 +03:00
Pavel Kruglov
931e05ab04 Minor refactoring 2021-06-15 16:15:27 +03:00
Pavel Kruglov
a4ef60e230 Remove Impl including from .h file 2021-06-15 16:15:27 +03:00
Pavel Kruglov
b120841b57 Small changes 2021-06-15 16:15:27 +03:00
Pavel Kruglov
a4decd0848 Support Map type, fix and add tests 2021-06-15 16:15:27 +03:00
Pavel Kruglov
c8b37977da Fix bugs, support dictionary for Arrow format 2021-06-15 16:15:27 +03:00
Pavel Kruglov
540c494492 Fix selecting indexes in ORC and Parquet formats 2021-06-15 16:15:27 +03:00
Pavel Kruglov
235e3e2f5b Support structs in Arrow/Parquet/ORC 2021-06-15 16:15:27 +03:00
Pavel Kruglov
484cac6193 Return include_indices, fix arrays and add more tests 2021-05-14 13:50:10 +03:00
Pavel Kruglov
46a7cc5f1d Remove include_indices 2021-05-14 00:52:29 +03:00
Pavel Kruglov
25ceb1df65 Final fixes 2021-05-13 20:00:13 +03:00
Chao Ma
6238028225 Fix integration test test_storage_kafka failed error 2021-04-19 01:38:29 +00:00
Chao Ma
8d7a53efda Fix test 00163_column_oriented_formats failed error 2021-04-19 01:38:29 +00:00