Commit Graph

53 Commits

Author SHA1 Message Date
Antonio Andelic
d73c906e68 Format code 2022-03-21 07:50:17 +00:00
Antonio Andelic
f75b054255 Allow case insensitive column matching 2022-03-21 07:47:37 +00:00
Antonio Andelic
607f785e48 Revert "Merge pull request #35145 from bigo-sg/lower-column-name"
This reverts commit ebf72bf61d, reversing
changes made to f1b812bdc1.
2022-03-17 12:31:43 +00:00
shuchaome
7a3623d216 fix bug 2022-03-11 17:26:13 +08:00
shuchaome
46cb4483a6 Optimise by lowering schema on the beginning. Add a functional test. 2022-03-11 14:34:46 +08:00
shuchaome
b7cd85df6b remove unused column_names in ORCBlockInputFormat 2022-03-09 18:16:22 +08:00
shuchaome
bb50133424 Apply suggestions from code review
Co-authored-by: Antonio Andelic <antonio2368@users.noreply.github.com>
2022-03-09 17:32:27 +08:00
shuchaome
56795b831d add setting to lower column case when reading parquet/orc file 2022-03-09 16:07:02 +08:00
Maksim Kita
b1a956c5f1 clang-tidy check performance-move-const-arg fix 2022-03-02 18:15:27 +00:00
taiyang-li
1e102bc1b2 merge master 2022-01-01 09:01:06 +08:00
avogar
26abf7aa62 Remove code duplication, use simdjson and rapidjson instead of Poco 2021-12-29 12:21:01 +03:00
avogar
8112a71233 Implement schema inference for most input formats 2021-12-29 12:18:56 +03:00
kssenii
1f6ca619b7 Allow some killing 2021-12-27 22:42:56 +03:00
taiyang-li
9036b18c2f merge master 2021-12-27 15:12:48 +08:00
taiyang-li
2597925724 merge master 2021-12-21 15:55:39 +08:00
Raúl Marín
b553e51969 Merge remote-tracking branch 'blessed/master' into libcxx13_take2 2021-12-20 17:47:57 +01:00
kreuzerkrieg
f06c37d206 Stop reading incomplete stripes and skip rows. 2021-12-19 18:41:32 +02:00
Raúl Marín
61d959df8f Fix arrow build 2021-12-13 16:49:22 +01:00
Raúl Marín
d9e4544239 Adapt to arrow 6 2021-12-13 16:49:21 +01:00
taiyang-li
e2d1ed1568 fix error 2021-12-02 20:51:19 +08:00
taiyang-li
9ec8272186 refactor hive text input format 2021-12-02 16:14:25 +08:00
taiyang-li
440fa9b69c implement getMissingValues for ORC/Parquet/Arrow 2021-11-30 15:44:59 +08:00
taiyang-li
cacf516e3e calculate column value by default expression & apply defaults_for_omitted_fields_ in ArrowColumnToCHColumn 2021-11-30 14:52:26 +08:00
kssenii
f4ffedd5f3 Better 2021-11-15 10:23:35 +03:00
kssenii
45ea820297 Reduce memory usage for some formats 2021-11-03 14:30:03 +03:00
Nikolai Kochetov
ec18340351 Remove streams from formats. 2021-10-11 19:11:50 +03:00
Anton Popov
e8ac8e3454 execute asynchronous inserts separatly for each client 2021-08-27 06:00:12 +03:00
Pavel Kruglov
e4c5d7e3b1 Support inserting nested as Array of structs, add some refactoring 2021-08-05 14:10:27 +03:00
Pavel Kruglov
931e05ab04 Minor refactoring 2021-06-15 16:15:27 +03:00
Pavel Kruglov
a4ef60e230 Remove Impl including from .h file 2021-06-15 16:15:27 +03:00
Pavel Kruglov
b120841b57 Small changes 2021-06-15 16:15:27 +03:00
Pavel Kruglov
a4decd0848 Support Map type, fix and add tests 2021-06-15 16:15:27 +03:00
Pavel Kruglov
c8b37977da Fix bugs, support dictionary for Arrow format 2021-06-15 16:15:27 +03:00
Pavel Kruglov
540c494492 Fix selecting indexes in ORC and Parquet formats 2021-06-15 16:15:27 +03:00
Pavel Kruglov
235e3e2f5b Support structs in Arrow/Parquet/ORC 2021-06-15 16:15:27 +03:00
Pavel Kruglov
484cac6193 Return include_indices, fix arrays and add more tests 2021-05-14 13:50:10 +03:00
Pavel Kruglov
46a7cc5f1d Remove include_indices 2021-05-14 00:52:29 +03:00
Pavel Kruglov
25ceb1df65 Final fixes 2021-05-13 20:00:13 +03:00
Chao Ma
6238028225 Fix integration test test_storage_kafka failed error 2021-04-19 01:38:29 +00:00
Chao Ma
8d7a53efda Fix test 00163_column_oriented_formats failed error 2021-04-19 01:38:29 +00:00
Chao Ma
2776772ac6 Read ORC file by stripe to reduce memory cost 2021-04-19 01:38:29 +00:00
Nikita Mikhailov
37f48d13b4 add test 2021-04-06 22:23:16 +03:00
Nikolai Kochetov
76495124cd Fix readign from fd for ORCBlockInputFormat. 2021-01-15 18:45:29 +03:00
nikitamikhaylov
c60c161168 add ParsingException 2020-12-23 01:02:01 +03:00
FawnD2
17450811d4 Move getHeader at the beginning of generate() 2020-05-04 16:19:25 +03:00
FawnD2
02e12215e7 Apply reducing memory usage optimization for seekable files to ORC format 2020-05-04 03:52:28 +03:00
FawnD2
a554177724 Simplify ORC format 2020-05-04 02:23:20 +03:00
FawnD2
112758b99d Merge branch 'master' into arrow-io-format 2020-05-04 00:53:17 +03:00
FawnD2
7cc7a87f9f Simplify interfaces 2020-05-03 21:12:14 +03:00
Alexey Milovidov
e6ab4d655b Fix bad code 2020-05-02 22:54:29 +03:00