Commit Graph

837 Commits

Author SHA1 Message Date
lgbo-ustc
5c71d3687a fixed some bugs
1. integration test for test_hive_query failed
2. nullptr reference in arrowSchemaToCHHeader
2022-01-12 17:01:05 +08:00
taiyang-li
66813a3aa9 merge master 2022-01-12 16:56:29 +08:00
avogar
9915ce7ded Fix segfault in arrowSchemaToCHHeader 2022-01-11 20:30:35 +03:00
avogar
0ae0aa712b Don't print exception twice in client in case of exception in parallel parsing 2022-01-11 18:37:07 +03:00
李扬
2df2442ad0 Merge branch 'master' into hive_table 2022-01-04 01:26:16 -06:00
taiyang-li
8730dda895 fix hive text 2022-01-01 09:16:30 +08:00
taiyang-li
1e102bc1b2 merge master 2022-01-01 09:01:06 +08:00
alexey-milovidov
34b934a1e0
Merge pull request #33331 from ClickHouse/serxa/line-as-string-output-format
Add LineAsString output format
2021-12-31 14:38:36 +03:00
Sergei Trifonov
f1d398ae4b Add LineAsString output format 2021-12-30 20:38:54 +03:00
avogar
97788b9c21 Allow to create new files on insert for File/S3/HDFS engines 2021-12-29 21:19:13 +03:00
avogar
364b4f5d36 Fix special build 2021-12-29 12:21:01 +03:00
Kruglov Pavel
cb0ed7fcb7 Fix typo 2021-12-29 12:21:01 +03:00
avogar
26abf7aa62 Remove code duplication, use simdjson and rapidjson instead of Poco 2021-12-29 12:21:01 +03:00
avogar
74f09d6476 Fix tests 2021-12-29 12:18:56 +03:00
avogar
aaf9f85c67 Add more tests and fixes 2021-12-29 12:18:56 +03:00
avogar
dd994aa761 Add some tests and some fixes 2021-12-29 12:18:56 +03:00
avogar
8112a71233 Implement schema inference for most input formats 2021-12-29 12:18:56 +03:00
kssenii
1f6ca619b7 Allow some killing 2021-12-27 22:42:56 +03:00
taiyang-li
9036b18c2f merge master 2021-12-27 15:12:48 +08:00
alexey-milovidov
c583ea7e6b
Merge pull request #32484 from Algunenano/libcxx13_take2
libc++ 13 compatibility
2021-12-25 10:14:12 +03:00
Andrii Buriachevskyi
e8cc6df7bb Add support of DEFAULT keyword for INSERT 2021-12-24 13:10:19 +01:00
Alexey Milovidov
29d28c531f Move code around to avoid dlsym on Musl 2021-12-24 12:25:27 +03:00
Raúl Marín
77db850c0b Merge remote-tracking branch 'blessed/master' into libcxx13_take2 2021-12-23 12:42:39 +01:00
Raúl Marín
88b8fd8b60 Merge remote-tracking branch 'blessed/master' into libcxx13_take2 2021-12-23 09:16:19 +01:00
Alexey Milovidov
f37ff32c37 Whitespaces 2021-12-23 01:33:47 +03:00
mreddy017
3e50217501 Remove the additional white space as per the pipeline build error. 2021-12-23 01:30:56 +03:00
mreddy017
10eb2dbdb7 Addressing review comments 2021-12-23 01:30:56 +03:00
Harry-Lee
846c46ac4b Fix issue #80: union index out of boundary 2021-12-23 01:30:56 +03:00
taiyang-li
2597925724 merge master 2021-12-21 15:55:39 +08:00
Maksim Kita
dd0d3de050
Merge pull request #32970 from kitaisreal/loops-remove-postfix-increment
Loops remove postfix increment
2021-12-20 19:51:07 +03:00
Raúl Marín
b553e51969 Merge remote-tracking branch 'blessed/master' into libcxx13_take2 2021-12-20 17:47:57 +01:00
Maksim Kita
51477adf1b Updated additional cases 2021-12-20 15:55:07 +03:00
kreuzerkrieg
f06c37d206 Stop reading incomplete stripes and skip rows. 2021-12-19 18:41:32 +02:00
alesapin
6bd7e425c6
Merge pull request #22535 from CurtizJ/sparse-serialization
Sparse serialization and ColumnSparse
2021-12-17 15:26:17 +03:00
taiyang-li
d033fc4c24 merge master and fix conflict 2021-12-17 15:11:21 +08:00
Dmitrii Mokhnatkin
2147658432 Proper handler for apache arrow column duplication 2021-12-15 18:30:32 +03:00
Raúl Marín
3de002c7c9 Merge remote-tracking branch 'blessed/master' into libcxx13_take2 2021-12-15 12:57:57 +01:00
alesapin
d7663b2179 Merge branch 'master' into fix_special_build_check 2021-12-14 19:08:28 +03:00
alesapin
884801e1bd Fixing 2021-12-14 19:08:08 +03:00
Anton Popov
16312e7e4a Merge remote-tracking branch 'upstream/master' into HEAD 2021-12-14 18:58:17 +03:00
Raúl Marín
44f3b1c9d2 Merge remote-tracking branch 'blessed/master' into libcxx13_take2 2021-12-14 13:05:01 +01:00
Anton Popov
bda0cc2f76
Merge pull request #32530 from Avogar/fix-async-inserts
Fix async inserts for some input formats
2021-12-14 14:07:05 +03:00
taiyang-li
ca3f7425a4 fix code 2021-12-14 17:37:31 +08:00
taiyang-li
8234d1176f merge master 2021-12-14 10:39:21 +08:00
Raúl Marín
61d959df8f Fix arrow build 2021-12-13 16:49:22 +01:00
Raúl Marín
d9e4544239 Adapt to arrow 6 2021-12-13 16:49:21 +01:00
Alexey Milovidov
71926a3a97 Fix surprisingly bad code in function "file" 2021-12-13 07:57:54 +03:00
李扬
8675086104 Merge branch 'master' into hive_table 2021-12-12 09:01:46 -06:00
taiyang-li
5ef68fc479 fix building 2021-12-11 15:50:59 +08:00
Kruglov Pavel
764e205d36 Fix resetParser in MsgPack format 2021-12-10 21:37:08 +03:00
avogar
1be84d80d4 Fix async inserts for some input formats 2021-12-10 20:54:08 +03:00
Anton Popov
d8367334a3 Merge remote-tracking branch 'upstream/master' into HEAD 2021-12-08 18:26:19 +03:00
Kruglov Pavel
cc71c537bc
Merge pull request #32204 from Avogar/skip-quoted-values
Improve skipping unknown fields with Quoted escaping rule in Template/CustomSeparated formats
2021-12-06 12:28:14 +03:00
Dmitriy Dorofeev
31648d95e2 use application/x-ndjson for streaming JSON (#32223) 2021-12-06 10:49:14 +03:00
taiyang-li
c678c8101e fix some bugs 2021-12-04 16:41:35 +08:00
avogar
7549619b25 Improve skipping unknown fields with Quoted escaping rule in Template/CustomSeparated formats 2021-12-03 16:25:35 +03:00
Maksim Kita
6ec559f103 Update JSONEachRowRowOutputFormat.h 2021-12-03 12:48:28 +03:00
taiyang-li
01cac01527 modify permission of RowInputFormatWithNamesAndTypes methods 2021-12-02 20:54:49 +08:00
taiyang-li
e2d1ed1568 fix error 2021-12-02 20:51:19 +08:00
taiyang-li
2f4e7e1d4e merge master 2021-12-02 19:48:21 +08:00
taiyang-li
9ec8272186 refactor hive text input format 2021-12-02 16:14:25 +08:00
mergify[bot]
e568b16e02 Merge branch 'master' into content-type 2021-12-02 07:40:17 +00:00
frank chen
c49a7251ed returns content-type as json if possible
Signed-off-by: frank chen <frank.chen021@outlook.com>
2021-12-02 13:25:17 +08:00
Anton Popov
54f51444c0 Merge remote-tracking branch 'upstream/master' into HEAD 2021-12-01 15:49:02 +03:00
tavplubix
b623a387af
Merge pull request #31887 from ClickHouse/fix_cannot_create_empty_part
Parse partition key value from `partition_id` when need to create part in empty partition
2021-12-01 15:38:46 +03:00
taiyang-li
4aeadf3967 fix build error 2021-12-01 14:13:48 +08:00
Nikita Mikhaylov
6c366feed7 Fix race in ParallelFormattingOutputFormat constructor (#32004) 2021-12-01 02:10:33 +03:00
taiyang-li
d213500a3e remove blank at end of line 2021-11-30 18:23:24 +08:00
taiyang-li
c6abe60bcc add new input format HiveTextRowInputFormat 2021-11-30 18:06:26 +08:00
taiyang-li
440fa9b69c implement getMissingValues for ORC/Parquet/Arrow 2021-11-30 15:44:59 +08:00
taiyang-li
cacf516e3e calculate column value by default expression & apply defaults_for_omitted_fields_ in ArrowColumnToCHColumn 2021-11-30 14:52:26 +08:00
taiyang-li
ad6ba24efd fix ArrowColumnToCHColumn 2021-11-30 10:49:57 +08:00
taiyang-li
6922f09ea3 reuse seekable read buffer with size 2021-11-29 20:19:36 +08:00
Alexander Tokmakov
2fb00172a9 try parse partition key value from partition_id 2021-11-27 15:07:08 +03:00
Kruglov Pavel
af998af710
Merge pull request #31489 from Avogar/parallel-formatting
Support parallel formatting almost for all text formats
2021-11-26 15:21:22 +03:00
taiyang-li
d35e2a1c83 Merge branch 'master' into hive_table 2021-11-26 11:44:50 +08:00
avogar
aa2da98844 Add test 2021-11-25 18:06:46 +03:00
avogar
37abab7fdb Better naming 2021-11-25 15:09:13 +03:00
Kruglov Pavel
5d1520be72
Merge pull request #31736 from Avogar/fix-json-with-progress
Fix race in JSONEachRowWithProgress output format
2021-11-25 13:58:41 +03:00
taiyang-li
72f60cceb9 Merge branch 'master' into hive_table 2021-11-25 17:33:26 +08:00
alesapin
fe7f21acf9
Merge pull request #31697 from ClickHouse/fix_31686
Fix parsing of domain data types
2021-11-25 11:31:41 +03:00
Kseniia Sumarokova
93cf66df12
Merge pull request #30936 from kssenii/seekable-read-buffers
Reduce memory usage for some formats when reading with s3/url/hdfs
2021-11-25 11:19:24 +03:00
avogar
e4ba685d15 Fix race in JSONEachRowWithProgressRowOutputFormat 2021-11-24 22:29:43 +03:00
Kruglov Pavel
758c0e1c5e Fix build 2021-11-24 18:35:18 +03:00
avogar
f5447a5e74 Fix tests, support parallel formatting for Template format 2021-11-24 16:42:07 +03:00
Alexander Tokmakov
e5972e6f71 fix 2021-11-24 15:44:04 +03:00
taiyang-li
89dcef69d5 merge master 2021-11-24 14:38:04 +08:00
avogar
4470365fb3 Fix 2021-11-23 19:56:44 +03:00
avogar
b81d8426d3 Clean up 2021-11-23 19:56:43 +03:00
avogar
a900a26691 Support parallel formatting for all text output formats 2021-11-23 19:56:43 +03:00
lgbo
996d7125c0 Merge branch 'master' into hive_table 2021-11-23 10:19:02 +08:00
mergify[bot]
a7ba3e23a0 Merge branch 'master' into fix-write-buffers 2021-11-22 11:24:27 +00:00
Kruglov Pavel
814a36ba69
Merge pull request #31434 from Avogar/custom-with-names-and-types
Add formats CustomSeparatedWithNames/WithNamesAndTypes
2021-11-22 13:24:00 +03:00
alexey-milovidov
faae69f631
Merge pull request #31534 from aiven/kmichel-fix-json-colum-name-encoding
Fix invalid JSON in column names
2021-11-21 11:34:33 +03:00
Kruglov Pavel
d9c1a0c8ec Merge branch 'master' into fix-write-buffers 2021-11-20 17:48:24 +03:00
Azat Khuzhin
6aa94ae032 Fix MySQLWire format (in case of multiple writes)
In case of multiple writes File() engine will set doNotWritePrefix(),
and this will avoid serializations initialization, move this to do this
always.

Fixes: #31004
2021-11-20 15:26:21 +03:00
kssenii
ff969b4605 Merge branch 'master' of github.com:ClickHouse/ClickHouse into seekable-read-buffers 2021-11-20 15:03:13 +03:00
Kevin Michel
edbeeaf6ec
Fix invalid JSON in column names
If the column name contains invalid UTF-8 sequences
and the output data types are all considered safe,
then the output will not be sanitized and the generated
JSON will be invalid.

A minimal reproduction case is :
`SELECT length('\x80') FORMAT JSONCompact`
where we auto-generate a non-UTF-8 column name with only
integer outputs, whereas :
`SELECT '\x80' FORMAT JSONCompact`
would be sanitized because the column type is String and
will trigger UTF-8 sanitization over the entire document.
2021-11-20 12:35:41 +01:00
Kruglov Pavel
fdd1f53d3a Update CustomSeparatedRowOutputFormat.h 2021-11-19 16:52:48 +03:00
Kruglov Pavel
3070bf1e4d Update CustomSeparatedRowOutputFormat.cpp 2021-11-19 16:52:31 +03:00