Commit Graph

1939 Commits

Author SHA1 Message Date
Kruglov Pavel
aa7c1f63ab
Merge pull request #56172 from Avogar/fix-schema-cache-for-json
Fix schema cache for fallback JSON->JSONEachRow with changed settings
2023-11-01 20:24:34 +01:00
Kruglov Pavel
e6f00d5e1d
Merge pull request #56117 from bigo-sg/fixed_struct_field_prune
Improve parquet struct fields reading
2023-11-01 15:32:50 +01:00
Kruglov Pavel
bf77ce691c
Merge pull request #55982 from yariks5s/npy_input_format
New input format Npy
2023-11-01 14:26:22 +01:00
taiyang-li
dc897215da fix failed uts tests/queries/0_stateless/02312_parquet_orc_arrow_names_tuples.sql 2023-11-01 20:42:07 +08:00
taiyang-li
24c45a4ee0 fix failed uts 2023-11-01 18:47:11 +08:00
taiyang-li
001cbe7912 fix typos 2023-11-01 16:58:25 +08:00
taiyang-li
b276587422 fix failed uts 2023-11-01 15:43:20 +08:00
lgbo-ustc
8334585eaf improve parquet struct field reading 2023-11-01 15:18:39 +08:00
yariks5s
6c4bf59021 fix suggestions and enhance tests 2023-10-31 18:10:55 +00:00
avogar
518e52473d Fix schema cache for fallback JSON->JSONEachRow with changed settings 2023-10-31 14:12:38 +00:00
Kruglov Pavel
4faa3d0294
Revert "Revert "Fix output/input of Arrow dictionary column"" 2023-10-31 12:30:45 +01:00
Alexey Milovidov
467b4d85e2
Revert "Fix output/input of Arrow dictionary column" 2023-10-31 09:28:09 +03:00
taiyang-li
c97b2c5be7 fix code style 2023-10-31 12:00:45 +08:00
taiyang-li
e5db57204d fix bugs 2023-10-31 11:57:47 +08:00
taiyang-li
b72341e1a8 Merge branch 'master' into orc_tuple_field_prune 2023-10-31 10:07:43 +08:00
Kruglov Pavel
4c2a132d96
Merge pull request #55989 from Avogar/lc-as-arrow-dict-fix
Fix output/input of Arrow dictionary column
2023-10-30 20:47:49 +01:00
Kruglov Pavel
4effc676f9
Merge pull request #56046 from Avogar/cr-in-unquoted-csv-string
Allow unquoted strings with CR in CSV format
2023-10-30 20:46:20 +01:00
yariks5s
9a2d89e3e4 removed getSize() and enhanced docs 2023-10-30 12:42:19 +00:00
taiyang-li
ad67b6c2ea allow tuple field pruning 2023-10-30 19:33:06 +08:00
Kruglov Pavel
c10a3b3838
Merge branch 'master' into lc-as-arrow-dict-fix 2023-10-30 11:20:57 +01:00
avogar
57bc4854c2 Fix 2023-10-30 10:17:49 +00:00
avogar
d1fcbc6e47 Fix fetching schema from schema registry in AvroConfluent 2023-10-30 10:17:48 +00:00
yariks5s
e14a7f066a fix typos 2023-10-28 01:46:59 +00:00
yariks5s
894724bfb3 suggested changes 2023-10-28 01:17:25 +00:00
yariks5s
23635352f1 fixed due to review 2023-10-27 15:43:03 +00:00
Kruglov Pavel
bb4b95e891
Merge branch 'master' into schema-inference-union 2023-10-27 14:53:58 +02:00
Kruglov Pavel
570b66f027
Merge branch 'master' into schema-inference-union 2023-10-26 19:26:00 +02:00
avogar
9d207bf027 Allow unquoted strings with CR in CSV format 2023-10-26 13:50:54 +00:00
zvonand
0766c73aab Rename date_time_overflow_mode -> date_time_overflow_behavior, moved it to format settings 2023-10-25 23:11:13 +02:00
zvonand
5b86e8c714 updated after review 2023-10-25 23:10:58 +02:00
zvonand
2f3695add8 Introduce setting for dt overflow exception
Added tests and docs
2023-10-25 23:10:24 +02:00
Alexey Milovidov
bb5a60dc19
Merge pull request #55893 from ClickHouse/revert-partial-result-2
Revert "Revert "Revert "Add settings for real-time updates during query execution"""
2023-10-25 22:20:28 +02:00
yariks5s
2ab1ae42c1 added docs and tests, style check 2023-10-25 10:37:05 +00:00
avogar
c080ba9d7e Fix output LowCardinality as Arrow dictionary 2023-10-24 19:49:04 +00:00
李扬
465962df7f
Support orc filter push down (file + stripe + rowgroup level) (#55330)
* support orc filter push down

* update orc lib version

* replace setqueryinfo with setkeycondition

* fix issue https://github.com/ClickHouse/ClickHouse/issues/53536

* refactor source with key condition

* fix building error

* remove std::cout

* update orc

* update orc version

* fix bugs

* improve code

* upgrade orc lib

* fix code style

* change as requested

* add performance tests for orc filter push down

* add performance tests for orc filter push down

* fix all bugs

* fix default as null issue

* add uts for null as default issues

* upgrade orc lib

* fix failed orc lib uts and fix typo

* fix failed uts

* fix failed uts

* fix ast fuzzer tests

* fix bug of uint64 overflow in https://s3.amazonaws.com/clickhouse-test-reports/55330/de22fdcaea2e12c96f300e95f59beba84401712d/fuzzer_astfuzzerubsan/report.html

* fix asan fatal caused by reused column vector batch in native orc input format. refer to https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__asan__[4_4].htm

* fix wrong performance tests

* disable 02892_orc_filter_pushdown on aarch64. https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__aarch64_.html

* add some comments

* add some comments

* inline range::equals and range::less

* fix data race of key condition

* trigger ci
2023-10-24 12:08:17 -07:00
yariks5s
4e09fb3e27 made_logic 2023-10-24 14:55:52 +00:00
Kruglov Pavel
9a56ec4d63
Merge pull request #55891 from Avogar/try-fix-orc
Try to fix possible segfault in Native ORC input format
2023-10-24 13:57:43 +02:00
Alexey Milovidov
7ec4b99e94 Revert partial result 2023-10-21 03:14:22 +02:00
avogar
6934e27e8b Add union mode for schema inference to infer union schema of files with different schemas 2023-10-20 20:46:41 +00:00
avogar
8cc0dc17eb Try to fix possible segfault in Native ORC input format 2023-10-20 18:50:48 +00:00
yariks5s
87f26f5132 dealt with 2dim arrays 2023-10-20 17:05:05 +00:00
yariks5s
6dc88a4ca4 new changes 2023-10-18 18:02:05 +00:00
avogar
323486f9e8 Add tests 2023-10-17 18:10:47 +00:00
Michael Kolupaev
ce7eca0615
DWARF input format (#55450)
* Add ReadBufferFromFileBase::isRegularLocalFile()

* DWARF input format

* Review comments

* Changed things around ENABLE_EMBEDDED_COMPILER build setting

* Added 'ranges' column

* no-msan no-ubsan
2023-10-16 17:00:07 -07:00
alesapin
3b02748cb6 Fix some typos 2023-10-15 15:43:02 +02:00
Alexander Tokmakov
e3e105d154
Merge pull request #55527 from azat/values-eof-check-fix
Fix checking of non handled data for Values format
2023-10-13 18:07:02 +02:00
yariks5s
cb08da617f added read and parse impl 2023-10-13 15:16:07 +00:00
Alexey Milovidov
8a1363bcf1
Merge pull request #49486 from bigo-sg/test_hive_null_as_default
Set defaults_for_omitted_fields to true for hive text format
2023-10-13 02:01:09 +02:00
yariks5s
9ae025d7e6 mid commit 2023-10-12 17:37:59 +00:00
Azat Khuzhin
2cbb069b68 Add ability to ignore data after semicolon in Values format
This is required for client, to handle comments in multiquery mode.

v0: separate context for input format
v2: cannot use separate context since params and stuff are changed in global context
v3: do not send this setting to the server (breaks queries for readonly profiles)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-10-12 14:55:26 +02:00
Azat Khuzhin
f379d9cac5 Fix checking of non handled data for Values format
PeekableReadBuffer::hasUnreadData() does not check the underlying
buffer, and so it simply ignores some issues, like:

    INSERT INTO test_01179_str values ('foo'); ('bar')

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-10-12 12:25:08 +02:00
taiyang-li
7cd24d0af0 solve conflicts 2023-10-12 10:30:05 +08:00
Robert Schulze
97d8e16e8d
Fix MySQL packet row data type
Cf. https://github.com/ClickHouse/ClickHouse/pull/55479#discussion_r1355175581
2023-10-11 15:09:50 +00:00
slvrtrn
e06d3ca1a5 Fix MySQL text protocol DateTime
Introduce `removeLowCardinalityAndNullable` function
Fix incorrect removeLowCar/removeNullable usages
Add more MySQL text protocol tests
Deprecate old Java client tests
Use JDK 17 for test MySQL Java container
2023-10-10 19:51:09 +02:00
robot-clickhouse-ci-1
7a825c1417
Merge pull request #54427 from Avogar/json-object-as-tuple-inference
Add new features to schema inference for JSON formats
2023-09-27 20:12:45 +02:00
Robert Schulze
cde10fe7b5
Merge remote-tracking branch 'rschu1ze/master' into clang-tidy-reenable-checks 2023-09-26 18:59:41 +00:00
Kruglov Pavel
bea80ab5b7
Merge branch 'master' into json-object-as-tuple-inference 2023-09-26 15:23:08 +02:00
Kruglov Pavel
69a17bbef6
Merge pull request #52853 from Avogar/http-valid-json-on-exception
Output valid JSON/XML on exception during HTTP query execution
2023-09-26 14:25:55 +02:00
Robert Schulze
9fff447716
Re-enable clang-tidy checks 2023-09-26 09:34:12 +00:00
avogar
9e75825515 Merge branch 'master' of github.com:ClickHouse/ClickHouse into json-object-as-tuple-inference 2023-09-25 17:24:36 +00:00
avogar
42ca897f2d Better schema inference for JSON formats 2023-09-25 15:42:59 +00:00
robot-ch-test-poll4
ba6f0431a5
Merge pull request #54933 from ClibMouse/feature/big-endian-bson-each-row
Provide support for BSON on BE
2023-09-23 03:00:27 +02:00
robot-clickhouse-ci-2
d98234dc9d
Merge pull request #54803 from Avogar/ephemeral-columns-from-files
Forbid special columns for file/s3/url/... storages, fix insert into ephemeral columns from files
2023-09-22 23:24:42 +02:00
kothiga
3e57b007a8
Use LE version of unalignedStore. 2023-09-22 12:25:17 -07:00
kothiga
80d511093b
Provide support for BSON on BE 2023-09-22 09:21:21 -07:00
Robert Schulze
877e4f3aab
Merge remote-tracking branch 'rschu1ze/master' into clang-17 2023-09-21 20:21:12 +00:00
Michael Kolupaev
9af9b4a085
Enable connection pooling for s3 table function (#54812)
Enable connection pooling for s3 table function
2023-09-21 09:27:20 -07:00
Robert Schulze
5209bd2d51
Merge remote-tracking branch 'rschu1ze/master' into clang-17 2023-09-21 14:45:55 +00:00
Robert Schulze
f5137dd0b4
More clang-tidy fixes 2023-09-21 14:40:57 +00:00
avogar
3e08800cb5 Forbid special columns for file/s3/url/... storages, fix insert into ephemeral columns from files 2023-09-20 16:25:55 +00:00
Kruglov Pavel
49ee14f701
Merge pull request #54809 from ClickHouse/pqmeta
Prevent ParquetMetadata reading 40 MB from each file unnecessarily
2023-09-20 12:53:22 +02:00
Michael Kolupaev
c856ec4087 Prevent ParquetMetadata reading 40 MB from each file unnecessarily 2023-09-19 21:58:50 +00:00
avogar
f974970c3c Apply suggestion 2023-09-19 11:53:40 +00:00
avogar
5bd2e9f610 Fix tests 2023-09-19 11:53:40 +00:00
avogar
8c29408f5e Parse data in JSON format as JSONEachRow if failed to parse metadata 2023-09-19 11:53:40 +00:00
Kruglov Pavel
e163670357
Merge branch 'master' into http-valid-json-on-exception 2023-09-19 13:42:53 +02:00
Kruglov Pavel
3c83e43351
Remove debug logging 2023-09-19 13:38:43 +02:00
Robert Schulze
f5e8028bb1
Merge pull request #54642 from rschu1ze/broken-re2st
Remove broken lockless variant of re2
2023-09-17 15:30:57 +02:00
avogar
35d975bfea Add comment in ParallelInputFormat, remove unneeded include 2023-09-15 13:07:04 +00:00
Kruglov Pavel
dbd24b240c
Merge branch 'master' into http-valid-json-on-exception 2023-09-15 14:55:31 +02:00
Robert Schulze
7b378dbad3
Remove broken lockless variant of re2 2023-09-14 16:40:42 +00:00
Robert Schulze
a9ae813db0
Merge pull request #54115 from slvrtrn/simplified-prepared-statements-for-mysql
Implement the MySQL binary protocol implementation for initial support of Tableau Online
2023-09-14 12:27:11 +02:00
slvrtrn
c0961d9378 Merge remote-tracking branch 'origin' into simplified-prepared-statements-for-mysql 2023-09-13 19:33:11 +02:00
Arthur Passos
da8caeffd2 Merge branch 'master' into arrow_parquet_account_for_monotonically_increasing_offsets_across_batches 2023-09-12 17:50:36 -03:00
slvrtrn
dddea9219a Address the review comments 2023-09-12 18:39:03 +02:00
slvrtrn
611a75a87f Merge remote-tracking branch 'origin' into simplified-prepared-statements-for-mysql 2023-09-12 10:38:44 +02:00
avogar
803d8dcf85 Support NULL as default for nested types Array/Tuple/Map for input formats 2023-09-11 18:18:33 +00:00
Nikolai Kochetov
9b936c44db
Revert "Revert "Add settings for real-time updates during query execution"" 2023-09-09 12:29:39 +02:00
Alexey Milovidov
03a755732a
Revert "Add settings for real-time updates during query execution" 2023-09-09 03:10:23 +03:00
Nikolai Kochetov
0095124791
Merge pull request #48607 from alexX512/master
Add settings for real-time updates during query execution
2023-09-08 09:05:33 +02:00
Arthur Passos
f34d40cde3 docs 2023-09-06 17:26:34 -03:00
Arthur Passos
a8027be612 fix 2023-09-06 17:25:27 -03:00
Arthur Passos
ce673d6bea arrow-parquet account for monotonically increasing offsets across multiple batches 2023-09-06 13:59:39 -03:00
robot-clickhouse-ci-1
02339a1f22
Merge pull request #54326 from ClickHouse/fix_segfault_system_zookeeper
Fix segfault in system.zookeeper
2023-09-06 02:20:50 +02:00
Alexander Tokmakov
a8489c21da fix segfault in system.zookeeper 2023-09-05 20:23:16 +02:00
slvrtrn
6e0a254368 Update tests, update imports, remove requirements 2023-09-04 21:50:06 +02:00
slvrtrn
bb0eff9669 Revert format changes 2023-09-04 21:15:26 +02:00
irenjj
c9261bbf18 Merge remote-tracking branch 'upstream/master' into feat_markdown 2023-09-04 22:35:53 +08:00
Alexey Milovidov
b660ac9bf1 Merge #54236 2023-09-04 03:57:39 +02:00
Alexey Milovidov
476e15ce3d
Merge pull request #54236 from YinZheng-Sun/Fix
remove semicolon
2023-09-04 04:54:30 +03:00