李扬
465962df7f
Support orc filter push down (file + stripe + rowgroup level) (#55330)
* support orc filter push down
* update orc lib version
* replace setqueryinfo with setkeycondition
* fix issue https://github.com/ClickHouse/ClickHouse/issues/53536
* refactor source with key condition
* fix building error
* remove std::cout
* update orc
* update orc version
* fix bugs
* improve code
* upgrade orc lib
* fix code style
* change as requested
* add performance tests for orc filter push down
* add performance tests for orc filter push down
* fix all bugs
* fix default as null issue
* add uts for null as default issues
* upgrade orc lib
* fix failed orc lib uts and fix typo
* fix failed uts
* fix failed uts
* fix ast fuzzer tests
* fix bug of uint64 overflow in https://s3.amazonaws.com/clickhouse-test-reports/55330/de22fdcaea2e12c96f300e95f59beba84401712d/fuzzer_astfuzzerubsan/report.html
* fix asan fatal caused by reused column vector batch in native orc input format. refer to https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__asan__[4_4].htm
* fix wrong performance tests
* disable 02892_orc_filter_pushdown on aarch64. https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__aarch64_.html
* add some comments
* add some comments
* inline range::equals and range::less
* fix data race of key condition
* trigger ci
2023-10-24 12:08:17 -07:00
avogar
2d8f33bfa2
Fix parsing error in WithNames formats while reading subset of columns with disabled input_format_with_names_use_header
2023-09-11 14:55:37 +00:00
avogar
894513f6cd
Fix tests
2023-08-23 18:43:08 +00:00
Kruglov Pavel
592fa77987
Merge branch 'master' into cache-count
2023-08-23 15:18:02 +02:00
robot-ch-test-poll1
c22ffa6195
Merge pull request #53529 from Avogar/filter-files-all-table-functions
Use filter by file/path before reading in url/file/hdfs table functions
2023-08-23 14:21:23 +02:00
Kruglov Pavel
c0bdd0e00b
Merge branch 'master' into cache-count
2023-08-22 14:42:22 +02:00
avogar
b4145aeddc
Cache number of rows in files for count in file/s3/url/hdfs/azure functions
2023-08-22 11:59:59 +00:00
pufit
9d454d9afc
Merge branch 'master' into pufit/fix_s3_threads
# Conflicts:
# src/Storages/StorageS3.cpp
# src/Storages/StorageS3.h
# src/Storages/StorageURL.cpp
# src/Storages/StorageURL.h
2023-08-21 21:32:15 -04:00
pufit
98a701e2c1
Limiting number of parsing threads for S3 source
2023-08-21 21:21:03 -04:00
Michael Kolupaev
2f4d433e69
Parquet filter pushdown
2023-08-21 14:15:52 -07:00
avogar
60b0b88d50
Clean up
2023-08-17 16:59:57 +00:00
avogar
4c32097df3
Use filter by file/path before reading in url/file/hdfs table functions, reduce code duplication
2023-08-17 16:54:43 +00:00
Kruglov Pavel
0d34e97dbe
Merge branch 'master' into formats-with-subcolumns
2023-07-26 13:30:35 +02:00
avogar
8d634c992b
Fix tests
2023-07-06 17:47:01 +00:00
avogar
d11cd0dc30
Fix tests
2023-07-05 17:56:03 +00:00
avogar
98aa6b317f
Support reading subcolumns from file/s3/hdfs/url/azureBlobStorage table functions
2023-07-04 21:17:26 +00:00
avogar
4eeb431003
Merge branch 'master' of github.com:ClickHouse/ClickHouse into better-progress-bar-2
2023-06-28 18:53:08 +00:00
avogar
c679dd400e
Make better
2023-06-23 13:43:40 +00:00
avogar
cf082f2f9a
Use read_bytes/total_bytes_to_read for progress bar in s3/file/url/... table functions
2023-06-22 17:24:43 +00:00
Sema Checherinda
d0bb985061
fix other classes based on SinkToStorage
2023-06-22 14:33:25 +02:00
Sema Checherinda
95349a405b
release buffers with exception context
2023-06-22 13:00:13 +02:00
avogar
3209ebe34b
Improve progress bar for file/s3/hdfs/url table functions. Step 1
2023-06-16 15:51:18 +00:00
avogar
2e1f56ae33
Address comments
2023-06-13 14:43:50 +00:00
Kruglov Pavel
bf28074d32
Merge branch 'master' into allow-skip-empty-files
2023-06-08 12:36:18 +02:00
Antonio Andelic
b11f744252
Correctly disable async insert with deduplication when it's not needed (#50663)
* Correctly disable async insert when it's not used
* Better
* Add comment
* Better
* Fix tests
---------
Co-authored-by: Nikita Mikhaylov <mikhaylovnikitka@gmail.com>
2023-06-07 20:33:08 +02:00
Michael Kolupaev
b51064a508
Get rid of SeekableReadBufferFactory, add SeekableReadBuffer::readBigAt() instead
2023-06-01 18:48:30 -07:00
avogar
d4efbbfbd3
Allow to skip empty files in file/s3/url/hdfs table functions
2023-05-30 19:32:24 +00:00
avogar
88e4c93abc
Merge branch 'master' of github.com:ClickHouse/ClickHouse into urlCluster
2023-05-22 19:19:57 +00:00
Nikolay Degterinsky
d4b89cb643
Merge pull request #49356 from Ziy1-Tan/vcol
Support for `_path` and `_file` virtual columns for table function `url`.
2023-05-22 18:10:32 +02:00
avogar
3ee8de792c
Merge branch 'master' of github.com:ClickHouse/ClickHouse into urlCluster
2023-05-11 12:46:20 +00:00
Michael Kolupaev
3bd1489f18
Propagate input_format_parquet_preserve_order to parallelizeOutputAfterReading()
2023-05-05 04:20:27 +00:00
Michael Kolupaev
eb3b774ad0
Better control over Parquet row group size
2023-05-04 14:59:55 -07:00
Ziy1-Tan
2c159061ed
Support `_path` and `_file` virtual columns for table function `url`.
2023-05-01 21:40:30 +08:00
avogar
447189a6ca
Better
2023-04-21 17:54:09 +00:00
avogar
0097230611
Better
2023-04-21 17:35:17 +00:00
avogar
944f54aadf
Finish urlCluster, refactor code, reduce code duplication
2023-04-21 17:24:37 +00:00
avogar
86686fbbc3
Fix conflicts
2023-04-21 14:11:18 +02:00
kssenii
bb0beb7449
Merge remote-tracking branch 'upstream/master' into named-collections-finish
2023-03-17 13:02:36 +01:00
Konstantin Bogdanov
1bbf5acd47
Pass headers from StorageURL to WriteBufferFromHTTP (#46996)
* Pass headers from StorageURL to WriteBufferFromHTTP
* Add a test
* Lint
* `time.sleep(1)`
* Start echo server earlier
* Add proper handling for mock server start
* Automatic style fix
---------
Co-authored-by: robot-clickhouse <robot-clickhouse@users.noreply.github.com>
2023-03-03 13:55:52 +01:00
kssenii
ad88251ee7
Fix tests
2023-02-27 17:42:04 +01:00
kssenii
68e06ecb99
Replace for table function remote, and external storage
2023-02-21 14:33:37 +01:00
kssenii
a54b011670
Finish for mysql
2023-02-20 21:37:38 +01:00
kssenii
ab0dedf0c8
Simplify code around storage s3 configuration
2023-02-06 16:23:17 +01:00
attack204
1f4139718a
fix:style
2023-01-19 16:19:39 +08:00
attack204
f549380867
fix:style
2023-01-19 16:10:59 +08:00
attack204
e312cfa794
feature:urlCluster
2023-01-19 10:19:04 +08:00
kssenii
30547d2dcd
Replace old named collections code for url
2022-12-17 00:24:05 +01:00
Azat Khuzhin
4e76629aaf
Fixes for -Wshorten-64-to-32
- lots of static_cast
- add safe_cast
- types adjustments
- config
- IStorage::read/watch
- ...
- some TODO's (to convert types in future)
P.S. That was quite a journey...
v2: fixes after rebase
v3: fix conflicts after #42308 merged
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-21 13:25:19 +02:00
Alexey Milovidov
6e564b18bf
Merge pull request #40600 from FrankChen021/check_url_arg
Validate the CompressionMethod parameter of URL table engine
2022-08-27 19:29:55 +03:00
Frank Chen
c9ea4f9f77
Change compression_method from String to CompressionMethod
2022-08-25 19:18:04 +08:00