Commit Graph

239 Commits

Author SHA1 Message Date
kssenii
5cd757dd9a Merge remote-tracking branch 'origin/master' into s3-queue-parallelize-ordered-mode 2024-01-26 16:33:22 +01:00
Maksim Kita
2a327107b6 Updated implementation 2024-01-25 14:31:49 +03:00
kssenii
7f8f379d7f Parallel & disrtibuted processing for ordered mode 2024-01-24 16:32:15 +01:00
Nikolai Kochetov
eff6232418 Merge branch 'master' into try-to-remove-pk-analysis-on-ast 2024-01-05 10:54:46 +00:00
Nikolai Kochetov
7a271f09ed Check if I can remove KeyCondition analysis on AST. 2024-01-03 17:50:46 +00:00
Nikolai Kochetov
c808b03e55 Remove unneeded code 2024-01-02 17:27:33 +00:00
Nikolai Kochetov
b95bdef09e Update StorageS3 and StorageS3Cluster 2023-12-29 17:41:11 +00:00
avogar
ee7af95bc0 Merge branch 'master' of github.com:ClickHouse/ClickHouse into schema-inference-union 2023-12-08 20:29:28 +00:00
avogar
4d9a1b50f9 Add information about new _size virtual column in file/s3/url/hdfs/azure table functions 2023-11-28 18:15:07 +00:00
Kruglov Pavel
b84e3cf683 Merge branch 'master' into size-virtual-column 2023-11-22 19:25:00 +01:00
avogar
007353a2dd Add _size virtual column to s3/file/hdfs/url/azureBlobStorage engines 2023-11-22 18:12:36 +00:00
vdimir
ffbe85d3a0 Merge pull request #56668 from ClickHouse/vdimir/analyzer_s3_partition_pruning
Analyzer: filtering by virtual columns for StorageS3
2023-11-22 16:44:44 +01:00
vdimir
15234474d7 Implement system table blob_storage_log 2023-11-21 09:18:25 +00:00
vdimir
a915eeded8 StorageS3 use filters from SourceStepWithFilter 2023-11-20 17:59:58 +00:00
vdimir
cbb2e02c03 Analyzer: partition pruning for S3 2023-11-20 17:59:53 +00:00
avogar
f537bad469 Merge branch 'master' of github.com:ClickHouse/ClickHouse into schema-inference-union 2023-11-20 14:32:50 +00:00
Sema Checherinda
f999337dae Revert "Revert "s3 adaptive timeouts"" 2023-11-20 14:53:22 +01:00
Alexander Tokmakov
5031f239c3 Revert "s3 adaptive timeouts" 2023-11-20 14:28:59 +01:00
Sema Checherinda
8d36fd6e54 get rid off of client_with_long_timeout_ptr 2023-11-14 11:34:12 +01:00
Kruglov Pavel
570b66f027 Merge branch 'master' into schema-inference-union 2023-10-26 19:26:00 +02:00
李扬
465962df7f Support orc filter push down (file + stripe + rowgroup level) (#55330)
* support orc filter push down

* update orc lib version

* replace setqueryinfo with setkeycondition

* fix issue https://github.com/ClickHouse/ClickHouse/issues/53536

* refactor source with key condition

* fix building error

* remove std::cout

* update orc

* update orc version

* fix bugs

* improve code

* upgrade orc lib

* fix code style

* change as requested

* add performance tests for orc filter push down

* add performance tests for orc filter push down

* fix all bugs

* fix default as null issue

* add uts for null as default issues

* upgrade orc lib

* fix failed orc lib uts and fix typo

* fix failed uts

* fix failed uts

* fix ast fuzzer tests

* fix bug of uint64 overflow in https://s3.amazonaws.com/clickhouse-test-reports/55330/de22fdcaea2e12c96f300e95f59beba84401712d/fuzzer_astfuzzerubsan/report.html

* fix asan fatal caused by reused column vector batch in native orc input format. refer to https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__asan__[4_4].htm

* fix wrong performance tests

* disable 02892_orc_filter_pushdown on aarch64. https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__aarch64_.html

* add some comments

* add some comments

* inline range::equals and range::less

* fix data race of key condition

* trigger ci
2023-10-24 12:08:17 -07:00
Kruglov Pavel
6f61ccfe28 Merge branch 'master' into schema-inference-union 2023-10-20 22:54:11 +02:00
avogar
6934e27e8b Add union mode for schema inference to infer union schema of files with different schemas 2023-10-20 20:46:41 +00:00
kssenii
4a7922507b Minor changes 2023-09-28 16:18:00 +02:00
kssenii
f753b91a3b Better maintenance of processing node 2023-09-27 17:17:52 +02:00
kssenii
6b191a1afe Better 2023-09-27 14:54:31 +02:00
pufit
4a2f7976f0 Resolve PR issues 2023-09-20 19:43:02 -04:00
pufit
34aecc0bf3 Adjusting num_streams by expected work in StorageS3 2023-09-19 23:05:48 -04:00
avogar
2d8f33bfa2 Fix parsing error in WithNames formats while reading subset of columns with disabled input_format_with_names_use_header 2023-09-11 14:55:37 +00:00
Kruglov Pavel
592fa77987 Merge branch 'master' into cache-count 2023-08-23 15:18:02 +02:00
robot-ch-test-poll1
c22ffa6195 Merge pull request #53529 from Avogar/filter-files-all-table-functions
Use filter by file/path before reading in url/file/hdfs table functins
2023-08-23 14:21:23 +02:00
Kruglov Pavel
c0bdd0e00b Merge branch 'master' into cache-count 2023-08-22 14:42:22 +02:00
avogar
b4145aeddc Cache number of rows in files for count in file/s3/url/hdfs/azure functions 2023-08-22 11:59:59 +00:00
pufit
9d454d9afc Merge branch 'master' into pufit/fix_s3_threads
# Conflicts:
#	src/Storages/StorageS3.cpp
#	src/Storages/StorageS3.h
#	src/Storages/StorageURL.cpp
#	src/Storages/StorageURL.h
2023-08-21 21:32:15 -04:00
pufit
98a701e2c1 Limiting number of parsing threads for S3 source 2023-08-21 21:21:03 -04:00
Michael Kolupaev
2f4d433e69 Parquet filter pushdown 2023-08-21 14:15:52 -07:00
avogar
60b0b88d50 Clean up 2023-08-17 16:59:57 +00:00
avogar
4c32097df3 Use filter by file/path before reading in url/file/hdfs table functions, reduce code duplication 2023-08-17 16:54:43 +00:00
Anton Popov
ff137773e7 Merge branch 'master' into formats-with-subcolumns 2023-08-02 15:24:56 +02:00
kssenii
870a506a0b Some fixes 2023-07-31 20:07:23 +02:00
kssenii
c13fdca23e Merge remote-tracking branch 'upstream/master' into s3queue 2023-07-31 15:32:56 +02:00
Kruglov Pavel
0d34e97dbe Merge branch 'master' into formats-with-subcolumns 2023-07-26 13:30:35 +02:00
Kruglov Pavel
64e88cde21 Merge branch 'master' into better-progress-bar-2 2023-07-18 13:37:53 +02:00
Kruglov Pavel
1dd05319b5 Merge branch 'master' into formats-with-subcolumns 2023-07-17 19:13:42 +02:00
kssenii
7359dd518f Merge remote-tracking branch 'upstream/master' into s3queue 2023-07-17 14:23:12 +02:00
avogar
98aa6b317f Support reading subcolumns from file/s3/hdfs/url/azureBlobStorage table functions 2023-07-04 21:17:26 +00:00
kssenii
5de869760d Merge remote-tracking branch 'upstream/master' into s3queue 2023-06-30 13:56:43 +02:00
avogar
c679dd400e Make better 2023-06-23 13:43:40 +00:00
avogar
cf082f2f9a Use read_bytes/total_bytes_to_read for progress bar in s3/file/url/... table functions 2023-06-22 17:24:43 +00:00
Michael Kolupaev
4a570a05c9 Decrease default timeouts for S3 and HTTP requests 2023-06-21 18:08:50 +00:00