Commit Graph

543 Commits

Author SHA1 Message Date
Sema Checherinda
a950595c24
Merge pull request #56314 from CheSema/s3-aggressive-timeouts
s3 adaptive timeouts
2023-11-19 14:12:14 +01:00
Alexey Milovidov
d56cbda185 Add metrics for the number of queued jobs, which is useful for the IO thread pool 2023-11-18 19:07:59 +01:00
Sema Checherinda
8d36fd6e54 get rid of client_with_long_timeout_ptr 2023-11-14 11:34:12 +01:00
Sema Checherinda
27fb25d056 alter the naming, fix client_with_long_timeout in s3 storage 2023-11-14 11:34:12 +01:00
Alexey Milovidov
8c253b9e3e Remove C++ templates 2023-11-10 05:25:02 +01:00
李扬
465962df7f
Support orc filter push down (file + stripe + rowgroup level) (#55330)
* support orc filter push down

* update orc lib version

* replace setqueryinfo with setkeycondition

* fix issue https://github.com/ClickHouse/ClickHouse/issues/53536

* refactor source with key condition

* fix building error

* remove std::cout

* update orc

* update orc version

* fix bugs

* improve code

* upgrade orc lib

* fix code style

* change as requested

* add performance tests for orc filter push down

* add performance tests for orc filter push down

* fix all bugs

* fix default as null issue

* add uts for null as default issues

* upgrade orc lib

* fix failed orc lib uts and fix typo

* fix failed uts

* fix failed uts

* fix ast fuzzer tests

* fix bug of uint64 overflow in https://s3.amazonaws.com/clickhouse-test-reports/55330/de22fdcaea2e12c96f300e95f59beba84401712d/fuzzer_astfuzzerubsan/report.html

* fix asan fatal caused by reused column vector batch in native orc input format. refer to https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__asan__[4_4].htm

* fix wrong performance tests

* disable 02892_orc_filter_pushdown on aarch64. https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__aarch64_.html

* add some comments

* add some comments

* inline range::equals and range::less

* fix data race of key condition

* trigger ci
2023-10-24 12:08:17 -07:00
kssenii
42ed249954 Fix build 2023-10-17 12:03:49 +02:00
kssenii
4464c86895 Merge remote-tracking branch 'origin/master' into s3-queue-fixes 2023-10-17 11:16:52 +02:00
Michael Kolupaev
ce7eca0615
DWARF input format (#55450)
* Add ReadBufferFromFileBase::isRegularLocalFile()

* DWARF input format

* Review comments

* Changed things around ENABLE_EMBEDDED_COMPILER build setting

* Added 'ranges' column

* no-msan no-ubsan
2023-10-16 17:00:07 -07:00
Kseniia Sumarokova
96c518be5b
Merge branch 'master' into s3-queue-fixes 2023-10-16 22:19:13 +02:00
Kruglov Pavel
836e35b6c4
Fix progress bar for s3 and azure Cluster functions with url without globs 2023-10-16 12:38:10 +02:00
kssenii
d64b990712 Merge remote-tracking branch 'origin/master' into s3-queue-fixes 2023-10-13 12:13:56 +02:00
kssenii
d644992192 Fix 2023-09-28 16:25:04 +02:00
kssenii
f753b91a3b Better maintenance of processing node 2023-09-27 17:17:52 +02:00
kssenii
6b191a1afe Better 2023-09-27 14:54:31 +02:00
Robert Schulze
9fff447716
Re-enable clang-tidy checks 2023-09-26 09:34:12 +00:00
robot-ch-test-poll2
3c93a939a2
Merge pull request #54936 from ClickHouse/pufit/s3-yet-another-num-streams-adjustment
Set a minimum limit of `num_streams` in StorageS3
2023-09-23 02:59:50 +02:00
robot-clickhouse-ci-2
d98234dc9d
Merge pull request #54803 from Avogar/ephemeral-columns-from-files
Forbid special columns for file/s3/url/... storages, fix insert into ephemeral columns from files
2023-09-22 23:24:42 +02:00
pufit
13eee0e950 Set a minimum limit of num_streams in StorageS3 2023-09-22 14:13:20 -04:00
pufit
99c1d76604 Fix division by zero in StorageS3 2023-09-21 15:55:42 -04:00
Michael Kolupaev
9af9b4a085
Enable connection pooling for s3 table function (#54812)
Enable connection pooling for s3 table function
2023-09-21 09:27:20 -07:00
pufit
bd387f6d2c
Merge pull request #54815 from ClickHouse/pufit/optimal-num-streams-for-s3-storage
Adjusting `num_streams` by expected work in StorageS3
2023-09-21 08:41:39 -04:00
pufit
4a2f7976f0 Resolve PR issues 2023-09-20 19:43:02 -04:00
pufit
20105958a8 add reserve 2023-09-20 13:37:06 -04:00
pufit
71c7e3c81e Add logging, fix thread name length 2023-09-20 13:33:25 -04:00
avogar
3e08800cb5 Forbid special columns for file/s3/url/... storages, fix insert into ephemeral columns from files 2023-09-20 16:25:55 +00:00
pufit
729c8aa29f fix glob iterator estimated objects 2023-09-20 10:41:47 -04:00
pufit
34aecc0bf3 Adjusting num_streams by expected work in StorageS3 2023-09-19 23:05:48 -04:00
Sema Checherinda
e7550523c8
Merge pull request #54651 from CheSema/limit_backoff_timeout
limit the delay before next try in S3
2023-09-18 19:41:26 +02:00
Robert Schulze
f5e8028bb1
Merge pull request #54642 from rschu1ze/broken-re2st
Remove broken lockless variant of re2
2023-09-17 15:30:57 +02:00
Sema Checherinda
d9e15c00c9 limit the delay before next try in S3 2023-09-14 19:45:07 +02:00
Robert Schulze
7b378dbad3
Remove broken lockless variant of re2 2023-09-14 16:40:42 +00:00
avogar
2d8f33bfa2 Fix parsing error in WithNames formats while reading subset of columns with disabled input_format_with_names_use_header 2023-09-11 14:55:37 +00:00
Kruglov Pavel
44db5fa992
Merge branch 'master' into cache-count 2023-08-24 17:21:18 +02:00
Arthur Passos
2bade7db08
Add global proxy setting (#51749)
* initial impl

* fix env ut

* move ut directory

* make sure no null proxy resolver is returned by ProxyConfigurationResolverProvider

* minor adjustment

* add a few tests, still incomplete

* add proxy support for url table function

* use proxy for select from url as well

* remove optional from return type, just returns empty config

* fix style

* style

* black

* ohg boy

* rm in progress file

* god pls don't let me kill anyone

* ...

* add use_aws guards

* remove hard coded s3 proxy resolver

* add concurrency-mt-unsafe

* aa

* black

* add logging back

* revert change

* improve code a bit

* helper functions and separate tests

* for some reason, this env test is not working..

* formatting

* :)

* clangtidy

* lint

* revert some stupid things

* small test adjustments

* simplify tests

* rename test

* remove extra line

* freaking style change

* simplify a bit

* fix segfault & remove an extra call

* tightly couple proxy provider with context..

* remove useless include

* rename config prefix parameter

* simplify provider a bit

* organize provider a bit

* add a few comments

* comment out proxy env tests

* fix nullptr in unit tests

* make sure old storage proxy config is properly covered without global context instance

* move a few functions from class to anonymous namespace

* fix no fallback for specific storage conf

* change API to accept http method instead of bool

* implement http/https distinction in listresolver, any still not implemented

* implement http/https distinction in remote resolver

* progress on code, improve tests and add url function working test

* use protocol instead of method for http and https

* small fix

* few more adjustments

* fix style

* black

* move enum to proxyconfiguration

* wip

* fix build

* fix ut

* delete atomicroundrobin class

* remove stale include

* add some tests.. need to spend some more time on the design..

* change design a bit

* progress

* use existing context for tests

* rename aux function and fix ut

* ..

* rename test

* try to simplify tests a bit

* simplify tests a bit more

* attempt to fix tests, accept more than one remote resolver

* use proper log id

* try waiting for resolver

* proper wait logic

* black

* empty

* address a few comments

* refactor tests

* remove old tests

* black

* use RAII to set/unset env

* black

* clang tidy

* fix env proxy not respecting any

* use log trace

* fix wrong logic in getRemoteREsolver

* fix wrong logic in getRemoteREsolver

* fix test

* remove unwanted code

* remove ClientConfigurationperRequest and auxiliary classes

* remove unwanted code

* remove adapter test

* few adjustments and add test for s3 storage conf with new proxy settings

* black

* use chassert for context

* Add getenv comment
2023-08-24 16:07:26 +03:00
Kruglov Pavel
f7e1abd774
Merge branch 'master' into cache-count 2023-08-23 22:31:49 +02:00
Kruglov Pavel
592fa77987
Merge branch 'master' into cache-count 2023-08-23 15:18:02 +02:00
Kruglov Pavel
7e362a2110
Merge branch 'master' into fast-count-from-files 2023-08-23 15:13:20 +02:00
robot-ch-test-poll1
c22ffa6195
Merge pull request #53529 from Avogar/filter-files-all-table-functions
Use filter by file/path before reading in url/file/hdfs table functions
2023-08-23 14:21:23 +02:00
Kruglov Pavel
e193aec583
Merge branch 'master' into fast-count-from-files 2023-08-23 12:15:34 +02:00
pufit
e42da9411b Fix variables names 2023-08-22 11:23:10 -04:00
Kruglov Pavel
de960f3c35
Fix style 2023-08-22 15:22:17 +02:00
Kruglov Pavel
67c5c0203b
Merge branch 'master' into fast-count-from-files 2023-08-22 15:03:48 +02:00
avogar
7f9e81d504 Clean up 2023-08-22 12:55:00 +00:00
Kruglov Pavel
c0bdd0e00b
Merge branch 'master' into cache-count 2023-08-22 14:42:22 +02:00
avogar
b4145aeddc Cache number of rows in files for count in file/s3/url/hdfs/azure functions 2023-08-22 11:59:59 +00:00
pufit
9d454d9afc Merge branch 'master' into pufit/fix_s3_threads
# Conflicts:
#	src/Storages/StorageS3.cpp
#	src/Storages/StorageS3.h
#	src/Storages/StorageURL.cpp
#	src/Storages/StorageURL.h
2023-08-21 21:32:15 -04:00
pufit
98a701e2c1 Limiting number of parsing threads for S3 source 2023-08-21 21:21:03 -04:00
Michael Kolupaev
2f4d433e69 Parquet filter pushdown 2023-08-21 14:15:52 -07:00
avogar
47304bf7aa Optimize count from files in most input formats 2023-08-21 12:30:52 +00:00