Commit Graph

367 Commits

Author SHA1 Message Date
robot-ch-test-poll1
c22ffa6195
Merge pull request #53529 from Avogar/filter-files-all-table-functions
Use filter by file/path before reading in url/file/hdfs table functions
2023-08-23 14:21:23 +02:00
Kruglov Pavel
e193aec583
Merge branch 'master' into fast-count-from-files 2023-08-23 12:15:34 +02:00
pufit
e42da9411b Fix variables names 2023-08-22 11:23:10 -04:00
Kruglov Pavel
a34c9c3ebc
Fix build 2023-08-22 15:21:54 +02:00
Kruglov Pavel
67c5c0203b
Merge branch 'master' into fast-count-from-files 2023-08-22 15:03:48 +02:00
Kruglov Pavel
c0bdd0e00b
Merge branch 'master' into cache-count 2023-08-22 14:42:22 +02:00
avogar
b4145aeddc Cache number of rows in files for count in file/s3/url/hdfs/azure functions 2023-08-22 11:59:59 +00:00
pufit
9d454d9afc Merge branch 'master' into pufit/fix_s3_threads
# Conflicts:
#	src/Storages/StorageS3.cpp
#	src/Storages/StorageS3.h
#	src/Storages/StorageURL.cpp
#	src/Storages/StorageURL.h
2023-08-21 21:32:15 -04:00
pufit
98a701e2c1 Limiting number of parsing threads for S3 source 2023-08-21 21:21:03 -04:00
Michael Kolupaev
2f4d433e69 Parquet filter pushdown 2023-08-21 14:15:52 -07:00
avogar
47304bf7aa Optimize count from files in most input formats 2023-08-21 12:30:52 +00:00
avogar
8e445b5461 Fixes 2023-08-18 17:49:40 +00:00
avogar
4c32097df3 Use filter by file/path before reading in url/file/hdfs table functions, reduce code duplication 2023-08-17 16:54:43 +00:00
Anton Popov
ff137773e7
Merge branch 'master' into formats-with-subcolumns 2023-08-02 15:24:56 +02:00
avogar
b5fc34b770 Rename setting disable_url_encoding to enable_url_encoding and add a test 2023-07-27 12:20:33 +00:00
Kruglov Pavel
0d34e97dbe
Merge branch 'master' into formats-with-subcolumns 2023-07-26 13:30:35 +02:00
Kruglov Pavel
15cc046883
Merge branch 'master' into better-progress-bar-2 2023-07-26 13:12:24 +02:00
Alexey Milovidov
168b84a592
Merge pull request #52337 from Avogar/no-decode-url
Allow to disable decoding/encoding path in uri in URL engine
2023-07-25 05:43:06 +03:00
Kruglov Pavel
fec5675cd4
Merge branch 'master' into better-progress-bar-2 2023-07-24 19:59:38 +02:00
avogar
fe934d3059 Make better 2023-07-20 12:38:41 +00:00
avogar
2b8e4ebd4c Allow to disable decoding/encoding path in uri in URL engine 2023-07-19 19:48:39 +00:00
avogar
8d634c992b Fix tests 2023-07-06 17:47:01 +00:00
avogar
d11cd0dc30 Fix tests 2023-07-05 17:56:03 +00:00
Kruglov Pavel
a2805f8f44
Merge branch 'master' into formats-with-subcolumns 2023-07-04 23:27:03 +02:00
avogar
98aa6b317f Support reading subcolumns from file/s3/hdfs/url/azureBlobStorage table functions 2023-07-04 21:17:26 +00:00
Nikolay Degterinsky
8dfa773f44
Merge branch 'master' into headers-blacklist 2023-06-30 23:40:17 +02:00
avogar
4eeb431003 Merge branch 'master' of github.com:ClickHouse/ClickHouse into better-progress-bar-2 2023-06-28 18:53:08 +00:00
Alexey Milovidov
b8e6bd3299
Merge branch 'master' into refactor-subqueries-for-in 2023-06-26 06:05:12 +03:00
avogar
c679dd400e Make better 2023-06-23 13:43:40 +00:00
Sema Checherinda
977cd03cf2
Merge branch 'master' into memory-leak 2023-06-23 15:35:53 +02:00
avogar
24fab7bfde Remove old includes 2023-06-22 18:48:15 +00:00
avogar
cf082f2f9a Use read_bytes/total_bytes_to_read for progress bar in s3/file/url/... table functions 2023-06-22 17:24:43 +00:00
Michael Kolupaev
2498170253 Fix use-after-free in StorageURL when switching URLs 2023-06-22 16:24:12 +00:00
Sema Checherinda
d0bb985061 fix other classes based on SinkToStorage 2023-06-22 14:33:25 +02:00
Sema Checherinda
95349a405b release buffers with exception context 2023-06-22 13:00:13 +02:00
Nikolai Kochetov
a940031878 Merge branch 'master' into refactor-subqueries-for-in 2023-06-22 12:18:48 +02:00
Nikolai Kochetov
afa74f697c Refactor a bit. 2023-06-16 19:38:50 +00:00
avogar
3209ebe34b Improve progress bar for file/s3/hdfs/url table functions. Step 1 2023-06-16 15:51:18 +00:00
Nikolay Degterinsky
9a25958be8 Add HTTP header filtering 2023-06-15 13:49:49 +00:00
avogar
870f3d1270 Fix comments 2023-06-15 12:59:46 +00:00
avogar
2e1f56ae33 Address comments 2023-06-13 14:43:50 +00:00
Kruglov Pavel
bf28074d32
Merge branch 'master' into allow-skip-empty-files 2023-06-08 12:36:18 +02:00
Antonio Andelic
b11f744252
Correctly disable async insert with deduplication when it's not needed (#50663)
* Correctly disable async insert when it's not used

* Better

* Add comment

* Better

* Fix tests

---------

Co-authored-by: Nikita Mikhaylov <mikhaylovnikitka@gmail.com>
2023-06-07 20:33:08 +02:00
Michael Kolupaev
b51064a508 Get rid of SeekableReadBufferFactory, add SeekableReadBuffer::readBigAt() instead 2023-06-01 18:48:30 -07:00
avogar
0b62be649f Add docs, fix style 2023-05-31 17:52:29 +00:00
Kruglov Pavel
0beca0336d
Merge pull request #49112 from ClickHouse/Avogar-patch-3
Fix possible terminate called for uncaught exception in some places
2023-05-31 16:55:43 +02:00
avogar
d4efbbfbd3 Allow to skip empty files in file/s3/url/hdfs table functions 2023-05-30 19:32:24 +00:00
avogar
88e4c93abc Merge branch 'master' of github.com:ClickHouse/ClickHouse into urlCluster 2023-05-22 19:19:57 +00:00
Nikolay Degterinsky
d4b89cb643
Merge pull request #49356 from Ziy1-Tan/vcol
Support for `_path` and `_file` virtual columns for table function `url`.
2023-05-22 18:10:32 +02:00
avogar
3ee8de792c Merge branch 'master' of github.com:ClickHouse/ClickHouse into urlCluster 2023-05-11 12:46:20 +00:00
Michael Kolupaev
3bd1489f18 Propagate input_format_parquet_preserve_order to parallelizeOutputAfterReading() 2023-05-05 04:20:27 +00:00
Michael Kolupaev
eb3b774ad0 Better control over Parquet row group size 2023-05-04 14:59:55 -07:00
Ziy1-Tan
1bb0d1519e Fix style
Signed-off-by: Ziy1-Tan <ajb459684460@gmail.com>
2023-05-02 16:54:14 +08:00
Ziy1-Tan
c93ceedbef Fix style
Signed-off-by: Ziy1-Tan <ajb459684460@gmail.com>
2023-05-02 10:38:37 +08:00
Ziy1-Tan
2c159061ed Support _path and _file virtual columns for table function url. 2023-05-01 21:40:30 +08:00
Kruglov Pavel
75a3b6c322
Fix build 2023-04-24 21:08:53 +02:00
Kruglov Pavel
8ff864cd8b
Fix 2023-04-24 19:12:50 +02:00
avogar
c503f6532c Add more finalize() to avoid terminate 2023-04-24 15:11:36 +00:00
avogar
0097230611 Better 2023-04-21 17:35:17 +00:00
avogar
0805b517ee Fix parsing failover options 2023-04-21 17:28:14 +00:00
avogar
944f54aadf Finish urlCluster, refactor code, reduce code duplication 2023-04-21 17:24:37 +00:00
avogar
c949f0ebf5 Merge branch 'master' of github.com:ClickHouse/ClickHouse into urlCluster 2023-04-21 14:13:33 +02:00
avogar
86686fbbc3 Fix conflicts 2023-04-21 14:11:18 +02:00
Michael Kolupaev
87be78e6de Better 2023-04-17 04:58:32 +00:00
Michael Kolupaev
e133633359 Parallel decoding with one row group per thread 2023-04-17 04:58:32 +00:00
Michael Kolupaev
683077890f Highly questionable refactoring (getInputMultistream() nonsense) 2023-04-17 04:58:32 +00:00
Michael Kolupaev
2d4fe85513 Something 2023-04-17 04:58:32 +00:00
kssenii
bb0beb7449 Merge remote-tracking branch 'upstream/master' into named-collections-finish 2023-03-17 13:02:36 +01:00
Antonio Andelic
a70ca31884 Merge branch 'master' into fix-url-progress-bar 2023-03-09 10:17:33 +00:00
kssenii
8f2d75cef8 Fix tests 2023-03-05 12:56:00 +01:00
Konstantin Bogdanov
1bbf5acd47
Pass headers from StorageURL to WriteBufferFromHTTP (#46996)
* Pass headers from StorageURL to WriteBufferFromHTTP

* Add a test

* Lint

* `time.sleep(1)`

* Start echo server earlier

* Add proper handling for mock server start

* Automatic style fix

---------

Co-authored-by: robot-clickhouse <robot-clickhouse@users.noreply.github.com>
2023-03-03 13:55:52 +01:00
Antonio Andelic
f540f7f6f9 Fix some tests 2023-03-01 12:45:00 +00:00
Antonio Andelic
45dc5dc25d No progress bar if no size 2023-02-28 15:25:28 +00:00
Antonio Andelic
56a126f7af Fix progress bar with URL 2023-02-24 14:49:14 +00:00
Robert Schulze
10af0b3e49
Reduce redundancies 2023-02-07 12:27:23 +00:00
Robert Schulze
84b9ff450f
Fix terribly broken, fragile and potentially cyclic linking
Sorry for the clickbaity title. This is about static method
ConnectionTimeouts::getHTTPTimeouts(). It was declared in header
IO/ConnectionTimeouts.h, and defined in header
IO/ConnectionTimeoutsContext.h (!). This is weird and caused issues with
linking on s390x (#45520). There was an attempt to fix some
inconsistencies (#45848) but neither @Algunenano nor I at first
really understood why the definition is in the header.

Turns out that ConnectionTimeoutsContext.h is only #include'd from
source files which are part of the normal server build BUT NOT part of
the keeper standalone build (which must be enabled via CMake
-DBUILD_STANDALONE_KEEPER=1). This dependency was not documented and as
a result, some misguided workarounds were introduced earlier, e.g.
0341c6c54b

The deeper cause was that getHTTPTimeouts() is passed a "Context". This
class is part of the "dbms" library which is deliberately not linked by
the standalone build of clickhouse-keeper. The context is only used to
read the settings and the "Settings" class is part of the
clickhouse_common library which is linked by clickhouse-keeper already.

To resolve this mess, this PR

- creates source file IO/ConnectionTimeouts.cpp and moves all
  ConnectionTimeouts definitions into it, including getHTTPTimeouts().

- breaks the wrong dependency by passing "Settings" instead of "Context"
  into getHTTPTimeouts().

- resolves the previous hacks
2023-02-05 20:49:34 +00:00
attack204
7ed6bad097
Merge branch 'master' into urlCluster 2023-02-02 21:12:20 +08:00
Anton Popov
5c0307bc6a fix race in StorageURL and StorageHDFS 2023-01-24 12:34:43 +00:00
attack204
1f4139718a fix:style 2023-01-19 16:19:39 +08:00
attack204
f549380867 fix:style 2023-01-19 16:10:59 +08:00
attack204
7bd0010c50 fix conflict 2023-01-19 13:18:07 +08:00
attack204
31fa490822 fix conflict 2023-01-19 10:45:14 +08:00
attack204
e312cfa794 feature:urlCluster 2023-01-19 10:19:04 +08:00
Alexander Tokmakov
2d7773fccc Merge branch 'master' into text_log_add_pattern 2023-01-13 20:33:46 +01:00
kssenii
fcb042d80c Fix clang-tidy 2023-01-05 12:04:07 +01:00
kssenii
67509aa2d5 Merge remote-tracking branch 'upstream/master' into use-new-named-collections-code-2 2023-01-03 16:41:30 +01:00
Kruglov Pavel
966f57ef68
Merge pull request #42777 from Avogar/improve-streaming-engines
Refactor and Improve streaming engines Kafka/RabbitMQ/NATS and data formats
2023-01-02 15:59:06 +01:00
Alexander Tokmakov
ca989e9212 less runtime format strings 2022-12-23 19:50:34 +01:00
kssenii
8d9bf77588 Fix tests 2022-12-19 13:56:23 +01:00
kssenii
326747b555 Fix tests 2022-12-19 12:16:50 +01:00
kssenii
0c63ce9731 Remove some more old code 2022-12-17 00:34:29 +01:00
kssenii
30547d2dcd Replace old named collections code for url 2022-12-17 00:24:05 +01:00
Kruglov Pavel
c5b2e4cc23
Merge branch 'master' into improve-streaming-engines 2022-12-15 18:44:35 +01:00
Alexey Milovidov
127631ee47
Merge branch 'master' into perf_experiment 2022-11-12 18:58:25 +01:00
Kseniia Sumarokova
ed44b20694
Merge pull request #42224 from kssenii/fit-http-buffer-retries
Fix http buffer retries
2022-11-04 11:50:17 +01:00
Kruglov Pavel
b124875257
Merge branch 'master' into improve-streaming-engines 2022-11-03 13:22:06 +01:00
kssenii
4fe4a07600 Add test 2022-11-03 12:14:08 +01:00
avogar
8e13d1f1ec Improve and refactor Kafka/RabbitMQ/NATS and data formats 2022-10-28 16:41:10 +00:00
Raúl Marín
6e0a9452e7 Merge remote-tracking branch 'blessed/master' into perf_experiment 2022-10-25 15:25:06 +02:00
Azat Khuzhin
4e76629aaf Fixes for -Wshorten-64-to-32
- lots of static_cast
- add safe_cast
- types adjustments
  - config
  - IStorage::read/watch
  - ...
- some TODO's (to convert types in future)

P.S. That was quite a journey...

v2: fixes after rebase
v3: fix conflicts after #42308 merged
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-21 13:25:19 +02:00