Kruglov Pavel
46a6b84a5a
Merge branch 'master' into auto-format-detection
2024-01-25 22:11:07 +01:00
Maksim Kita
2a327107b6
Updated implementation
2024-01-25 14:31:49 +03:00
avogar
617cc514b7
Try to detect file format automatically during schema inference if it's unknown
2024-01-23 18:59:39 +00:00
Nikolai Kochetov
8936c8376a
Use predicate in getTaskIteratorExtension.
2024-01-02 17:14:16 +00:00
avogar
007353a2dd
Add _size virtual column to s3/file/hdfs/url/azureBlobStorage engines
2023-11-22 18:12:36 +00:00
kssenii
d644992192
Fix
2023-09-28 16:25:04 +02:00
kssenii
1749874e7b
Fix
2023-09-28 13:51:07 +02:00
kssenii
6b191a1afe
Better
2023-09-27 14:54:31 +02:00
avogar
4c32097df3
Use filter by file/path before reading in url/file/hdfs table functions, reduce code duplication
2023-08-17 16:54:43 +00:00
Kruglov Pavel
fec5675cd4
Merge branch 'master' into better-progress-bar-2
2023-07-24 19:59:38 +02:00
avogar
cf082f2f9a
Use read_bytes/total_bytes_to_read for progress bar in s3/file/url/... table functions
2023-06-22 17:24:43 +00:00
Nikolay Degterinsky
9a25958be8
Add HTTP header filtering
2023-06-15 13:49:49 +00:00
avogar
334f062fa0
fix style
2023-05-15 16:39:26 +00:00
avogar
70a8fd2c50
Fix schema inference with named collection, refactor Cluster table functions
2023-05-12 13:58:45 +00:00
avogar
2949ceced1
Fix adding structure to cluster table functions, make it better
2023-04-24 13:20:04 +00:00
avogar
447189a6ca
Better
2023-04-21 17:54:09 +00:00
avogar
944f54aadf
Finish urlCluster, refactor code, reduce code duplication
2023-04-21 17:24:37 +00:00
Kruglov Pavel
2ad161d2b7
Merge branch 'master' into non-blocking-connect
2023-04-19 13:39:40 +02:00
kssenii
13f29a7242
Better
2023-03-28 18:57:24 +02:00
kssenii
36cc6fee51
Rewrite data lakes (part 1)
2023-03-24 22:35:12 +01:00
Kruglov Pavel
f3f93dd06c
Merge branch 'master' into non-blocking-connect
2023-03-24 15:59:40 +01:00
Amos Bird
02c5d1f364
Correct exact_rows_before_limit in all scenarios
2023-03-22 23:26:31 +08:00
avogar
38e44861ae
Fix possible race conditions
2023-03-21 16:01:54 +00:00
Alexander Tokmakov
ed08f8f5c5
Merge branch 'master' into revert_25674
2023-03-12 02:33:25 +03:00
Alexander Tokmakov
7b1b238d0b
Revert "Merge pull request #25674 from amosbird/distributedreturnconnection"
...
This reverts commit 5ffd99dfd4, reversing
changes made to 2796aa333f.
2023-03-11 19:09:47 +01:00
Maksim Kita
c835fa3958
Fixed tests
2023-03-11 11:51:54 +01:00
Maksim Kita
0358cb36d8
Fixed tests
2023-03-11 11:51:54 +01:00
flynn
b3a9468661
fix
2023-02-17 12:42:24 +00:00
Kruglov Pavel
4f380370a9
Fix s3Cluster schema inference in parallel distributed insert select (#46381)
...
* Fix s3Cluster schema inference in parallel distributed insert select
* Try fix flaky test
* Try SYSTEM SYNC REPLICA to avoid test flakiness
2023-02-15 15:30:43 +01:00
Robert Schulze
6ff232d782
Merge branch 'master' into rs/fix-fragile-linking
2023-02-08 12:51:12 +01:00
kssenii
ab0dedf0c8
Simplify code around storage s3 configuration
2023-02-06 16:23:17 +01:00
Robert Schulze
84b9ff450f
Fix terribly broken, fragile and potentially cyclic linking
...
Sorry for the clickbaity title. This is about static method
ConnectionTimeouts::getHTTPTimeouts(). It was declared in header
IO/ConnectionTimeouts.h, and defined in header
IO/ConnectionTimeoutsContext.h (!). This is weird and caused issues with
linking on s390x (#45520). There was an attempt to fix some
inconsistencies (#45848) but neither @Algunenano nor I at first
really understood why the definition is in the header.
Turns out that ConnectionTimeoutsContext.h is only #include'd from
source files which are part of the normal server build BUT NOT part of
the keeper standalone build (which must be enabled via CMake
-DBUILD_STANDALONE_KEEPER=1). This dependency was not documented and as
a result, some misguided workarounds were introduced earlier, e.g.
0341c6c54b.
The deeper cause was that getHTTPTimeouts() is passed a "Context". This
class is part of the "dbms" library which is deliberately not linked by
the standalone build of clickhouse-keeper. The context is only used to
read the settings and the "Settings" class is part of the
clickhouse_common library which is linked by clickhouse-keeper already.
To resolve this mess, this PR
- creates source file IO/ConnectionTimeouts.cpp and moves all
ConnectionTimeouts definitions into it, including getHTTPTimeouts().
- breaks the wrong dependency by passing "Settings" instead of "Context"
into getHTTPTimeouts().
- resolves the previous hacks
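The dependency change described above can be sketched roughly as follows. This is a simplified illustration, not the actual ClickHouse code: the `Settings`, `Context`, and `ConnectionTimeouts` structs here are hypothetical stand-ins, and the field names and default values are assumptions chosen for the example. The point is only that the function is declared in the header, defined in a source file, and needs nothing heavier than `Settings`.

```cpp
#include <chrono>

// Hypothetical stand-in for the lightweight settings class
// (in the commit, this lives in a library the keeper build already links).
struct Settings
{
    // Field names and defaults are assumptions for illustration only.
    std::chrono::seconds http_connection_timeout{1};
    std::chrono::seconds http_send_timeout{30};
    std::chrono::seconds http_receive_timeout{30};
};

// Would live in the header: declaration only, no dependency on Context.
struct ConnectionTimeouts
{
    std::chrono::seconds connection_timeout;
    std::chrono::seconds send_timeout;
    std::chrono::seconds receive_timeout;

    static ConnectionTimeouts getHTTPTimeouts(const Settings & settings);
};

// Would live in the matching source file: the single definition.
ConnectionTimeouts ConnectionTimeouts::getHTTPTimeouts(const Settings & settings)
{
    return {settings.http_connection_timeout,
            settings.http_send_timeout,
            settings.http_receive_timeout};
}

// Hypothetical stand-in for the heavyweight server-side class.
// Callers that hold a Context unwrap its settings before the call,
// so the timeouts code never needs to see Context itself.
struct Context
{
    Settings settings;
    const Settings & getSettingsRef() const { return settings; }
};
```

A server-side caller would then write something like `ConnectionTimeouts::getHTTPTimeouts(context.getSettingsRef())`, while a standalone build can pass a `Settings` object directly.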
2023-02-05 20:49:34 +00:00
Antonio Andelic
d5117f2aa6
Define S3 client with bucket and endpoint resolution (#45783)
...
* Update aws
* Define S3 client with bucket and endpoint resolution
* Add defines for ErrorCodes
* Use S3Client everywhere
* Remove unused errorcode
* Add DROP S3 CLIENT CACHE query
* Add a comment
* Fix style
* Update aws
* Update reference files
* Add missing include
* Fix unit test
* Remove unneeded declarations
* Correctly use RetryStrategy
* Rename S3Client to Client
* Fix retry count
* fix clang-tidy warnings
2023-02-03 14:30:52 +01:00
Raúl Marín
7c31cb7adc
Proper includes for ConnectionTimeoutsContext.h
2023-01-31 16:11:32 +01:00
avogar
117ec13c9e
Fix s3Cluster schema inference when structure from insertion table is used
2023-01-18 20:33:50 +00:00
Nikita Mikhaylov
857799fbca
Parallel distributed insert select with s3Cluster [3] (#44955)
...
* Revert "Revert "Resurrect parallel distributed insert select with s3Cluster (#41535)""
This reverts commit b8d9066004.
* Fix build
* Better
* Fix test
* Automatic style fix
Co-authored-by: robot-clickhouse <robot-clickhouse@users.noreply.github.com>
2023-01-09 13:30:32 +01:00
Anton Popov
6cd606ffeb
better saving of object info in iterator
2022-12-13 17:18:17 +00:00
Anton Popov
0c87031e80
Merge remote-tracking branch 'upstream/master' into HEAD
2022-12-13 16:33:21 +00:00
kssenii
88523ef0b6
Fix
2022-12-07 11:22:48 +01:00
kssenii
c7429d19e7
Merge remote-tracking branch 'upstream/master' into fix-progress-from-s3
2022-12-05 18:32:47 +01:00
chen
b6eddbac0d
fix s3Cluster function returns NOT_FOUND_COLUMN_IN_BLOCK error (#43629)
...
* fix s3Cluster function returns NOT_FOUND_COLUMN_IN_BLOCK error
* Update StorageS3Cluster.cpp
* Update 01801_s3_cluster_count.sql
* fix
2022-12-02 15:43:29 +01:00
Anton Popov
65a78bcd91
improve performance of storage S3
2022-11-26 15:24:01 +00:00
kssenii
5e01441f61
Show progress bar while reading from s3 table function
2022-11-21 17:56:02 +01:00
Sergei Trifonov
f2f0676bcc
Revert "Revert "S3 request per second rate throttling""
2022-11-17 17:35:04 +01:00
Alexander Tokmakov
9011a18234
Revert "S3 request per second rate throttling"
2022-11-16 22:33:48 +03:00
Kseniia Sumarokova
59cf5def67
Merge branch 'master' into disk-s3-throttler
2022-11-15 12:13:37 +01:00
xiedeyantu
5504f3af9b
fix skip_unavailable_shards does not work using s3Cluster table function
2022-11-12 00:03:36 +08:00
serxa
6d5d9ff421
rename ReadWriteSettings -> RequestSettings
2022-11-08 13:48:23 +00:00
Kruglov Pavel
21d50f76ea
Merge pull request #41979 from Avogar/s3-cluster-schema-inference
...
Fix schema inference in s3Cluster and improve in hdfsCluster
2022-11-01 14:00:21 +01:00
Azat Khuzhin
4e76629aaf
Fixes for -Wshorten-64-to-32
...
- lots of static_cast
- add safe_cast
- types adjustments
- config
- IStorage::read/watch
- ...
- some TODO's (to convert types in future)
P.S. That was quite a journey...
v2: fixes after rebase
v3: fix conflicts after #42308 merged
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-21 13:25:19 +02:00