595 Commits

Author SHA1 Message Date
kssenii
3e42ee7f2b Get rid of finalize callback in object storages 2023-05-19 17:29:37 +02:00
Sema Checherinda
7fbf87be17 rework WriteBufferFromS3, squashed 2023-05-10 18:31:47 +00:00
Kseniia Sumarokova
336bb41c5d Merge branch 'master' into remove-dependency-from-context 2023-05-08 12:46:10 +02:00
Michael Kolupaev
3bd1489f18 Propagate input_format_parquet_preserve_order to parallelizeOutputAfterReading() 2023-05-05 04:20:27 +00:00
Michael Kolupaev
eb3b774ad0 Better control over Parquet row group size 2023-05-04 14:59:55 -07:00
kssenii
1433f5ffc9 Merge remote-tracking branch 'upstream/master' into remove-dependency-from-context 2023-05-04 13:24:02 +02:00
Antonio Andelic
a68a023ca7 Merge pull request #48724 from johanngan/sse-kms
Support SSE-KMS configuration with S3 client
2023-05-04 13:20:54 +02:00
alesapin
412b161104 Merge pull request #48791 from kssenii/better-local-object-storage
Make local object storage work consistently with s3 object storage, fix problem with append, make it configurable as independent storage
2023-05-04 11:47:43 +02:00
johanngan
731823b873 Add support for SSE-KMS configuration with S3
https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingKMSEncryption.html

Similar to the server_side_encryption_customer_key_base64 option for
configuring SSE-C with S3, add the following settings to configure
SSE-KMS on a per-endpoint/disk basis:
  - server_side_encryption_kms_key_id
  - server_side_encryption_kms_encryption_context
  - server_side_encryption_kms_bucket_key_enabled
2023-05-03 21:35:38 -05:00
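As a rough illustration of the three SSE-KMS settings listed in the commit above, the sketch below maps them onto the request headers that the AWS S3 documentation defines for SSE-KMS. The helper name `sse_kms_headers`, its parameters, and the sample key alias are hypothetical. How ClickHouse itself passes these settings to the S3 client is not shown in this log, so this is a sketch under assumptions rather than the actual implementation.

```python
import base64
import json


def sse_kms_headers(kms_key_id=None, encryption_context=None, bucket_key_enabled=None):
    """Build S3 SSE-KMS request headers from per-endpoint settings (illustrative only)."""
    # Requesting SSE-KMS at all is signalled by this header value.
    headers = {"x-amz-server-side-encryption": "aws:kms"}
    if kms_key_id:
        # Corresponds to server_side_encryption_kms_key_id (assumed mapping).
        headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
    if encryption_context is not None:
        # AWS expects base64-encoded UTF-8 JSON for the encryption context.
        raw = json.dumps(encryption_context).encode("utf-8")
        headers["x-amz-server-side-encryption-context"] = base64.b64encode(raw).decode("ascii")
    if bucket_key_enabled is not None:
        # Corresponds to server_side_encryption_kms_bucket_key_enabled (assumed mapping).
        headers["x-amz-server-side-encryption-bucket-key-enabled"] = (
            "true" if bucket_key_enabled else "false"
        )
    return headers


# Hypothetical values, for illustration only.
headers = sse_kms_headers("alias/example-key", {"project": "demo"}, True)
```

Each setting is optional in this sketch, so an endpoint that sets none of them would send only the `aws:kms` marker and rely on the bucket's default key.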
Nikita Mikhaylov
954e3b724c Speedup outdated parts loading (#49317) 2023-05-03 18:56:45 +02:00
kssenii
189f276ff5 Fix 2023-05-03 13:16:08 +02:00
kssenii
ecfbf1e304 Remove dependency from DB::Context in readers 2023-05-02 21:45:27 +02:00
Kseniia Sumarokova
45e2d296f9 Merge branch 'master' into better-local-object-storage 2023-04-27 14:54:04 +02:00
avogar
c503f6532c Add more finalize() to avoid terminate 2023-04-24 15:11:36 +00:00
Alexey Milovidov
67de39c2d9 Merge pull request #48727 from ClickHouse/parallel-processing-from-storages
Parallelize query processing right after reading FROM ...
2023-04-23 23:10:32 +03:00
kssenii
d8023806a9 Merge remote-tracking branch 'upstream/master' into better-local-object-storage 2023-04-23 12:39:34 +02:00
Kseniia Sumarokova
bd748045ad Fix typo 2023-04-21 18:54:23 +02:00
Igor Nikonov
d5eb65b5ea Remove redundant narrowPipe() 2023-04-18 22:41:28 +00:00
kssenii
16b027ed0c Merge remote-tracking branch 'upstream/master' into better-local-object-storage 2023-04-18 16:25:08 +02:00
kssenii
b77e9c1ef0 Merge remote-tracking branch 'upstream/master' into better-local-object-storage 2023-04-17 16:44:10 +02:00
kssenii
d2c73a5522 Better 2023-04-17 16:41:21 +02:00
Michael Kolupaev
473f212c82 Hopefully fix assertion failure in CachedOnDiskReadBufferFromFile 2023-04-17 04:58:32 +00:00
Michael Kolupaev
87be78e6de Better 2023-04-17 04:58:32 +00:00
Michael Kolupaev
2d4fe85513 Something 2023-04-17 04:58:32 +00:00
kssenii
3fb4cd0f52 Fix s3 test 2023-04-05 14:13:46 +02:00
kssenii
a3d69694f4 Fix build 2023-04-04 23:13:17 +02:00
kssenii
f44c53b97a Merge remote-tracking branch 'upstream/master' into better-tests-for-data-lakes 2023-04-04 22:41:22 +02:00
Antonio Andelic
a329d80bfa Merge pull request #47397 from ClickHouse/enable-env-credentials-default
Enable `use_environment_credentials` by default
2023-04-04 10:00:03 +02:00
kssenii
9b3d0ec86d Adjustments after conflicts 2023-04-03 19:53:34 +02:00
kssenii
8915f49b7d Merge remote-tracking branch 'upstream/master' into better-tests-for-data-lakes 2023-04-03 17:43:42 +02:00
kssenii
5578cb08ad Fix s3 cluster 2023-04-03 14:40:04 +02:00
Anton Popov
f715bd95f1 fix writing to StorageS3 2023-03-31 14:08:28 +00:00
Antonio Andelic
e982f2a67a Merge branch 'master' into enable-env-credentials-default 2023-03-31 09:11:01 +00:00
Anton Popov
5ceb855e7f Merge branch 'master' into fix-race-storage-s3 2023-03-31 04:16:35 +02:00
Anton Popov
38389d878c fix one more race in StorageS3 2023-03-30 21:06:53 +00:00
kssenii
319417062f Merge remote-tracking branch 'upstream/master' into better-tests-for-data-lakes 2023-03-30 18:29:46 +02:00
Antonio Andelic
80cb121d2a Merge pull request #48092 from ClickHouse/nosign-keyword-for-s3
Add support for `NOSIGN` keyword and `no_sign_request` config for S3
2023-03-30 18:10:56 +02:00
Anton Popov
e72472e71b Merge branch 'master' into fix-race-storage-s3 2023-03-30 16:19:57 +02:00
kssenii
539414554f Fix s3 2023-03-30 15:32:38 +02:00
Antonio Andelic
9db58532f4 Clang-tidy fix 2023-03-30 08:41:14 +02:00
Anton Popov
ed29c141fb fix race in StorageS3 2023-03-29 22:13:45 +00:00
Antonio Andelic
7b1ad221b2 Address PR comments 2023-03-29 11:08:44 +00:00
Azat Khuzhin
f38a7aeabe ThreadPool metrics introspection
There are lots of thread pools and simple local-vs-global is not enough
already, it is good to know which one in particular uses threads.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-03-29 10:46:59 +02:00
kssenii
82b642c9c6 Fix style check 2023-03-28 21:57:14 +02:00
kssenii
13f29a7242 Better 2023-03-28 18:57:24 +02:00
Antonio Andelic
160aa186bb Add support for NOSIGN keyword and no_sign_request config 2023-03-28 07:05:35 +00:00
kssenii
36cc6fee51 Rewrite data lakes (part 1) 2023-03-24 22:35:12 +01:00
kssenii
cae3b335d6 Merge remote-tracking branch 'upstream/master' into named-collections-finish 2023-03-20 11:23:22 +01:00
Antonio Andelic
1b7401b58a Update src/Storages/StorageS3.cpp
Co-authored-by: Kseniia Sumarokova <54203879+kssenii@users.noreply.github.com>
2023-03-17 15:46:15 +01:00
Antonio Andelic
a170a909a4 Add expiration window for S3 credentials 2023-03-10 10:06:32 +00:00
Antonio Andelic
5bc21538e5 Enable use_environment_credentials by default 2023-03-09 10:31:55 +00:00
kssenii
8f2d75cef8 Fix tests 2023-03-05 12:56:00 +01:00
flynn
b3a9468661 fix 2023-02-17 12:42:24 +00:00
flynn
a39f6f419b refactor 2023-02-17 08:27:52 +00:00
flynn
ecc39978d7 fix conflict 2023-02-16 02:23:55 +00:00
Kruglov Pavel
4f380370a9 Fix s3Cluster schema inference in parallel distributed insert select (#46381)
* Fix s3Cluster schema inference in parallel distributed insert select
* Try fix flaky test
* Try SYSTEM SYNC REPLICA to avoid test flakiness
2023-02-15 15:30:43 +01:00
flynn
289c5c60d3 fix 2023-02-15 11:24:06 +00:00
Antonio Andelic
5ab24285fc Fix arg parsing 2023-02-14 08:33:59 +00:00
Antonio Andelic
3a6ea861d8 Extract common argument parsing logic 2023-02-13 12:27:49 +00:00
kssenii
3067c1d723 Merge remote-tracking branch 'upstream/master' into resubmit-prefetches 2023-02-11 11:36:23 +01:00
kssenii
b0b865c32e Resubmit prefetches 2023-02-08 21:26:24 +01:00
kssenii
9485873a2f Fix integration test 2023-02-07 12:45:23 +01:00
kssenii
ab0dedf0c8 Simplify code around storage s3 configuration 2023-02-06 16:23:17 +01:00
Kseniia Sumarokova
38c001ca42 Merge pull request #45957 from xiedeyantu/s3_file_not_found
Throw an error on no files satisfying S3 wildcard
2023-02-06 12:32:12 +01:00
xiedeyantu
f13eedd644 change settings name 2023-02-04 22:11:14 +08:00
Antonio Andelic
d5117f2aa6 Define S3 client with bucket and endpoint resolution (#45783)
* Update aws

* Define S3 client with bucket and endpoint resolution

* Add defines for ErrorCodes

* Use S3Client everywhere

* Remove unused errorcode

* Add DROP S3 CLIENT CACHE query

* Add a comment

* Fix style

* Update aws

* Update reference files

* Add missing include

* Fix unit test

* Remove unneeded declarations

* Correctly use RetryStrategy

* Rename S3Client to Client

* Fix retry count

* fix clang-tidy warnings
2023-02-03 14:30:52 +01:00
xiedeyantu
562642ab7f add settings s3_allow_throw_if_mismatch_files 2023-02-03 12:27:13 +08:00
xiedeyantu
e22cc0eb98 Throw an error on no files satisfying S3 wildcard 2023-02-02 19:13:34 +08:00
Vitaly Baranov
af74c008c4 Use one request to implement S3ObjectStorage::getObjectMetadata instead of two ones. 2023-01-27 18:42:37 +01:00
Vitaly Baranov
aea9ccdb60 Pass request settings to S3::getObjectInfo(). 2023-01-27 15:10:09 +01:00
Anton Popov
b58b73b0e7 Merge pull request #45529 from CurtizJ/fix-storage-s3-race
Try to fix test `test_storage_s3/test.py::test_wrong_s3_syntax` (race in `StorageS3`)
2023-01-26 14:21:32 +01:00
Anton Popov
8a1ea62aec fix race on cancelation of query in StorageS3 2023-01-24 01:12:01 +00:00
Alexander Tokmakov
70d1adfe4b Better formatting for exception messages (#45449)
* save format string for NetException

* format exceptions

* format exceptions 2

* format exceptions 3

* format exceptions 4

* format exceptions 5

* format exceptions 6

* fix

* format exceptions 7

* format exceptions 8

* Update MergeTreeIndexGin.cpp

* Update AggregateFunctionMap.cpp

* Update AggregateFunctionMap.cpp

* fix
2023-01-24 00:13:58 +03:00
Anton Popov
41a199e175 Fix crash when ListObjects request fails (#45371) 2023-01-20 20:10:23 +01:00
Anton Popov
f40fd7a151 Add checks for compilation of regexps (#45356) 2023-01-17 23:46:04 +01:00
Vitaly Baranov
a955504043 Move throw_on_error parameter to the end. 2023-01-15 20:28:16 +01:00
Maksim Kita
4571c74fdd Fixed build 2023-01-10 16:49:55 +01:00
Anton Popov
79e89cf69c Merge pull request #44939 from ClickHouse/revert-44493-s3_optimize
Revert "If user only need virtual columns, we don't need to initialize ReadBufferFromS3"
2023-01-10 10:42:18 +01:00
Anton Popov
4447afb14d Revert "If user only need virtual columns, we don't need to initialize ReadBufferFromS3" 2023-01-05 16:38:20 +01:00
kssenii
67509aa2d5 Merge remote-tracking branch 'upstream/master' into use-new-named-collections-code-2 2023-01-03 16:41:30 +01:00
Antonio Andelic
e0b8fd528d Merge pull request #44842 from ClickHouse/fix-data-race-storage-s3
Fix data race in StorageS3
2023-01-03 09:20:08 +01:00
Kruglov Pavel
966f57ef68 Merge pull request #42777 from Avogar/improve-streaming-engines
Refactor and Improve streaming engines Kafka/RabbitMQ/NATS and data formats
2023-01-02 15:59:06 +01:00
Antonio Andelic
e07d820156 Fix race on total_size 2023-01-02 14:28:27 +00:00
Kruglov Pavel
0844fe7089 Merge pull request #44493 from xiedeyantu/s3_optimize
If user only need virtual columns, we don't need to initialize ReadBufferFromS3
2022-12-30 15:44:20 +01:00
xiedeyantu
d6a92fbd63 better 2022-12-28 10:11:33 +08:00
xiedeyantu
2dd809e403 fix 2022-12-23 19:45:26 +08:00
xiedeyantu
68aeb39892 fix 2022-12-23 19:33:08 +08:00
xiedeyantu
d0eb22a1cd fix 2022-12-23 19:25:14 +08:00
xiedeyantu
b5fd23358f fixed 2022-12-23 19:11:51 +08:00
chen
5c8fb627b3 Update StorageS3.cpp 2022-12-22 08:28:09 +08:00
chen
9fc1020855 Update StorageS3.cpp 2022-12-22 08:15:50 +08:00
xiedeyantu
6a9ec7efb1 If user only need virtual columns, we don't need to initialize ReadBufferFromS3 2022-12-21 23:43:56 +08:00
Kruglov Pavel
5e01a3d74e Merge branch 'master' into improve-streaming-engines 2022-12-21 10:51:50 +01:00
kssenii
1d75f740d7 Fix tests 2022-12-20 22:33:54 +01:00
kssenii
6bd4f8c029 Merge remote-tracking branch 'upstream/master' into use-new-named-collections-code-2 2022-12-20 21:17:28 +01:00
Raúl Marín
45d27f461b Merge branch 'master' into perf_experiment 2022-12-20 09:07:48 +00:00
Nikolai Kochetov
f7c308077d Fixing build. 2022-12-17 17:12:04 +00:00
Nikolai Kochetov
b3278211bb Fixing tests. 2022-12-17 16:06:42 +00:00
Nikolai Kochetov
b2355a2212 Fixing tests. 2022-12-17 16:02:34 +00:00
Nikolai Kochetov
98ebef7914 Fixing special build. 2022-12-17 15:27:01 +00:00
Nikolai Kochetov
62ff98344e Validate s3 part upload settings. 2022-12-17 14:09:53 +00:00
kssenii
2ce5af421e Replace old named collections code for mongo 2022-12-17 00:50:25 +01:00
kssenii
30547d2dcd Replace old named collections code for url 2022-12-17 00:24:05 +01:00
Anton Popov
8b9b8b083c Merge pull request #43726 from CurtizJ/optimize-storage-s3
Improve performance of storage `S3` with large number of small files
2022-12-16 14:38:10 +01:00
Kruglov Pavel
c5b2e4cc23 Merge branch 'master' into improve-streaming-engines 2022-12-15 18:44:35 +01:00
kssenii
d1b6ccd437 Fix test 2022-12-15 09:53:44 +01:00
Anton Popov
6bfe11e9b8 fix clang-tidy 2022-12-14 13:04:24 +00:00
kssenii
9aa6d31bce Merge remote-tracking branch 'upstream/master' into use-new-named-collections-code 2022-12-13 22:25:10 +01:00
kssenii
fae817863c Apply new code of named collections to s3 2022-12-13 22:19:09 +01:00
Anton Popov
6cd606ffeb better saving of object info in iterator 2022-12-13 17:18:17 +00:00
Anton Popov
0c87031e80 Merge remote-tracking branch 'upstream/master' into HEAD 2022-12-13 16:33:21 +00:00
Nikolay Degterinsky
9b6d31b95d Merge branch 'master' into perf_experiment 2022-12-13 17:15:07 +01:00
kssenii
611259bd54 Fix 2022-12-05 18:32:56 +01:00
kssenii
c7429d19e7 Merge remote-tracking branch 'upstream/master' into fix-progress-from-s3 2022-12-05 18:32:47 +01:00
Anton Popov
c2e92fd274 better code in StorageS3 2022-12-05 14:43:41 +00:00
Anton Popov
fe5fff0347 Merge pull request #43329 from xiedeyantu/support_nested_column
s3 table function can support select nested column using {column_name}.{subcolumn_name}
2022-11-29 22:27:19 +01:00
Anton Popov
2a1fd48e91 fix tests 2022-11-29 17:33:35 +00:00
Anton Popov
486da48ae7 fix tests 2022-11-28 21:15:41 +00:00
Anton Popov
e0dd533811 fix scheduling of async tasks in StorageS3 2022-11-28 16:13:01 +00:00
Anton Popov
5e8e1788ae fix files pruning in StorageS3 2022-11-28 13:56:25 +00:00
Anton Popov
65a78bcd91 improve performance of storage S3 2022-11-26 15:24:01 +00:00
xiedeyantu
304b6ebf3a s3 table function can support select nested column using {column_name}.{subcolumn_name} 2022-11-23 23:36:12 +08:00
kssenii
5e01441f61 Show progress bar while reading from s3 table function 2022-11-21 17:56:02 +01:00
Raúl Marín
ed0c174c0c Merge remote-tracking branch 'blessed/master' into perf_experiment 2022-11-21 11:02:31 +01:00
Sergei Trifonov
f2f0676bcc Revert "Revert "S3 request per second rate throttling"" 2022-11-17 17:35:04 +01:00
Raúl Marín
97d6fc3071 Merge remote-tracking branch 'blessed/master' into perf_experiment 2022-11-17 11:48:46 +01:00
Alexander Tokmakov
9011a18234 Revert "S3 request per second rate throttling" 2022-11-16 22:33:48 +03:00
Kseniia Sumarokova
59cf5def67 Merge branch 'master' into disk-s3-throttler 2022-11-15 12:13:37 +01:00
Alexey Milovidov
127631ee47 Merge branch 'master' into perf_experiment 2022-11-12 18:58:25 +01:00
zzsmdfj
3835373644 to add_oss_function_and_StorageOSS 2022-11-11 16:40:10 +08:00
Sergei Trifonov
8eedd1e046 Merge branch 'master' into disk-s3-throttler 2022-11-08 15:00:56 +01:00
serxa
6d5d9ff421 rename ReadWriteSettings -> RequestSettings 2022-11-08 13:48:23 +00:00
serxa
2daec0b45e S3 request per second rate throttling + refactoring 2022-11-07 18:05:40 +00:00
Kruglov Pavel
b124875257 Merge branch 'master' into improve-streaming-engines 2022-11-03 13:22:06 +01:00
Nikolay Degterinsky
30ad1a6826 Merge branch 'master' into perf_experiment 2022-11-03 02:18:21 +03:00
Vitaly Baranov
e0133688bc Merge branch 'master' into mask-sensitive-info-in-logs 2022-11-02 16:26:13 +01:00
Kruglov Pavel
21d50f76ea Merge pull request #41979 from Avogar/s3-cluster-schema-inference
Fix schema inference in s3Cluster and improve in hdfsCluster
2022-11-01 14:00:21 +01:00
Vitaly Baranov
5d2a222fe4 Mask sensitive information in logs. 2022-10-31 10:50:33 +01:00
avogar
8e13d1f1ec Improve and refactor Kafka/StorageMQ/NATS and data formats 2022-10-28 16:41:10 +00:00
Raúl Marín
891484b462 Merge remote-tracking branch 'blessed/master' into perf_experiment 2022-10-27 13:17:07 +02:00
Raúl Marín
6e0a9452e7 Merge remote-tracking branch 'blessed/master' into perf_experiment 2022-10-25 15:25:06 +02:00
SmitaRKulkarni
96c8260230 Merge branch 'master' into 36316_Support_glob_for_recursive_directory_traversal 2022-10-24 18:34:19 +02:00
Azat Khuzhin
4e76629aaf Fixes for -Wshorten-64-to-32
- lots of static_cast
- add safe_cast
- types adjustments
  - config
  - IStorage::read/watch
  - ...
- some TODO's (to convert types in future)

P.S. That was quite a journey...

v2: fixes after rebase
v3: fix conflicts after #42308 merged
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-21 13:25:19 +02:00
Raúl Marín
e60415d07d Make clang-tidy happy 2022-10-18 11:40:12 +02:00
Smita Kulkarni
91433e5b9c Added ** glob support for recursive directory traversal to filesystem and S3.
Implementation:
* Updated parseGlob to not add '/' restriction when ** is used.
* Updated S3 & filesystem to fetch files and not use regex match if glob is **.
Testing:
* Added a test for filesystem tests/queries/0_stateless/02459_glob_for_recursive_directory_traversal.sh
2022-10-17 09:04:25 +02:00
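The difference between `*` and `**` described in the commit above can be sketched as follows. This is a minimal Python illustration, not ClickHouse's `parseGlob`: the function names `glob_to_regex` and `matches` and the sample keys are hypothetical, and the sketch assumes that `*` must stay within one path component while `**` may cross `/`.

```python
import re


def glob_to_regex(pattern: str) -> str:
    """Translate a simplified glob into a regex string (illustrative only)."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            out.append(".*")      # '**' may cross '/' boundaries
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")   # '*' stays within a single path component
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")    # '?' matches one non-separator character
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return "".join(out)


def matches(pattern: str, key: str) -> bool:
    """Return True when the whole key matches the glob."""
    return re.fullmatch(glob_to_regex(pattern), key) is not None


# Hypothetical object keys, for illustration only.
keys = ["data/a.csv", "data/2022/b.csv", "data/2022/10/c.csv"]
single = [k for k in keys if matches("data/*", k)]      # top level only
recursive = [k for k in keys if matches("data/**", k)]  # every nested key
```

The commit notes that the real change skips the regex match entirely when the glob is `**`; the sketch keeps the regex only to make the two semantics easy to compare.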
Kruglov Pavel
3d9f46a1e7 Merge branch 'master' into s3-cluster-schema-inference 2022-10-14 22:07:54 +02:00
avogar
c74b5c8126 Fix schema inference in s3Cluster and improve in hdfsCluster 2022-09-30 16:59:17 +00:00
Robert Schulze
78fc36ca49 Generate config.h into ${CONFIG_INCLUDE_PATH}
This makes the target location consistent with other auto-generated
files like config_formats.h, config_core.h, and config_functions.h and
simplifies the build of clickhouse_common.
2022-09-28 12:48:26 +00:00
kssenii
0a801dad2a Merge remote-tracking branch 'upstream/master' into fix-thread-status 2022-09-23 19:39:07 +02:00
kssenii
ab702e43fc Merge remote-tracking branch 'upstream/master' into fix-thread-status 2022-09-23 15:21:33 +02:00
kssenii
e34101456a Fix 2022-09-21 17:11:37 +02:00
kssenii
f917b268b3 Merge remote-tracking branch 'upstream/master' into s3-header-auth 2022-09-21 12:56:25 +02:00
serxa
2ef696ffe1 fix issues 2022-09-19 18:40:32 +00:00
serxa
f8aa738511 more conventional profile events names 2022-09-19 17:23:22 +00:00
mateng0915
f7f976e94e remove extra space 2022-09-14 19:19:47 +08:00
mateng0915
5badb1b186 resolve the review comments 2022-09-14 19:19:40 +08:00
kssenii
52ef3758c4 Merge remote-tracking branch 'upstream/master' into fix-thread-status 2022-09-13 16:34:31 +02:00
kssenii
420ac4eb43 s3 header auth in ast 2022-09-13 15:13:28 +02:00
kssenii
e51313b6b3 Get rid of static threadpools 2022-09-07 17:48:11 +02:00
Sergei Trifonov
bcb6475c4a add separate s3 profile events for disk s3 2022-09-01 18:30:55 +02:00
avogar
5ab87f1da4 Small refactoring 2022-08-19 16:42:23 +00:00
avogar
8dd54c043d Merge branch 'master' of github.com:ClickHouse/ClickHouse into schema-inference-cache 2022-08-17 11:47:40 +00:00
avogar
c4ff3ffeea Rename settings 2022-08-15 12:45:18 +00:00
avogar
9b1a267203 Refactor, remove TTL, add size limit, add system table and system query 2022-08-05 16:20:15 +00:00
Kruglov Pavel
9252f42b4c Merge branch 'master' into schema-inference-cache 2022-07-21 18:59:14 +02:00
avogar
6b541aa98f Fix WriteBuffer finalize when cancel insert into function 2022-07-21 12:18:37 +00:00
Kruglov Pavel
92995a832b Revert "Fix WriteBuffer finalize in destructor when cacnel query" 2022-07-21 01:45:16 +02:00
Kruglov Pavel
3046cd6d29 Merge branch 'master' into schema-inference-cache 2022-07-20 13:30:42 +02:00
avogar
5c16d6b553 Fix WriteBuffer finalize in destructor when cacnel query 2022-07-19 19:21:30 +00:00
Kruglov Pavel
b38241b08a Merge branch 'master' into schema-inference-cache 2022-07-14 12:29:54 +02:00
Sergei Trifonov
15ab3bc99f use context->getWriteSettings() 2022-07-13 19:48:57 +02:00
Sergei Trifonov
43779ec280 add max_remote_{read,write}_network_bandwidth_for_server settings 2022-07-11 14:59:39 +02:00
avogar
ee54c4f9b7 Add some fixes and add settings in docs 2022-06-30 12:41:56 +00:00
avogar
5155262a16 Add some additional information to cache keys 2022-06-27 12:43:24 +00:00
Kruglov Pavel
86e8f31ad4 Merge branch 'master' into schema-inference-cache 2022-06-24 16:10:25 +02:00
avogar
59c1c472cb Better exception messages on wrong table engines/functions argument types 2022-06-23 20:04:06 +00:00
Kruglov Pavel
e5a7f53775 Fix misleading error message while s3 schema inference 2022-06-22 12:36:09 +02:00
avogar
c14364e3d9 Check last modification time for URL function too 2022-06-21 17:18:14 +00:00
avogar
d37ad2e6de Implement cache for schema inference for file/s3/hdfs/url 2022-06-21 13:02:48 +00:00
Alexey Milovidov
73709b0488 Revert "Revert "Add a setting to use more memory for zstd decompression"" 2022-06-18 15:55:35 +03:00
alesapin
16e8b85fbf Revert "Add a setting to use more memory for zstd decompression" 2022-06-18 14:08:14 +02:00
Alexey Milovidov
e20259e9ca Merge pull request #37015 from wuxiaobai24/zstd_window_log_max
Add a setting to use more memory for zstd decompression
2022-06-18 04:19:27 +03:00
Nikolai Kochetov
8991f39412 Merge branch 'master' into refactor-read-metrics-and-callbacks 2022-06-02 17:00:08 +00:00
Azat Khuzhin
545a56ce45 Fix sinks with onException() handler
It is possible to call onException() even after onFinish(), in case of
onFinish() throws, and in this case onException() should be no-op for
such sinks.

Also there can be caveats with PartitionedSync.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-06-01 21:50:30 +03:00
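The rule stated in the commit above, that `onException()` should be a no-op once `onFinish()` has been attempted, can be sketched with a small guard flag. This is a Python illustration under assumptions: the class `BufferedSink`, its methods, and the `events` list are hypothetical stand-ins and do not mirror ClickHouse's sink classes.

```python
class BufferedSink:
    """Hypothetical sink that records what it did to its output buffer."""

    def __init__(self, fail_on_finish=False):
        self.fail_on_finish = fail_on_finish
        self.finish_called = False
        self.events = []

    def on_finish(self):
        # Set the flag first so that a throw during finalization is still recorded.
        self.finish_called = True
        if self.fail_on_finish:
            raise RuntimeError("finalize failed")
        self.events.append("finalized")

    def on_exception(self):
        # After on_finish() was attempted, the buffer must not be touched again.
        if self.finish_called:
            return
        self.events.append("cancelled")


# Case 1: on_finish() throws, then the pipeline reports the exception.
failed = BufferedSink(fail_on_finish=True)
try:
    failed.on_finish()
except RuntimeError:
    failed.on_exception()

# Case 2: the query fails before on_finish() is ever reached.
aborted = BufferedSink()
aborted.on_exception()
```

In the first case the sink leaves the buffer alone after the failed finalization; in the second it performs its normal cleanup.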
Azat Khuzhin
02af58f41d Fix possible "Cannot write to finalized buffer"
It is still possible to get this error since onException does not
finalize format correctly.

Here is an example of such error, that was found by CI [1]:

<details>

    [ 2686 ] {fa01bf02-73f6-4f7f-b14f-e725de6d7f9b} <Fatal> : Logical error: 'Cannot write to finalized buffer'.
    [ 34577 ] {} <Fatal> BaseDaemon: ########################################
    [ 34577 ] {} <Fatal> BaseDaemon: (version 22.6.1.1, build id: AB8040A6769E01A0) (from thread 2686) (query_id: fa01bf02-73f6-4f7f-b14f-e725de6d7f9b) (query: insert into test_02302 select number from numbers(10) settings s3_truncate_on_insert=1;) Received signal Aborted (6)
    [ 34577 ] {} <Fatal> BaseDaemon:
    [ 34577 ] {} <Fatal> BaseDaemon: Stack trace: 0x7fcbaa5a703b 0x7fcbaa586859 0xfad9bab 0xfad9e05 0xfaf6a3b 0x24a48c7f 0x258fb9b9 0x258f2004 0x258b88f4 0x258b863b 0x2581773d 0x258177ce 0x24bb5e98 0xfad01d6 0xfad0105 0x2419b11d 0xfad01d6 0xfad0105 0x2215afbb 0x2215aa48 0xfad01d6 0xfad0105 0xfcc265d 0x225cc546 0x249a1c40 0x249bc1b6 0x2685902c 0x26859505 0x269d7767 0x269d504c 0x7fcbaa75e609 0x7fcbaa683163
    [ 34577 ] {} <Fatal> BaseDaemon: 3. raise @ 0x7fcbaa5a703b in ?
    [ 34577 ] {} <Fatal> BaseDaemon: 4. abort @ 0x7fcbaa586859 in ?
    [ 34577 ] {} <Fatal> BaseDaemon: 5. ./build_docker/../src/Common/Exception.cpp:47: DB::abortOnFailedAssertion(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) @ 0xfad9bab in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 6. ./build_docker/../src/Common/Exception.cpp:70: DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0xfad9e05 in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 7. ./build_docker/../src/IO/WriteBuffer.h:0: DB::WriteBuffer::write(char const*, unsigned long) @ 0xfaf6a3b in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 8. ./build_docker/../src/Processors/Formats/Impl/ArrowBufferedStreams.cpp:47: DB::ArrowBufferedOutputStream::Write(void const*, long) @ 0x24a48c7f in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 9. long parquet::ThriftSerializer::Serialize<parquet::format::FileMetaData>(parquet::format::FileMetaData const*, arrow::io::OutputStream*, std::__1::shared_ptr<parquet::Encryptor> const&) @ 0x258fb9b9 in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 10. parquet::FileMetaData::FileMetaDataImpl::WriteTo(arrow::io::OutputStream*, std::__1::shared_ptr<parquet::Encryptor> const&) const @ 0x258f2004 in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 11. parquet::WriteFileMetaData(parquet::FileMetaData const&, arrow::io::OutputStream*) @ 0x258b88f4 in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 12. parquet::ParquetFileWriter::~ParquetFileWriter() @ 0x258b863b in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 13. parquet::arrow::FileWriterImpl::~FileWriterImpl() @ 0x2581773d in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 14. parquet::arrow::FileWriterImpl::~FileWriterImpl() @ 0x258177ce in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 15. ./build_docker/../src/Processors/Formats/Impl/ParquetBlockOutputFormat.h:27: DB::ParquetBlockOutputFormat::~ParquetBlockOutputFormat() @ 0x24bb5e98 in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 16. ./build_docker/../contrib/libcxx/include/__memory/shared_ptr.h:173: std::__1::__shared_count::__release_shared() @ 0xfad01d6 in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 17. ./build_docker/../contrib/libcxx/include/__memory/shared_ptr.h:216: std::__1::__shared_weak_count::__release_shared() @ 0xfad0105 in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 18.1. inlined from ./build_docker/../contrib/libcxx/include/__memory/unique_ptr.h:312: std::__1::unique_ptr<DB::WriteBuffer, std::__1::default_delete<DB::WriteBuffer> >::reset(DB::WriteBuffer*)
    [ 34577 ] {} <Fatal> BaseDaemon: 18.2. inlined from ../contrib/libcxx/include/__memory/unique_ptr.h:269: ~unique_ptr
    [ 34577 ] {} <Fatal> BaseDaemon: 18. ../src/Storages/StorageS3.cpp:566: DB::StorageS3Sink::~StorageS3Sink() @ 0x2419b11d in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 19. ./build_docker/../contrib/libcxx/include/__memory/shared_ptr.h:173: std::__1::__shared_count::__release_shared() @ 0xfad01d6 in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 20. ./build_docker/../contrib/libcxx/include/__memory/shared_ptr.h:216: std::__1::__shared_weak_count::__release_shared() @ 0xfad0105 in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 21. ./build_docker/../contrib/abseil-cpp/absl/container/internal/raw_hash_set.h:1662: absl::lts_20211102::container_internal::raw_hash_set<absl::lts_20211102::container_internal::FlatHashMapPolicy<StringRef, std::__1::shared_ptr<DB::SinkToStorage> >, absl::lts_20211102::hash_internal::Hash<StringRef>, std::__1::equal_to<StringRef>, std::__1::allocator<std::__1::pair<StringRef const, std::__1::shared_ptr<DB::SinkToStorage> > > >::destroy_slots() @ 0x2215afbb in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 22.1. inlined from ./build_docker/../contrib/libcxx/include/string:1445: std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::__is_long() const
    [ 34577 ] {} <Fatal> BaseDaemon: 22.2. inlined from ../contrib/libcxx/include/string:2231: ~basic_string
    [ 34577 ] {} <Fatal> BaseDaemon: 22. ../src/Storages/PartitionedSink.h:14: DB::PartitionedSink::~PartitionedSink() @ 0x2215aa48 in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 23. ./build_docker/../contrib/libcxx/include/__memory/shared_ptr.h:173: std::__1::__shared_count::__release_shared() @ 0xfad01d6 in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 24. ./build_docker/../contrib/libcxx/include/__memory/shared_ptr.h:216: std::__1::__shared_weak_count::__release_shared() @ 0xfad0105 in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 25. ./build_docker/../contrib/libcxx/include/vector:802: std::__1::vector<std::__1::shared_ptr<DB::IProcessor>, std::__1::allocator<std::__1::shared_ptr<DB::IProcessor> > >::__base_destruct_at_end(std::__1::shared_ptr<DB::IProcessor>*) @ 0xfcc265d in /usr/bin/clickhouse
    [ 34577 ] {} <Fatal> BaseDaemon: 26.1. inlined from ./build_docker/../contrib/libcxx/include/vector:402: ~vector
    [ 34577 ] {} <Fatal> BaseDaemon: 26.2. inlined from ../src/QueryPipeline/QueryPipeline.cpp:29: ~QueryPipeline
    [ 34577 ] {} <Fatal> BaseDaemon: 26. ../src/QueryPipeline/QueryPipeline.cpp:535: DB::QueryPipeline::reset() @ 0x225cc546 in /usr/bin/clickhouse
    [ 614 ] {} <Fatal> Application: Child process was terminated by signal 6.

</details>

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/37542/8a224239c1d922158b4dc9f5d6609dca836dfd06/stress_test__undefined__actions_.html

Follow-up for: #36979

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-06-01 21:50:30 +03:00
Nikolai Kochetov
86fbb74703 Merge branch 'master' into refactor-read-metrics-and-callbacks 2022-05-31 18:07:47 +00:00
Kruglov Pavel
0615866aea Merge pull request #37450 from Avogar/check-format-on-storage-creation
Check format name on storage creation
2022-05-30 14:23:20 +02:00
Nikolai Kochetov
5b4658aa5e Merge branch 'master' into refactor-read-metrics-and-callbacks 2022-05-30 09:47:35 +00:00
alesapin
c7b16065e1 Merge with master 2022-05-25 21:47:05 +02:00
alesapin
6f5c86e55e Merge branch 'master' into i_object_storage 2022-05-25 20:49:01 +02:00
Nikolai Kochetov
1b85f2c1d6 Merge branch 'master' into refactor-read-metrics-and-callbacks 2022-05-25 16:27:40 +02:00
Kseniia Sumarokova
b50d4549c9 Merge pull request #37356 from amosbird/partition-prune-for-s3
"Partition pruning" for s3
2022-05-25 11:03:07 +02:00
avogar
f782fa31c6 Merge branch 'master' of github.com:ClickHouse/ClickHouse into check-format-on-storage-creation 2022-05-25 08:42:54 +00:00
Nikolai Kochetov
3d84aae0ab Better. 2022-05-24 20:06:08 +00:00
Amos Bird
093d315756 partition pruning for s3 2022-05-24 18:57:55 +08:00
mergify[bot]
51ff49a0ee Merge branch 'master' into i_object_storage 2022-05-23 20:29:49 +00:00
avogar
37b66c8a9e Check format name on storage creation 2022-05-23 12:48:48 +00:00
Kruglov Pavel
f539fb835d Merge branch 'master' into formats-with-names 2022-05-23 12:14:20 +02:00
Nikolai Kochetov
56feef01e7 Move some resources 2022-05-20 19:49:31 +00:00
avogar
2d4b4b9008 Fix inserting defaults for missing values in columnar formats 2022-05-16 14:19:44 +00:00