Kruglov Pavel
e4838725e3
Merge branch 'master' into allow-skip-empty-files
2023-06-12 20:03:23 +02:00
Kruglov Pavel
873cee9451
Merge pull request #49626 from alekseygolub/renamefile
Added option to rename files, loaded via TableFunctionFile, after successful processing
2023-06-12 15:01:22 +02:00
zvonand
3e6d393e17
remove debug cerr
2023-06-12 12:06:21 +02:00
zvonand
eb9cdbcf7d
fix File test being flaky
2023-06-12 11:41:36 +02:00
zvonand
2c97a94892
fix hdfs + style update
2023-06-11 01:50:17 +02:00
Kruglov Pavel
bf28074d32
Merge branch 'master' into allow-skip-empty-files
2023-06-08 12:36:18 +02:00
Antonio Andelic
b11f744252
Correctly disable async insert with deduplication when it's not needed (#50663)
* Correctly disable async insert when it's not used
* Better
* Add comment
* Better
* Fix tests
---------
Co-authored-by: Nikita Mikhaylov <mikhaylovnikitka@gmail.com>
2023-06-07 20:33:08 +02:00
Kruglov Pavel
b83b057045
Merge branch 'master' into renamefile
2023-06-06 19:38:05 +02:00
zvonand
1a361ef306
works for file
2023-06-05 03:21:43 +02:00
Kruglov Pavel
0beca0336d
Merge pull request #49112 from ClickHouse/Avogar-patch-3
Fix possible terminate called for uncaught exception in some places
2023-05-31 16:55:43 +02:00
avogar
d4efbbfbd3
Allow to skip empty files in file/s3/url/hdfs table functions
2023-05-30 19:32:24 +00:00
Kruglov Pavel
f863dee8e7
Merge branch 'master' into renamefile
2023-05-30 12:26:40 +02:00
nikitakeba
f604fb82b2
Merge branch 'master' into add-reading-from-archives-support
2023-05-29 23:34:19 +03:00
Nikita Keba
c18bff58b3
fix style
2023-05-29 20:08:18 +00:00
Nikita Keba
564691e25b
add reading from archives
2023-05-25 00:00:32 +00:00
alekseygolub
2b68a6a22a
Fix style
2023-05-19 16:03:22 +00:00
alekseygolub
c85c3afa1f
Added option to rename files, loaded via TableFunctionFile, after successful processing
2023-05-19 16:03:22 +00:00
SmitaRKulkarni
a91c793684
Merge branch 'master' into 42192_Lower_parallel_parsing_threads_with_globs
2023-05-18 09:24:25 +02:00
Smita Kulkarni
fd58eac75a
Fixed max_threads datatype issue for builds
2023-05-13 10:22:37 +02:00
Smita Kulkarni
792565d858
Updated to ULL
2023-05-12 17:23:37 +02:00
Smita Kulkarni
ef1100bb90
Added include to fix build issue
2023-05-11 14:44:36 +02:00
SmitaRKulkarni
5c030c428c
Merge branch 'master' into 42192_Lower_parallel_parsing_threads_with_globs
2023-05-10 09:51:38 +02:00
Alexey Milovidov
a2c4b8e23d
Disable mmap for server
2023-05-10 03:16:52 +02:00
Michael Kolupaev
3bd1489f18
Propagate input_format_parquet_preserve_order to parallelizeOutputAfterReading()
2023-05-05 04:20:27 +00:00
Michael Kolupaev
eb3b774ad0
Better control over Parquet row group size
2023-05-04 14:59:55 -07:00
Smita Kulkarni
8205398f31
Fixed comment
2023-05-02 16:31:39 +02:00
Smita Kulkarni
a5d47ea489
Fixed build issues
2023-04-30 19:01:06 +02:00
Smita Kulkarni
307aa127d4
Updated to calculate and send max_parsing_threads
2023-04-25 13:27:20 +02:00
Smita Kulkarni
b70878aa0e
Merge branch 'master' into 42192_Lower_parallel_parsing_threads_with_globs
2023-04-24 19:39:36 +02:00
avogar
c503f6532c
Add more finalize() to avoid terminate
2023-04-24 15:11:36 +00:00
Igor Nikonov
8603807b57
Use generic way to parallelize output for file()
+ disable parallelization for storage Null
2023-04-15 12:35:24 +00:00
Igor Nikonov
1187534545
Simpler way to resize pipeline
2023-04-09 21:26:39 +00:00
Igor Nikonov
78038a3c2c
Fix: do not resize pipeline when there are no files to process (glob expands to empty set)
2023-04-07 11:34:04 +00:00
Igor Nikonov
96213fa464
Fix header
2023-04-06 22:17:09 +00:00
Igor Nikonov
2e139c21d2
Parallel reading in FROM file()
2023-04-06 21:57:03 +00:00
SmitaRKulkarni
d9c67a3380
Merge branch 'master' into 42192_Lower_parallel_parsing_threads_with_globs
2023-03-30 09:35:03 +02:00
Azat Khuzhin
33b13549ad
Separate out DirectoryMonitorSource as DistributedAsyncInsertSource
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-02-28 22:33:36 +01:00
Smita Kulkarni
3e9ab1276a
Updated to reduce parallel parsing based on number of files - When reading from multiple files reduce parallel parsing
2023-02-22 14:56:44 +01:00
Smita Kulkarni
3a6dea5e16
When reading from multiple files disable parallel parsing
Implementation:
* Added a new parameter to getInput & getInputFormat to disable parallel parsing.
* Currently this is used only by StorageFile as we have not seen degradation for other storages reading from multiple paths.
2023-02-21 17:03:00 +01:00
Sergei Trifonov
0d1ea05ff6
Merge pull request #45007 from ClickHouse/cancellable-mutex-integration
Fast shared mutex integration
2023-01-25 11:15:46 +01:00
Alexander Tokmakov
70d1adfe4b
Better formatting for exception messages (#45449)
* save format string for NetException
* format exceptions
* format exceptions 2
* format exceptions 3
* format exceptions 4
* format exceptions 5
* format exceptions 6
* fix
* format exceptions 7
* format exceptions 8
* Update MergeTreeIndexGin.cpp
* Update AggregateFunctionMap.cpp
* Update AggregateFunctionMap.cpp
* fix
2023-01-24 00:13:58 +03:00
Sergei Trifonov
0fbfa17863
Merge branch 'master' into cancellable-mutex-integration
2023-01-23 12:44:09 +01:00
Anton Popov
f40fd7a151
Add checks for compilation of regexps (#45356)
2023-01-17 23:46:04 +01:00
serxa
693489a8ad
review fixes
2023-01-12 15:51:04 +00:00
Kruglov Pavel
29240ef380
Merge pull request #43927 from pufit/mmap-for-storage-file
Added mmap for StorageFile
2023-01-11 21:25:02 +01:00
Kruglov Pavel
ce6962614d
Merge branch 'master' into mmap-for-storage-file
2023-01-10 17:34:01 +01:00
Maksim Kita
fbba28b31e
Analyzer aggregation without column fix
2023-01-10 16:49:55 +01:00
pufit
2d942af7b4
Fix codestyle, fix test.
2022-12-16 11:55:50 -05:00
pufit
b7df684762
Enum settings, fix else branch.
2022-12-15 18:08:19 -05:00
Kruglov Pavel
c5b2e4cc23
Merge branch 'master' into improve-streaming-engines
2022-12-15 18:44:35 +01:00
pufit
5c52f26823
ya fix.
2022-12-12 00:39:08 -05:00
pufit
6979dc9f2f
dummy fix, additional test
2022-12-11 17:36:30 -05:00
pufit
1d6e77a29a
Move reader selection logic back to `StorageFile`.
2022-12-11 16:15:41 -05:00
pufit
e38a93c45a
Fix UB, fix test.
2022-12-10 22:26:07 -05:00
pufit
2d87cc1a6c
Add `storage_file_read_method` setting.
2022-12-08 18:02:29 -05:00
pufit
76401ad0b9
Test and codestyle fix.
2022-12-07 23:17:10 -05:00
pufit
9b46baa17d
Rewrite `StorageFile` buffer creation with `createReadBufferFromFileBase`.
Add file descriptor support for `createReadBufferFromFileBase`.
Fix file_size overflow in `createReadBufferFromFileBase`.
Fix `MMapReadBufferFromFileWithCache` file_size definition.
2022-12-07 22:31:32 -05:00
pufit
084e465d84
Use mmap only on regular files.
2022-12-04 23:39:23 -05:00
pufit
bc7a76a486
Added mmap for StorageFile
2022-12-04 17:27:28 -05:00
kssenii
5e01441f61
Show progress bar while reading from s3 table function
2022-11-21 17:56:02 +01:00
Kruglov Pavel
b124875257
Merge branch 'master' into improve-streaming-engines
2022-11-03 13:22:06 +01:00
avogar
8e13d1f1ec
Improve and refactor Kafka/StorageMQ/NATS and data formats
2022-10-28 16:41:10 +00:00
SmitaRKulkarni
96c8260230
Merge branch 'master' into 36316_Support_glob_for_recursive_directory_traversal
2022-10-24 18:34:19 +02:00
Azat Khuzhin
4e76629aaf
Fixes for -Wshorten-64-to-32
- lots of static_cast
- add safe_cast
- types adjustments
- config
- IStorage::read/watch
- ...
- some TODO's (to convert types in future)
P.S. That was quite a journey...
v2: fixes after rebase
v3: fix conflicts after #42308 merged
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-21 13:25:19 +02:00
Smita Kulkarni
91433e5b9c
Added ** glob support for recursive directory traversal to filesystem and S3.
Implementation:
* Updated parseGlob to not add '/' restriction when ** is used.
* Updated S3 & filesystem to fetch files and not use regex match if glob is **.
Testing:
* Added a test for filesystem tests/queries/0_stateless/02459_glob_for_recursive_directory_traversal.sh
2022-10-17 09:04:25 +02:00
Alexey Milovidov
ab4db2d0c4
Fix 5/6 of trash
2022-09-19 08:50:53 +02:00
avogar
5ab87f1da4
Small refactoring
2022-08-19 16:42:23 +00:00
avogar
8dd54c043d
Merge branch 'master' of github.com:ClickHouse/ClickHouse into schema-inference-cache
2022-08-17 11:47:40 +00:00
avogar
c4ff3ffeea
Rename settings
2022-08-15 12:45:18 +00:00
Kseniia Sumarokova
ec4a4d31ea
Fix style check
2022-08-08 11:23:57 +02:00
Kseniia Sumarokova
895639644e
Update src/Storages/StorageFile.cpp
2022-08-07 14:17:42 +02:00
flynn
4fa1762f96
Merge branch 'master' into file
2022-08-07 14:22:08 +08:00
flynn
384a7ae901
Fix read of StorageFile with virtual columns
2022-08-06 17:29:33 +00:00
avogar
9b1a267203
Refactor, remove TTL, add size limit, add system table and system query
2022-08-05 16:20:15 +00:00
Kruglov Pavel
9252f42b4c
Merge branch 'master' into schema-inference-cache
2022-07-21 18:59:14 +02:00
avogar
6b541aa98f
Fix WriteBuffer finalize when cancel insert into function
2022-07-21 12:18:37 +00:00
Kruglov Pavel
92995a832b
Revert "Fix WriteBuffer finalize in destructor when cancel query"
2022-07-21 01:45:16 +02:00
Kruglov Pavel
3046cd6d29
Merge branch 'master' into schema-inference-cache
2022-07-20 13:30:42 +02:00
avogar
5c16d6b553
Fix WriteBuffer finalize in destructor when cancel query
2022-07-19 19:21:30 +00:00
Kruglov Pavel
b38241b08a
Merge branch 'master' into schema-inference-cache
2022-07-14 12:29:54 +02:00
avogar
106f92dcdb
Fix tests
2022-06-28 16:13:42 +00:00
avogar
5155262a16
Add some additional information to cache keys
2022-06-27 12:43:24 +00:00
Kruglov Pavel
86e8f31ad4
Merge branch 'master' into schema-inference-cache
2022-06-24 16:10:25 +02:00
avogar
59c1c472cb
Better exception messages on wrong table engines/functions argument types
2022-06-23 20:04:06 +00:00
avogar
c14364e3d9
Check last modification time for URL function too
2022-06-21 17:18:14 +00:00
avogar
d37ad2e6de
Implement cache for schema inference for file/s3/hdfs/url
2022-06-21 13:02:48 +00:00
Alexey Milovidov
73709b0488
Revert "Revert "Add a setting to use more memory for zstd decompression""
2022-06-18 15:55:35 +03:00
alesapin
16e8b85fbf
Revert "Add a setting to use more memory for zstd decompression"
2022-06-18 14:08:14 +02:00
Alexey Milovidov
e20259e9ca
Merge pull request #37015 from wuxiaobai24/zstd_window_log_max
Add a setting to use more memory for zstd decompression
2022-06-18 04:19:27 +03:00
Nikolai Kochetov
8991f39412
Merge branch 'master' into refactor-read-metrics-and-callbacks
2022-06-02 17:00:08 +00:00
Azat Khuzhin
545a56ce45
Fix sinks with onException() handler
It is possible to call onException() even after onFinish(), in case
onFinish() throws, and in this case onException() should be a no-op for
such sinks.
Also there can be caveats with PartitionedSink.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-06-01 21:50:30 +03:00
Azat Khuzhin
02af58f41d
Fix possible "Cannot write to finalized buffer"
It is still possible to get this error since onException does not
finalize format correctly.
Here is an example of such error, that was found by CI [1]:
<details>
[ 2686 ] {fa01bf02-73f6-4f7f-b14f-e725de6d7f9b} <Fatal> : Logical error: 'Cannot write to finalized buffer'.
[ 34577 ] {} <Fatal> BaseDaemon: ########################################
[ 34577 ] {} <Fatal> BaseDaemon: (version 22.6.1.1, build id: AB8040A6769E01A0) (from thread 2686) (query_id: fa01bf02-73f6-4f7f-b14f-e725de6d7f9b) (query: insert into test_02302 select number from numbers(10) settings s3_truncate_on_insert=1;) Received signal Aborted (6)
[ 34577 ] {} <Fatal> BaseDaemon:
[ 34577 ] {} <Fatal> BaseDaemon: Stack trace: 0x7fcbaa5a703b 0x7fcbaa586859 0xfad9bab 0xfad9e05 0xfaf6a3b 0x24a48c7f 0x258fb9b9 0x258f2004 0x258b88f4 0x258b863b 0x2581773d 0x258177ce 0x24bb5e98 0xfad01d6 0xfad0105 0x2419b11d 0xfad01d6 0xfad0105 0x2215afbb 0x2215aa48 0xfad01d6 0xfad0105 0xfcc265d 0x225cc546 0x249a1c40 0x249bc1b6 0x2685902c 0x26859505 0x269d7767 0x269d504c 0x7fcbaa75e609 0x7fcbaa683163
[ 34577 ] {} <Fatal> BaseDaemon: 3. raise @ 0x7fcbaa5a703b in ?
[ 34577 ] {} <Fatal> BaseDaemon: 4. abort @ 0x7fcbaa586859 in ?
[ 34577 ] {} <Fatal> BaseDaemon: 5. ./build_docker/../src/Common/Exception.cpp:47: DB::abortOnFailedAssertion(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) @ 0xfad9bab in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 6. ./build_docker/../src/Common/Exception.cpp:70: DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0xfad9e05 in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 7. ./build_docker/../src/IO/WriteBuffer.h:0: DB::WriteBuffer::write(char const*, unsigned long) @ 0xfaf6a3b in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 8. ./build_docker/../src/Processors/Formats/Impl/ArrowBufferedStreams.cpp:47: DB::ArrowBufferedOutputStream::Write(void const*, long) @ 0x24a48c7f in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 9. long parquet::ThriftSerializer::Serialize<parquet::format::FileMetaData>(parquet::format::FileMetaData const*, arrow::io::OutputStream*, std::__1::shared_ptr<parquet::Encryptor> const&) @ 0x258fb9b9 in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 10. parquet::FileMetaData::FileMetaDataImpl::WriteTo(arrow::io::OutputStream*, std::__1::shared_ptr<parquet::Encryptor> const&) const @ 0x258f2004 in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 11. parquet::WriteFileMetaData(parquet::FileMetaData const&, arrow::io::OutputStream*) @ 0x258b88f4 in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 12. parquet::ParquetFileWriter::~ParquetFileWriter() @ 0x258b863b in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 13. parquet::arrow::FileWriterImpl::~FileWriterImpl() @ 0x2581773d in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 14. parquet::arrow::FileWriterImpl::~FileWriterImpl() @ 0x258177ce in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 15. ./build_docker/../src/Processors/Formats/Impl/ParquetBlockOutputFormat.h:27: DB::ParquetBlockOutputFormat::~ParquetBlockOutputFormat() @ 0x24bb5e98 in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 16. ./build_docker/../contrib/libcxx/include/__memory/shared_ptr.h:173: std::__1::__shared_count::__release_shared() @ 0xfad01d6 in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 17. ./build_docker/../contrib/libcxx/include/__memory/shared_ptr.h:216: std::__1::__shared_weak_count::__release_shared() @ 0xfad0105 in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 18.1. inlined from ./build_docker/../contrib/libcxx/include/__memory/unique_ptr.h:312: std::__1::unique_ptr<DB::WriteBuffer, std::__1::default_delete<DB::WriteBuffer> >::reset(DB::WriteBuffer*)
[ 34577 ] {} <Fatal> BaseDaemon: 18.2. inlined from ../contrib/libcxx/include/__memory/unique_ptr.h:269: ~unique_ptr
[ 34577 ] {} <Fatal> BaseDaemon: 18. ../src/Storages/StorageS3.cpp:566: DB::StorageS3Sink::~StorageS3Sink() @ 0x2419b11d in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 19. ./build_docker/../contrib/libcxx/include/__memory/shared_ptr.h:173: std::__1::__shared_count::__release_shared() @ 0xfad01d6 in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 20. ./build_docker/../contrib/libcxx/include/__memory/shared_ptr.h:216: std::__1::__shared_weak_count::__release_shared() @ 0xfad0105 in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 21. ./build_docker/../contrib/abseil-cpp/absl/container/internal/raw_hash_set.h:1662: absl::lts_20211102::container_internal::raw_hash_set<absl::lts_20211102::container_internal::FlatHashMapPolicy<StringRef, std::__1::shared_ptr<DB::SinkToStorage> >, absl::lts_20211102::hash_internal::Hash<StringRef>, std::__1::equal_to<StringRef>, std::__1::allocator<std::__1::pair<StringRef const, std::__1::shared_ptr<DB::SinkToStorage> > > >::destroy_slots() @ 0x2215afbb in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 22.1. inlined from ./build_docker/../contrib/libcxx/include/string:1445: std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::__is_long() const
[ 34577 ] {} <Fatal> BaseDaemon: 22.2. inlined from ../contrib/libcxx/include/string:2231: ~basic_string
[ 34577 ] {} <Fatal> BaseDaemon: 22. ../src/Storages/PartitionedSink.h:14: DB::PartitionedSink::~PartitionedSink() @ 0x2215aa48 in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 23. ./build_docker/../contrib/libcxx/include/__memory/shared_ptr.h:173: std::__1::__shared_count::__release_shared() @ 0xfad01d6 in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 24. ./build_docker/../contrib/libcxx/include/__memory/shared_ptr.h:216: std::__1::__shared_weak_count::__release_shared() @ 0xfad0105 in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 25. ./build_docker/../contrib/libcxx/include/vector:802: std::__1::vector<std::__1::shared_ptr<DB::IProcessor>, std::__1::allocator<std::__1::shared_ptr<DB::IProcessor> > >::__base_destruct_at_end(std::__1::shared_ptr<DB::IProcessor>*) @ 0xfcc265d in /usr/bin/clickhouse
[ 34577 ] {} <Fatal> BaseDaemon: 26.1. inlined from ./build_docker/../contrib/libcxx/include/vector:402: ~vector
[ 34577 ] {} <Fatal> BaseDaemon: 26.2. inlined from ../src/QueryPipeline/QueryPipeline.cpp:29: ~QueryPipeline
[ 34577 ] {} <Fatal> BaseDaemon: 26. ../src/QueryPipeline/QueryPipeline.cpp:535: DB::QueryPipeline::reset() @ 0x225cc546 in /usr/bin/clickhouse
[ 614 ] {} <Fatal> Application: Child process was terminated by signal 6.
</details>
[1]: https://s3.amazonaws.com/clickhouse-test-reports/37542/8a224239c1d922158b4dc9f5d6609dca836dfd06/stress_test__undefined__actions_.html
Follow-up for: #36979
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-06-01 21:50:30 +03:00
Nikolai Kochetov
86fbb74703
Merge branch 'master' into refactor-read-metrics-and-callbacks
2022-05-31 18:07:47 +00:00
Nikolai Kochetov
1b85f2c1d6
Merge branch 'master' into refactor-read-metrics-and-callbacks
2022-05-25 16:27:40 +02:00
avogar
f782fa31c6
Merge branch 'master' of github.com:ClickHouse/ClickHouse into check-format-on-storage-creation
2022-05-25 08:42:54 +00:00
Nikolai Kochetov
3d84aae0ab
Better.
2022-05-24 20:06:08 +00:00
avogar
37b66c8a9e
Check format name on storage creation
2022-05-23 12:48:48 +00:00
Kruglov Pavel
f539fb835d
Merge branch 'master' into formats-with-names
2022-05-23 12:14:20 +02:00
Nikolai Kochetov
56feef01e7
Move some resources
2022-05-20 19:49:31 +00:00
avogar
2d4b4b9008
Fix inserting defaults for missing values in columnar formats
2022-05-16 14:19:44 +00:00