Commit Graph

98 Commits

Author SHA1 Message Date
zvonand
5153798aeb Introduced fileCluster table function
Added fileCluster function
Added test and docs
2023-11-22 15:06:04 +01:00
avogar
2d8f33bfa2 Fix parsing error in WithNames formats while reading subset of columns with disabled input_format_with_names_use_header 2023-09-11 14:55:37 +00:00
Antonio Andelic
f406019413 Apply PR comments 2023-08-30 09:26:01 +00:00
Antonio Andelic
9b99f25d75 Improve schema inference 2023-08-28 13:11:52 +00:00
Antonio Andelic
5a0c2ca108 Merge branch 'master' into archive-improvements-2 2023-08-28 08:34:42 +00:00
Antonio Andelic
8e1d38d377 Merge branch 'master' into archive-improvements-2 2023-08-24 13:03:36 +00:00
Kruglov Pavel
7e362a2110 Merge branch 'master' into fast-count-from-files 2023-08-23 15:13:20 +02:00
Antonio Andelic
83d4b819f3 Better support for reading from archives 2023-08-23 08:10:30 +00:00
avogar
47304bf7aa Optimize count from files in most input formats 2023-08-21 12:30:52 +00:00
avogar
4c32097df3 Use filter by file/path before reading in url/file/hdfs table functions, reduce code duplication 2023-08-17 16:54:43 +00:00
Antonio Andelic
d2b6646fc2 Merge branch 'master' into add-reading-from-archives 2023-08-04 12:42:46 +00:00
Antonio Andelic
9fb86f134b Fix tests 2023-07-31 12:04:27 +00:00
Antonio Andelic
e83e0ec2cd Fix build 2023-07-28 12:26:56 +00:00
Antonio Andelic
720d587e85 Merge branch 'master' into add-reading-from-archives 2023-07-28 08:49:00 +00:00
avogar
98aa6b317f Support reading subcolumns from file/s3/hdfs/url/azureBlobStorage table functions 2023-07-04 21:17:26 +00:00
Kruglov Pavel
873cee9451 Merge pull request #49626 from alekseygolub/renamefile
Added option to rename files, loaded via TableFunctionFile, after successful processing
2023-06-12 15:01:22 +02:00
Antonio Andelic
b11f744252 Correctly disable async insert with deduplication when it's not needed (#50663)
* Correctly disable async insert when it's not used

* Better

* Add comment

* Better

* Fix tests

---------

Co-authored-by: Nikita Mikhaylov <mikhaylovnikitka@gmail.com>
2023-06-07 20:33:08 +02:00
nikitakeba
f604fb82b2 Merge branch 'master' into add-reading-from-archives-support 2023-05-29 23:34:19 +03:00
Nikita Keba
c18bff58b3 fix style 2023-05-29 20:08:18 +00:00
Nikita Keba
564691e25b add reading from archives 2023-05-25 00:00:32 +00:00
alekseygolub
c85c3afa1f Added option to rename files, loaded via TableFunctionFile, after success processing 2023-05-19 16:03:22 +00:00
Michael Kolupaev
3bd1489f18 Propagate input_format_parquet_preserve_order to parallelizeOutputAfterReading() 2023-05-05 04:20:27 +00:00
Michael Kolupaev
eb3b774ad0 Better control over Parquet row group size 2023-05-04 14:59:55 -07:00
Alexey Milovidov
9e1db557e0 Merge pull request #48612 from ClickHouse/remove-strange-code-2
Remove strange code
2023-04-11 06:17:45 +03:00
Alexey Milovidov
23a0879452 Remove strange code 2023-04-10 21:17:08 +02:00
Azat Khuzhin
79b83c4fd2 Remove superfluous includes of logger_userful.h from headers
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-04-10 17:59:30 +02:00
Azat Khuzhin
4e76629aaf Fixes for -Wshorten-64-to-32
- lots of static_cast
- add safe_cast
- types adjustments
  - config
  - IStorage::read/watch
  - ...
- some TODO's (to convert types in future)

P.S. That was quite a journey...

v2: fixes after rebase
v3: fix conflicts after #42308 merged
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-21 13:25:19 +02:00
avogar
9b1a267203 Refactor, remove TTL, add size limit, add system table and system query 2022-08-05 16:20:15 +00:00
avogar
5155262a16 Add some additional information to cache keys 2022-06-27 12:43:24 +00:00
avogar
d37ad2e6de Implement cache for schema inference for file/s3/hdfs/url 2022-06-21 13:02:48 +00:00
avogar
a4cf07708c Fix comments 2022-05-20 14:57:27 +00:00
avogar
68bb07d166 Better naming 2022-05-13 18:39:19 +00:00
avogar
b17fec659a Improve performance and memory usage for select of subset of columns for some formats 2022-05-13 13:51:28 +00:00
Robert Schulze
777b5bc15b Don't let storages inherit from boost::noncopyable
... IStorage has deleted copy ctor / assignment already
2022-05-03 09:07:08 +02:00
Robert Schulze
330212e0f4 Remove inherited create() method + disallow copying
The original motivation for this commit was that shared_ptr_helper used
std::shared_ptr<>() which does two heap allocations instead of
make_shared<>() which does a single allocation. Turned out that
1. the affected code (--> Storages/) is not on a hot path (rendering the
   performance argument moot ...)
2. yet copying Storage objects is potentially dangerous and was
   previously allowed.

Hence, this change

- removes shared_ptr_helper and as a result all inherited create() methods,

- instead, Storage objects are now created using make_shared<>() by the
  caller (for that to work, many constructors had to be made public), and

- all Storage classes were marked as noncopyable using boost::noncopyable.

In sum, we are (likely) not making things faster but the code becomes
cleaner and harder to misuse.
2022-05-02 08:46:52 +02:00
Amos Bird
4a5e4274f0 base should not depend on Common 2022-04-29 10:26:35 +08:00
Anton Popov
df3b07fe7c Merge remote-tracking branch 'upstream/master' into HEAD 2022-03-03 22:25:28 +00:00
Anton Popov
c1fdcf7a64 Merge remote-tracking branch 'upstream/master' into HEAD 2022-03-01 20:21:39 +03:00
kssenii
092ec45b47 Merge master 2022-03-01 12:06:56 +01:00
Nikita Mikhaylov
d6036f6da3 Better
(cherry picked from commit 4ae445c9e227581ea9f1cbe9aa9d1ba82e1236c9)
2022-02-28 15:27:52 +00:00
kssenii
9b64a8fe39 Fix odbc bridge 2022-02-28 14:29:05 +01:00
Anton Popov
836a348a9c Merge remote-tracking branch 'upstream/master' into HEAD 2022-02-01 15:23:07 +03:00
Kruglov Pavel
a9d0beb7ae Fix data race in StorageFile (#34113)
* Fix data race in StorageFile

* Update StorageFile.h

* Fix
2022-01-31 11:58:40 +03:00
Anton Popov
78b9f15abb Merge remote-tracking branch 'upstream/master' into HEAD 2022-01-30 03:24:37 +03:00
Kruglov Pavel
7873b4475f Merge branch 'master' into autodetect-format 2022-01-25 10:56:52 +03:00
avogar
a6740d2f9a Detect format and schema for stdin in clickhouse-local 2022-01-25 10:25:37 +03:00
Anton Popov
e8ce091e68 Merge remote-tracking branch 'upstream/master' into HEAD 2022-01-21 20:11:18 +03:00
avogar
97788b9c21 Allow to create new files on insert for File/S3/HDFS engines 2021-12-29 21:19:13 +03:00
avogar
8112a71233 Implement schema inference for most input formats 2021-12-29 12:18:56 +03:00
Anton Popov
a20922b2d3 Merge remote-tracking branch 'origin/sparse-serialization' into HEAD 2021-11-09 15:36:25 +03:00