110 Commits

Author SHA1 Message Date
Kruglov Pavel
46a6b84a5a Merge branch 'master' into auto-format-detection 2024-01-25 22:11:07 +01:00
Maksim Kita
2a327107b6 Updated implementation 2024-01-25 14:31:49 +03:00
avogar
617cc514b7 Try to detect file format automatically during schema inference if it's unknown 2024-01-23 18:59:39 +00:00
Nikolai Kochetov
eff6232418 Merge branch 'master' into try-to-remove-pk-analysis-on-ast 2024-01-05 10:54:46 +00:00
Nikolai Kochetov
7a271f09ed Check if I can remove KeyCondition analysis on AST. 2024-01-03 17:50:46 +00:00
Nikolai Kochetov
c808b03e55 Remove unneeded code 2024-01-02 17:27:33 +00:00
Nikolai Kochetov
5521e5d9b1 Refactor StorageHDFS and StorageFile virtual columns filtering 2023-12-29 15:58:01 +00:00
avogar
ee7af95bc0 Merge branch 'master' of github.com:ClickHouse/ClickHouse into schema-inference-union 2023-12-08 20:29:28 +00:00
zvonand
c329e382ab resolve conflicts 2023-11-29 16:03:07 +01:00
avogar
4d9a1b50f9 Add information about new _size virtual column in file/s3/url/hdfs/azure table functions 2023-11-28 18:15:07 +00:00
zvonand
c306d21b54 merge master + resolve conflicts 2023-11-28 15:51:21 +01:00
zvonand
5153798aeb Introduced fileCluster table function
Added fileCluster function
Added test and docs
2023-11-22 15:06:04 +01:00
avogar
6934e27e8b Add union mode for schema inference to infer union schema of files with different schemas 2023-10-20 20:46:41 +00:00
avogar
2d8f33bfa2 Fix parsing error in WithNames formats while reading subset of columns with disabled input_format_with_names_use_header 2023-09-11 14:55:37 +00:00
Antonio Andelic
f406019413 Apply PR comments 2023-08-30 09:26:01 +00:00
Antonio Andelic
9b99f25d75 Improve schema inference 2023-08-28 13:11:52 +00:00
Antonio Andelic
5a0c2ca108 Merge branch 'master' into archive-improvements-2 2023-08-28 08:34:42 +00:00
Antonio Andelic
8e1d38d377 Merge branch 'master' into archive-improvements-2 2023-08-24 13:03:36 +00:00
Kruglov Pavel
7e362a2110 Merge branch 'master' into fast-count-from-files 2023-08-23 15:13:20 +02:00
Antonio Andelic
83d4b819f3 Better support for reading from archives 2023-08-23 08:10:30 +00:00
avogar
47304bf7aa Optimize count from files in most input formats 2023-08-21 12:30:52 +00:00
avogar
4c32097df3 Use filter by file/path before reading in url/file/hdfs table functions, reduce code duplication 2023-08-17 16:54:43 +00:00
Antonio Andelic
d2b6646fc2 Merge branch 'master' into add-reading-from-archives 2023-08-04 12:42:46 +00:00
Antonio Andelic
9fb86f134b Fix tests 2023-07-31 12:04:27 +00:00
Antonio Andelic
e83e0ec2cd Fix build 2023-07-28 12:26:56 +00:00
Antonio Andelic
720d587e85 Merge branch 'master' into add-reading-from-archives 2023-07-28 08:49:00 +00:00
avogar
98aa6b317f Support reading subcolumns from file/s3/hdfs/url/azureBlobStorage table functions 2023-07-04 21:17:26 +00:00
Kruglov Pavel
873cee9451 Merge pull request #49626 from alekseygolub/renamefile
Added option to rename files, loaded via TableFunctionFile, after successful processing
2023-06-12 15:01:22 +02:00
Antonio Andelic
b11f744252 Correctly disable async insert with deduplication when it's not needed (#50663)
* Correctly disable async insert when it's not used

* Better

* Add comment

* Better

* Fix tests

---------

Co-authored-by: Nikita Mikhaylov <mikhaylovnikitka@gmail.com>
2023-06-07 20:33:08 +02:00
nikitakeba
f604fb82b2 Merge branch 'master' into add-reading-from-archives-support 2023-05-29 23:34:19 +03:00
Nikita Keba
c18bff58b3 fix style 2023-05-29 20:08:18 +00:00
Nikita Keba
564691e25b add reading from archives 2023-05-25 00:00:32 +00:00
alekseygolub
c85c3afa1f Added option to rename files, loaded via TableFunctionFile, after success processing 2023-05-19 16:03:22 +00:00
Michael Kolupaev
3bd1489f18 Propagate input_format_parquet_preserve_order to parallelizeOutputAfterReading() 2023-05-05 04:20:27 +00:00
Michael Kolupaev
eb3b774ad0 Better control over Parquet row group size 2023-05-04 14:59:55 -07:00
Alexey Milovidov
9e1db557e0 Merge pull request #48612 from ClickHouse/remove-strange-code-2
Remove strange code
2023-04-11 06:17:45 +03:00
Alexey Milovidov
23a0879452 Remove strange code 2023-04-10 21:17:08 +02:00
Azat Khuzhin
79b83c4fd2 Remove superfluous includes of logger_userful.h from headers
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-04-10 17:59:30 +02:00
Azat Khuzhin
4e76629aaf Fixes for -Wshorten-64-to-32
- lots of static_cast
- add safe_cast
- types adjustments
  - config
  - IStorage::read/watch
  - ...
- some TODO's (to convert types in future)

P.S. That was quite a journey...

v2: fixes after rebase
v3: fix conflicts after #42308 merged
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-21 13:25:19 +02:00
avogar
9b1a267203 Refactor, remove TTL, add size limit, add system table and system query 2022-08-05 16:20:15 +00:00
avogar
5155262a16 Add some additional information to cache keys 2022-06-27 12:43:24 +00:00
avogar
d37ad2e6de Implement cache for schema inference for file/s3/hdfs/url 2022-06-21 13:02:48 +00:00
avogar
a4cf07708c Fix comments 2022-05-20 14:57:27 +00:00
avogar
68bb07d166 Better naming 2022-05-13 18:39:19 +00:00
avogar
b17fec659a Improve performance and memory usage for select of subset of columns for some formats 2022-05-13 13:51:28 +00:00
Robert Schulze
777b5bc15b Don't let storages inherit from boost::noncopyable
... IStorage has deleted copy ctor / assignment already
2022-05-03 09:07:08 +02:00
Robert Schulze
330212e0f4 Remove inherited create() method + disallow copying
The original motivation for this commit was that shared_ptr_helper used
std::shared_ptr<>() which does two heap allocations instead of
make_shared<>() which does a single allocation. Turned out that
1. the affected code (--> Storages/) is not on a hot path (rendering the
   performance argument moot ...)
2. yet copying Storage objects is potentially dangerous and was
   previously allowed.

Hence, this change

- removes shared_ptr_helper and as a result all inherited create() methods,

- instead, Storage objects are now created using make_shared<>() by the
  caller (for that to work, many constructors had to be made public), and

- all Storage classes were marked as noncopyable using boost::noncopyable.

In sum, we are (likely) not making things faster but the code becomes
cleaner and harder to misuse.
2022-05-02 08:46:52 +02:00
Amos Bird
4a5e4274f0 base should not depend on Common 2022-04-29 10:26:35 +08:00
Anton Popov
df3b07fe7c Merge remote-tracking branch 'upstream/master' into HEAD 2022-03-03 22:25:28 +00:00
Anton Popov
c1fdcf7a64 Merge remote-tracking branch 'upstream/master' into HEAD 2022-03-01 20:21:39 +03:00