Kruglov Pavel
53159db782
Merge branch 'master' into numbers-schema-inference
2022-07-26 12:32:49 +02:00
Kruglov Pavel
83c7da6e88
Merge branch 'master' into fix-protobuf-capnp-empty-message
2022-07-25 13:02:41 +02:00
Alexey Milovidov
388d06fda1
Merge pull request #39535 from ClickHouse/stringref
...
Less usage of StringRef
2022-07-25 04:06:11 +03:00
Robert Schulze
4333750985
Less usage of StringRef
...
... replaced by std::string_view, see #39262
2022-07-24 18:33:52 +00:00
Alexander Tokmakov
bed2206ae9
Merge pull request #39460 from ClickHouse/remove_some_dead_and_commented_code
...
Remove some dead and commented code
2022-07-22 13:24:34 +03:00
avogar
794aa691bc
Merge branch 'master' of github.com:ClickHouse/ClickHouse into fix-protobuf-capnp-empty-message
2022-07-21 17:04:37 +00:00
Kruglov Pavel
9252f42b4c
Merge branch 'master' into schema-inference-cache
2022-07-21 18:59:14 +02:00
avogar
fd534aa3fa
Merge branch 'master' of github.com:ClickHouse/ClickHouse into numbers-schema-inference
2022-07-21 15:43:17 +00:00
Alexander Tokmakov
a8da5d96fc
remove some dead and commented code
2022-07-21 15:05:48 +02:00
Nikolai Kochetov
e15967e9db
Merge pull request #38475 from ClickHouse/additional-filters
...
Additional filters for a table (from setting)
2022-07-21 07:52:04 +02:00
Alexey Milovidov
dcda9d3bd1
Merge pull request #39365 from Avogar/fix-capnproto-abort
...
Avoid possible abort() in CapnProto on exception destruction
2022-07-21 05:20:45 +03:00
Nikolai Kochetov
91043351aa
Fixing build.
2022-07-20 20:30:16 +00:00
Kruglov Pavel
46da17ca8c
Merge branch 'master' into numbers-schema-inference
2022-07-20 13:32:39 +02:00
Kruglov Pavel
3046cd6d29
Merge branch 'master' into schema-inference-cache
2022-07-20 13:30:42 +02:00
avogar
784ee11594
Add settings to skip fields with unsupported types in Protobuf/CapnProto schema inference
2022-07-20 11:16:25 +00:00
Kruglov Pavel
a1b63b4a02
Fix style
2022-07-20 12:07:22 +02:00
avogar
4f020654be
Get rid of unneeded ifdefs
2022-07-19 12:12:40 +00:00
avogar
6eb234a1cc
Avoid abort() in capnproto on exception destruction
2022-07-18 19:53:24 +00:00
Robert Schulze
32637cb1b9
Fix build
2022-07-18 07:58:59 +00:00
Robert Schulze
13482af4ee
First try at reducing the use of StringRef
...
- to be replaced by std::string_view
- suggested in #39262
2022-07-17 17:26:02 +00:00
Robert Schulze
deda29b46b
Pass const StringRef by value, not by reference
...
See #39224
2022-07-15 11:34:56 +00:00
Kruglov Pavel
b38241b08a
Merge branch 'master' into schema-inference-cache
2022-07-14 12:29:54 +02:00
avogar
7cde9d3b40
Add new features in schema inference
2022-07-13 15:57:55 +00:00
vdimir
63aebd17b2
Remove TabSeparatedSorted
2022-07-12 20:22:35 +02:00
vdimir
46df417c2e
Fix empty line sorting in TabSeparatedSorted
2022-07-12 20:22:35 +02:00
vdimir
f51b25b262
clickhouse test ignore order via special format
2022-07-12 20:22:35 +02:00
Kruglov Pavel
4080f055b6
Merge pull request #38477 from Avogar/sql-insert-format
...
Add SQLInsert output format
2022-07-04 15:06:33 +02:00
avogar
5b0fd31c64
Put column names in quotes
2022-06-30 16:14:30 +00:00
Antonio Andelic
de264117fd
Merge pull request #38118 from bigo-sg/storagehive_struct_type
...
Add struct type support in `StorageHive`
2022-06-30 09:11:13 +02:00
mergify[bot]
9482c99ab8
Merge branch 'master' into sql-insert-format
2022-06-29 11:03:07 +00:00
Robert Schulze
f692ead6ad
Don't use std::unique_lock unless we have to
...
Replace where possible by std::lock_guard which is more light-weight.
2022-06-28 19:19:06 +00:00
avogar
9bb68bc6de
Add SQLInsert output format
2022-06-27 18:31:57 +00:00
avogar
5155262a16
Add some additional information to cache keys
2022-06-27 12:43:24 +00:00
lgbo-ustc
cd8e5c7c49
update headers
2022-06-23 17:43:54 +08:00
lgbo-ustc
96e6f9a2d0
fixed code style
2022-06-23 16:10:01 +08:00
lgbo-ustc
c1770c22b9
Merge remote-tracking branch 'ck/master' into storagehive_struct_type
2022-06-23 15:54:20 +08:00
Kseniia Sumarokova
e48ce50863
Update ArrowBufferedStreams.cpp
2022-06-20 19:12:51 +02:00
kssenii
5dd1bb2fd8
improvements for getFileSize
2022-06-20 15:22:56 +02:00
lgbo-ustc
8c629085e4
simplified code
2022-06-17 09:36:59 +08:00
lgbo-ustc
35d534c213
nested struct in struct
2022-06-16 16:45:05 +08:00
Alexey Milovidov
5e9e5a4eaf
Merge pull request #37525 from Avogar/avro-structs
...
Support Maps and Records, allow to insert null as default in Avro format
2022-06-15 00:04:29 +03:00
Kseniia Sumarokova
0ae2168fb6
Merge pull request #36328 from bigo-sg/async_hdfs_read_buffer
...
Apply read_method 'threadpool' for StorageHive
2022-06-10 15:04:21 +02:00
taiyang-li
9fd9ff66bd
remove some test code
2022-06-09 09:55:50 +08:00
taiyang-li
c65c56fd48
fix typo
2022-06-07 09:58:29 +08:00
mergify[bot]
ddf7210ecc
Merge branch 'master' into remove-useless-code-2
2022-06-03 13:58:45 +00:00
taiyang-li
f202c35311
Merge branch 'master' into async_hdfs_read_buffer
2022-06-03 17:52:09 +08:00
Paul Loyd
32d267ec6c
Stop removing UTF-8 BOM in RowBinary* formats
...
Fixes #37420
2022-06-01 13:12:55 +08:00
Maksim Kita
bacee7f19c
Merge pull request #37195 from kitaisreal/merging-sorted-algorithm-single-column-specialization
...
MergingSortedAlgorithm single column specialization
2022-05-31 16:46:18 +02:00
taiyang-li
047387bf1c
fix 2 bugs: 1. select count(1) from hive_table; 2. select _file, _path from hive_table
2022-05-31 17:39:02 +08:00
avogar
4c9812d4c1
Allow to skip some of the first rows in CSV/TSV formats
2022-05-25 15:00:11 +00:00
avogar
038a422aeb
Add setting to insert null as default
2022-05-25 12:56:59 +00:00
avogar
7817d6aea3
Support Maps and Records in Avro format
2022-05-25 11:20:28 +00:00
Maksim Kita
83554d1f2d
Fixed style
2022-05-25 13:05:39 +02:00
Maksim Kita
9a9df26eec
Fixed tests
2022-05-25 11:44:37 +02:00
Kruglov Pavel
6c9a524f6b
Merge pull request #37192 from Avogar/formats-with-names
...
Improve performance and memory usage for select of subset of columns for some formats
2022-05-24 13:28:14 +02:00
avogar
3651ef93fe
Fix performance test
2022-05-23 17:42:13 +00:00
avogar
034c7122be
Mark JSONColumns supports subset of columns
2022-05-23 15:26:01 +00:00
avogar
ce4adb447f
Fix named tuples output in ORC/Arrow/Parquet formats
2022-05-23 14:21:08 +00:00
Kruglov Pavel
f539fb835d
Merge branch 'master' into formats-with-names
2022-05-23 12:14:20 +02:00
Kruglov Pavel
ce48e8e102
Merge pull request #36975 from Avogar/json-columns-formats
...
Add columnar JSON formats
2022-05-23 12:11:28 +02:00
Kruglov Pavel
9bc74439c1
Merge pull request #37327 from Avogar/arrow-strings
...
Allow to use String type instead of Binary in Arrow/Parquet/ORC formats
2022-05-23 12:05:33 +02:00
mergify[bot]
747aa5575c
Merge branch 'master' into remove-useless-code-2
2022-05-22 17:41:57 +00:00
Anton Popov
cb0e6c2718
mark all operators bool() as explicit
2022-05-20 15:29:54 +00:00
avogar
566d1b15fd
Merge branch 'master' of github.com:ClickHouse/ClickHouse into formats-with-names
2022-05-20 13:54:52 +00:00
avogar
d2304f5d15
Make better
2022-05-20 12:07:29 +00:00
avogar
a6a430c5ee
Merge branch 'master' of github.com:ClickHouse/ClickHouse into json-columns-formats
2022-05-20 11:08:30 +00:00
mergify[bot]
1ac4199e78
Merge branch 'master' into arrow-strings
2022-05-20 10:43:33 +00:00
avogar
cd6a29897e
Apply input_format_max_rows_to_read_for_schema_inference for all files in globs in total
2022-05-18 17:56:36 +00:00
Kruglov Pavel
d81616ff65
Remove unnecessary include
2022-05-18 17:44:39 +02:00
avogar
a0369fb9a6
Allow to use String type instead of Binary in Arrow/Parquet/ORC formats
2022-05-18 14:51:21 +00:00
avogar
12010a81b7
Make better
2022-05-18 09:25:26 +00:00
Robert Schulze
0c55ac76d2
A few clangtidy updates
...
Enable:
- bugprone-lambda-function-name: "Checks for attempts to get the name of
  a function from within a lambda expression. The name of a lambda is
  always something like operator(), which is almost never what was
  intended."
- bugprone-unhandled-self-assignment: "Finds user-defined copy
  assignment operators which do not protect the code against
  self-assignment either by checking self-assignment explicitly or using
  the copy-and-swap or the copy-and-move method."
- hicpp-invalid-access-moved: "Warns if an object is used after it has
  been moved."
- hicpp-use-noexcept: "This check replaces deprecated dynamic exception
  specifications with the appropriate noexcept specification (introduced
  in C++11)"
- hicpp-use-override: "Adds override (introduced in C++11) to overridden
  virtual functions and removes virtual from those functions as it is
  not required."
- performance-type-promotion-in-math-fn: "Finds calls to C math library
  functions (from math.h or, in C++, cmath) with implicit float to
  double promotions."
Split up:
- cppcoreguidelines-*. Some of them may be useful (haven't checked in
  detail), therefore allow to toggle them individually.
Disable:
- linuxkernel-*. Obvious.
2022-05-17 20:56:57 +02:00
Kruglov Pavel
8572879c37
Remove redundant code
2022-05-16 17:58:20 +02:00
Robert Schulze
e3cfec5b09
Merge remote-tracking branch 'origin/master' into clangtidies
2022-05-16 10:12:50 +02:00
avogar
68bb07d166
Better naming
2022-05-13 18:39:19 +00:00
avogar
cef13c2c02
Allow to skip unknown columns in Native format
2022-05-13 14:27:15 +00:00
avogar
b17fec659a
Improve performance and memory usage for select of subset of columns for some formats
2022-05-13 13:51:28 +00:00
mergify[bot]
4a661b6e78
Merge branch 'master' into json-columns-formats
2022-05-13 11:32:03 +00:00
avogar
02679c7222
Fix tests
2022-05-10 16:27:59 +00:00
avogar
ea0362b3a3
Fix tests
2022-05-10 16:20:38 +00:00
avogar
9abdacdd2e
Remove logging
2022-05-09 13:30:41 +00:00
avogar
054318b555
Fix invalid output LowCardinality -> ArrowDictionary
2022-05-09 13:29:42 +00:00
avogar
1e8d7ae749
Fix
2022-05-09 11:29:40 +00:00
avogar
04fdd75c56
Make JSONColumns formats mono block by default
2022-05-09 11:13:44 +00:00
Robert Schulze
1b81bb49b4
Enable clang-tidy modernize-deprecated-headers & hicpp-deprecated-headers
...
Official docs:
Some headers from C library were deprecated in C++ and are no longer
welcome in C++ codebases. Some have no effect in C++. For more details
refer to the C++ 14 Standard [depr.c.headers] section. This check
replaces C standard library headers with their C++ alternatives and
removes redundant ones.
2022-05-09 08:23:33 +02:00
Robert Schulze
7d3913f350
Enable clang-tidy bugprone-assert-side-effect
...
Official docs:
Finds assert() with side effect. The condition of assert() is
evaluated only in debug builds so a condition with side effect can
cause different behavior in debug / release builds.
2022-05-08 19:15:55 +02:00
avogar
3a13c3e372
Fix comments
2022-05-06 16:50:34 +00:00
avogar
62a7ba3f26
Add columnar JSON formats
2022-05-06 16:48:48 +00:00
Kruglov Pavel
77e55c344c
Merge pull request #36667 from Avogar/mysqldump-format
...
Add MySQLDump input format
2022-05-04 19:49:48 +02:00
Kruglov Pavel
ffec3655fe
Fix special build
2022-05-04 17:14:15 +02:00
mergify[bot]
64084b5e32
Merge branch 'master' into shared_ptr_helper3
2022-05-03 20:46:16 +00:00
Dmitry Novik
5ba7a55c18
Merge pull request #36650 from bigo-sg/hive_text_parallel_parsing
...
Parallel parsing of hive text format
2022-05-03 15:56:28 +02:00
Kruglov Pavel
d613f7eab0
Merge branch 'master' into mysqldump-format
2022-05-02 13:31:57 +02:00
Antonio Andelic
a1a22b0007
Merge pull request #35149 from ContentSquare/nullables_with_proto3
...
Nullables with proto3 using Google wrappers
2022-05-02 09:49:37 +02:00
Robert Schulze
330212e0f4
Remove inherited create() method + disallow copying
...
The original motivation for this commit was that shared_ptr_helper used
std::shared_ptr<>() which does two heap allocations instead of
make_shared<>() which does a single allocation. Turned out that
1. the affected code (--> Storages/) is not on a hot path (rendering the
   performance argument moot ...)
2. yet copying Storage objects is potentially dangerous and was
   previously allowed.
Hence, this change
- removes shared_ptr_helper and as a result all inherited create() methods,
- instead, Storage objects are now created using make_shared<>() by the
  caller (for that to work, many constructors had to be made public), and
- all Storage classes were marked as noncopyable using boost::noncopyable.
In sum, we are (likely) not making things faster but the code becomes
cleaner and harder to misuse.
2022-05-02 08:46:52 +02:00
Robert Schulze
89aa9ae00f
Fixed clang-tidy check "bugprone-branch-clone"
...
The check is currently *not* part of .clang-tidy. It complains about:
(1) "switch has multiple consecutive identical branches"
(2) "repeated branch in conditional chain"
About (1): Lots of findings in switches were about redundant
"[[fallthrough]]" in places where the compiler would not warn anyways. I
have cleaned these up.
About (2): In if-else_if-else chains, fixing the warning would usually
mean concatenating multiple if-conditions. As this would reduce
readability in most cases, I did not fix these places.
Because of (2), I also refrained from adding "bugprone-branch-clone" to
.clang-tidy.
2022-04-30 19:40:28 +02:00
mergify[bot]
cc08ccb420
Merge branch 'master' into remove-useless-code-2
2022-04-30 12:48:15 +00:00
Jakub Kuklis
a1f2dd6d34
Adding two settings in place of one, improvements to the test clarity
2022-04-29 10:01:51 +02:00
Jakub Kuklis
507ba1042c
Adding a setting to enable Google wrappers special treatment
2022-04-29 10:01:51 +02:00
Jakub Kuklis
6d5c1e2fc0
Adding a setting to enable special treatment of google wrappers
2022-04-29 10:01:50 +02:00
Amos Bird
4a5e4274f0
base should not depend on Common
2022-04-29 10:26:35 +08:00
avogar
d295de1689
Fix comments and test
2022-04-28 14:59:35 +00:00
Kruglov Pavel
4d08587559
Merge branch 'master' into mysqldump-format
2022-04-28 15:58:18 +02:00
Kseniia Sumarokova
4c371f710e
Merge pull request #36676 from kssenii/refactor-with-size-buffer
...
Better version of SeekableReadBufferWithSize
2022-04-28 13:44:25 +02:00
taiyang-li
99aa5fdc81
remove useless code
2022-04-27 11:15:04 +08:00
vdimir
81b86799e7
Fixup PrometheusTextOutputFormat
2022-04-26 14:57:37 +00:00
vdimir
d5d98ed951
PrometheusTextOutputFormat: support labels, histograms and summaries
2022-04-26 14:57:36 +00:00
vdimir
be0aa06958
Add output format Prometheus
2022-04-26 14:57:35 +00:00
kssenii
9d364cdce2
Refactor
2022-04-26 15:33:53 +02:00
Kruglov Pavel
a462d94157
Fix error codes
2022-04-26 13:25:07 +02:00
Kruglov Pavel
e3b222b519
Fix typo
2022-04-26 13:24:10 +02:00
avogar
33d845dade
Add MySQLDump input format
2022-04-26 10:42:56 +00:00
taiyang-li
99dee35b6e
parallel parsing of hive text format
2022-04-26 14:33:10 +08:00
avogar
80eacc8533
Merge branch 'master' of github.com:ClickHouse/ClickHouse into improve-json-schema-inference
2022-04-22 17:18:44 +00:00
Kseniia Sumarokova
33bb48106f
Merge pull request #36314 from CurtizJ/print-bad-filenames
...
Show names of erroneous files in case of parsing errors while executing table functions
2022-04-22 13:24:55 +02:00
mergify[bot]
e38a3c3595
Merge branch 'master' into alias
2022-04-21 15:02:30 +00:00
Alexander Tokmakov
1d30a97fd2
Merge branch 'master' into remove-useless-code-2
2022-04-20 11:45:56 +02:00
Robert Schulze
b24ca8de52
Fix various clang-tidy warnings
...
When I tried to add cool new clang-tidy 14 warnings, I noticed that the
current clang-tidy settings already produce a ton of warnings. This
commit addresses many of these. Almost all of them were non-critical,
e.g. C vs. C++ style casts.
2022-04-20 10:29:05 +02:00
Anton Popov
3e361c9759
Merge remote-tracking branch 'upstream/master' into HEAD
2022-04-19 14:18:04 +00:00
mergify[bot]
4fed033dca
Merge branch 'master' into alias
2022-04-17 14:37:04 +00:00
fenglv
2392d4e2b5
fix
2022-04-16 16:08:28 +00:00
fenglv
58111115c5
fix style
2022-04-16 06:21:09 +00:00
fenglv
74ef1b0198
Add aliases JSONLines and NDJSON for JSONEachRow
2022-04-16 06:01:07 +00:00
Anton Popov
2de6668b3f
show names of erroneous files
2022-04-16 00:10:47 +00:00
Alexey Milovidov
cbeeb7ec4f
Remove Arcadia
2022-04-16 00:20:47 +02:00
avogar
42726639f3
Check ORC/Parquet/Arrow format magic bytes before loading file in memory
2022-04-13 19:27:38 +00:00
avogar
8b60aeb7bc
Improve schema inference for json objects
2022-04-13 19:13:40 +00:00
Alexey Milovidov
a54c01cf72
Remove useless code in ReplicatedMergeTreeRestartingThread
2022-04-11 00:44:30 +02:00
avogar
1c783ed88a
Resolve conflicts
2022-04-07 12:17:48 +00:00
avogar
d2017a63b1
Merge branch 'master' of github.com:ClickHouse/ClickHouse into improve-schema-inference
2022-04-07 11:36:40 +00:00
Kruglov Pavel
f3f8f27db5
Merge pull request #35735 from Avogar/allow-read-bools-as-numbers
...
Allow to infer and parse bools as numbers in JSON input formats
2022-04-07 13:20:49 +02:00
taiyang-li
2ef316801c
Merge branch 'master' into use_minmax_index
2022-04-07 10:53:25 +08:00
Kruglov Pavel
ec2213493f
Merge branch 'master' into allow-read-bools-as-numbers
2022-04-06 14:53:02 +02:00
Kruglov Pavel
9141066de3
Merge branch 'master' into improve-schema-inference
2022-04-06 13:51:07 +02:00
taiyang-li
acb9f1632e
support skip splits in orc and parquet
2022-04-06 16:40:22 +08:00
mergify[bot]
1e43e26fa1
Merge branch 'master' into fix-order
2022-04-02 12:00:29 +00:00
avogar
ab2a963287
Merge branch 'master' of github.com:ClickHouse/ClickHouse into allow-read-bools-as-numbers
2022-03-31 14:09:43 +00:00
mergify[bot]
24ade25d61
Merge branch 'master' into improve-schema-inference
2022-03-31 13:42:47 +00:00
avogar
3fc36627b3
Allow to infer and parse bools as numbers in JSON input formats
2022-03-29 17:37:31 +00:00
avogar
ce97ccbfb9
Improve schema inference for JSONEachRow and TSKV formats
2022-03-29 14:47:51 +00:00
Antonio Andelic
9990abb76a
Use compile-time check for Exception messages, fix wrong messages
2022-03-29 13:16:11 +00:00
avogar
97f5033ea9
Fix tests
2022-03-29 13:07:37 +00:00
mergify[bot]
343588de2c
Merge branch 'master' into improve-schema-inference
2022-03-29 13:06:00 +00:00
Anton Popov
d677635cd8
Merge pull request #35592 from CurtizJ/dynamic-columns-4
...
Add parallel parsing and schema inference for format `JSONAsObject`
2022-03-28 19:29:55 +02:00
avogar
6fb3c3be04
Fix comments and build
2022-03-25 12:02:21 +00:00
Kruglov Pavel
d45143ffe0
Merge branch 'master' into improve-schema-inference
2022-03-25 12:05:40 +01:00
Anton Popov
78100abc5f
add parallel parsing and schema inference for type Object
2022-03-24 17:51:35 +00:00
avogar
557edbd172
Add some improvements and fixes in schema inference
2022-03-24 12:54:12 +00:00
mergify[bot]
bf90edc362
Merge branch 'master' into case-insensitive-column-matching
2022-03-24 08:00:42 +00:00
Kruglov Pavel
826b933b08
Merge pull request #35332 from Avogar/fix-tskv-schema-inference
...
Fix schema inference for TSKV format while using small max_read_buffer_size
2022-03-23 18:37:07 +01:00