Commit Graph

65 Commits

Author SHA1 Message Date
Alexander Tokmakov
70d1adfe4b
Better formatting for exception messages (#45449)
* save format string for NetException

* format exceptions

* format exceptions 2

* format exceptions 3

* format exceptions 4

* format exceptions 5

* format exceptions 6

* fix

* format exceptions 7

* format exceptions 8

* Update MergeTreeIndexGin.cpp

* Update AggregateFunctionMap.cpp

* Update AggregateFunctionMap.cpp

* fix
2023-01-24 00:13:58 +03:00
Alexey Milovidov
e855d3519a
Merge branch 'master' into refactoring-ip-types 2023-01-02 21:58:53 +03:00
Kruglov Pavel
894726bd8f
Merge branch 'master' into improve-streaming-engines 2022-12-29 22:59:45 +01:00
Yakov Olkhovskiy
bf9194f405 review suggestions 2022-12-07 21:29:17 +00:00
Dmitry Novik
15b8c48ca9 Cleanup code 2022-12-02 19:15:26 +00:00
Dmitry Novik
2c70dbc76a Refactor FunctionNode 2022-12-02 19:15:26 +00:00
Yakov Olkhovskiy
4d144be39c replace domain IP types (IPv4, IPv6) with native 2022-11-14 14:17:17 +00:00
Kruglov Pavel
b124875257
Merge branch 'master' into improve-streaming-engines 2022-11-03 13:22:06 +01:00
avogar
8e13d1f1ec Improve and refactor Kafka/StorageMQ/NATS and data formats 2022-10-28 16:41:10 +00:00
Azat Khuzhin
4e76629aaf Fixes for -Wshorten-64-to-32
- lots of static_cast
- add safe_cast
- types adjustments
  - config
  - IStorage::read/watch
  - ...
- some TODOs (to convert types in the future)

P.S. That was quite a journey...

v2: fixes after rebase
v3: fix conflicts after #42308 merged
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-21 13:25:19 +02:00
Alexander Tokmakov
4175f8cde6 abort instead of __builtin_unreachable in debug builds 2022-10-07 21:49:08 +02:00
Alexey Milovidov
84f42e0874 Fix 3/4 of trash 2022-09-19 08:50:53 +02:00
avogar
17a271ec30 Fix error codes 2022-07-20 14:33:46 +00:00
avogar
784ee11594 Add settings to skip fields with unsupported types in Protobuf/CapnProto schema inference 2022-07-20 11:16:25 +00:00
avogar
3f81aadb60 Fix schema inference in case of empty messages in Protobuf/CapnProto formats 2022-07-18 17:53:33 +00:00
avogar
9291d33080 Pass const std::string_view & by value, not by reference 2022-07-14 16:11:57 +00:00
Antonio Andelic
a1a22b0007
Merge pull request #35149 from ContentSquare/nullables_with_proto3
Nullables with proto3 using Google wrappers
2022-05-02 09:49:37 +02:00
Robert Schulze
89aa9ae00f
Fixed clang-tidy check "bugprone-branch-clone"
The check is currently *not* part of .clang-tidy. It complains about:
(1) "switch has multiple consecutive identical branches"
(2) "repeated branch in conditional chain"

About (1): Lots of findings in switches were about redundant
"[[fallthrough]]" in places where the compiler would not warn anyway. I
have cleaned these up.

About (2): In if-else_if-else chains, fixing the warning would usually
mean concatenating multiple if-conditions. As this would reduce
readability in most cases, I did not fix these places.

Because of (2), I also refrained from adding "bugprone-branch-clone" to
.clang-tidy.
2022-04-30 19:40:28 +02:00
Jakub Kuklis
a1f2dd6d34 Adding two settings in place of one, improvements to the test clarity 2022-04-29 10:01:51 +02:00
Jakub Kuklis
e73fa271a2 Minor improvements 2022-04-29 10:01:51 +02:00
Jakub Kuklis
5ca095c779 Pass the setting to buildFieldSerializer to fix undeclared 2022-04-29 10:01:51 +02:00
Jakub Kuklis
e705425374 Minor improvements 2022-04-29 10:01:51 +02:00
Jakub Kuklis
5c34585a00 Improve the test clarity 2022-04-29 10:01:51 +02:00
Jakub Kuklis
f19e473482 Remove local change 2022-04-29 10:01:51 +02:00
Jakub Kuklis
507ba1042c Adding a setting to enable Google wrappers special treatment 2022-04-29 10:01:51 +02:00
Jakub Kuklis
6d5c1e2fc0 Adding a setting to enable special treatment of google wrappers 2022-04-29 10:01:50 +02:00
Jakub Kuklis
b7a8acc302 Alternative design for output, more messy, but the default value inside Google wrapper is not serialized 2022-04-29 10:01:50 +02:00
Jakub Kuklis
53e2454800 Corrected the behaviour for Proto Nullable output 2022-04-29 10:01:50 +02:00
Jakub Kuklis
10425c17b2 Write empty values for Google wrappers 2022-04-29 10:01:50 +02:00
Jakub Kuklis
ff49fad1f1 Another const keyword corrections for debug build 2022-04-29 10:01:50 +02:00
Jakub Kuklis
08ee7470f0 const keyword corrections for debug build 2022-04-29 10:01:50 +02:00
Jakub Kuklis
c0acc4dfa0 Fixing assert 2022-04-29 10:01:50 +02:00
Jakub Kuklis
7a78197746 Style corrections 2022-04-29 10:01:50 +02:00
Jakub Kuklis
ae1194bf9c Nullables detection in protobuf using Google wrappers 2022-04-29 10:01:50 +02:00
Amos Bird
4a5e4274f0
base should not depend on Common 2022-04-29 10:26:35 +08:00
Robert Schulze
6e1d7a31bc
Fix build + typo 2022-03-17 11:41:20 +01:00
Robert Schulze
23122cb327
Fix review comments
ParquetBlockOutputFormat.cpp:
- undo unrelated formatting

ProtobufSerializer.cpp:
- undef debug tracing
- simplify logic in writeRow()

ProtobufSchemas.cpp:
- restore original search in cache by message type
2022-03-15 11:27:17 +01:00
Robert Schulze
514d4d2187
Implement ProtobufList - fixes ClickHouse#16436
Introduce IO format "ProtobufList" with protobuf schema

    // schemafile.proto
    message Envelope {
      message MessageType {
        uint32 colA = 1;
        string colB = 2;
      }
      repeated MessageType mt = 1;
    }

where "Envelope" is a hard-coded/expected top-level message and
"MessageType" is a message with user-provided name containing the table
fields to export/import, e.g.

    SELECT * FROM db1.tab1 FORMAT ProtobufList SETTINGS format_schema =
    'schemafile:MessageType'

As a result, the new format wraps a list of messages (one per row) into
a single, containing message. Compare that to the schema of the existing
IO formats "Protobuf" and "ProtobufSingle":

    message MessageType {
      uint32 colA = 1;
      string colB = 2;
    }

The new format does not save space compared to the existing formats, but
it is conceptually a bit more beautiful and also more convenient.
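
To make the size comparison concrete, here is a minimal sketch of the two layouts on the wire, hand-encoded in Python. It is not ClickHouse code: the helper names (`encode_varint`, `encode_row`, `encode_envelope`, `encode_length_delimited_stream`) are hypothetical, and it assumes the existing "Protobuf" format writes each row as a varint length followed by the message bytes. Under that assumption, the "Envelope" layout spends one extra tag byte per row, which is consistent with the remark that the new format does not save space.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_row(col_a: int, col_b: str) -> bytes:
    """Encode one MessageType { uint32 colA = 1; string colB = 2; }.

    Simplification: both fields are always written, even when they
    hold default values.
    """
    payload = col_b.encode("utf-8")
    return (
        b"\x08" + encode_varint(col_a)                      # field 1, varint
        + b"\x12" + encode_varint(len(payload)) + payload   # field 2, bytes
    )


def encode_envelope(rows) -> bytes:
    """Encode Envelope { repeated MessageType mt = 1; } (ProtobufList layout)."""
    out = bytearray()
    for col_a, col_b in rows:
        row = encode_row(col_a, col_b)
        out += b"\x0a" + encode_varint(len(row)) + row      # field 1, nested message
    return bytes(out)


def encode_length_delimited_stream(rows) -> bytes:
    """Encode rows as varint-length-prefixed messages (assumed "Protobuf" layout)."""
    out = bytearray()
    for col_a, col_b in rows:
        row = encode_row(col_a, col_b)
        out += encode_varint(len(row)) + row
    return bytes(out)


if __name__ == "__main__":
    rows = [(1, "a"), (300, "bc")]
    print(len(encode_envelope(rows)), len(encode_length_delimited_stream(rows)))
```

The only difference between the two encoders is the `0x0a` tag in front of each row, so any size gap grows linearly with the number of rows.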

Implementation details:

- Created new files ProtobufList(Input|Output)Format which use the
  existing ProtobufSerializer mechanism. The goal was to reuse as much
  code as possible and avoid copypasta.

- I was torn between inheriting from I(Input|Output)Format vs.
  IRow(Input|Output)Format for ProtobufList(Input|Output)Format. The
  former is chunk-based which can be better for performance. Since the
  ProtobufSerializer mechanism is row-based but data is generally passed
  around in chunks, I chose the latter to leverage the existing
  chunk <--> row mapping code in IRow(Input|Output)Format.

- A new ProtobufSerializer called ProtobufSerializerEnvelope was
  introduced (--> ProtobufSerializer.cpp). It represents the top-level
  message which encloses the list of inner nested messages, i.e. the
  rows.

- With the new format, parsing the schema file and matching the fields in
  the schema file to table columns works as in the old formats. The only
  difference is that parsing starts one level below the "Envelope" (-->
  ProtobufSchemas.cpp). This is more natural than forcing customers to
  have table columns start with "Envelope".

- Creation of the ProtobufSerializer tree also works like before. What
  is different is that we finally add a ProtobufSerializerEnvelope as
  the new root of the tree. Its only purpose is to write/read the top-level
  message for the first/last row to write/read.

Caveats:

- The low-level serialization code in ProtobufWriter uses an internal
  buffer which is flushed to the output file only in endMessage().
  In the existing "Protobuf" format, this happens once per row; in the
  new format, it happens only at the end of the serialization
  since row-level messages now call start/endNestedMessage(). As a
  future TODO, the buffer should also be flushed in
  start/endNestedMessage() to reduce memory consumption.
2022-03-14 08:04:58 +01:00
Maksim Kita
5ef83deaa6 Update sort to pdqsort 2022-01-30 19:49:48 +00:00
avogar
dd994aa761 Add some tests and some fixes 2021-12-29 12:18:56 +03:00
avogar
8112a71233 Implement schema inference for most input formats 2021-12-29 12:18:56 +03:00
alesapin
884801e1bd Fixing 2021-12-14 19:08:08 +03:00
Alexey Milovidov
71926a3a97 Fix surprisingly bad code in function "file" 2021-12-13 07:57:54 +03:00
Vitaly Baranov
73092942ea Take into account nested structures while filling missing columns while reading protobuf. 2021-12-10 21:11:06 +03:00
Vitaly Baranov
d709782088
Merge pull request #31988 from vitlibar/fix-skipping-columns-while-writing-protobuf
Fix skipping columns while writing protobuf
2021-12-05 18:01:11 +03:00
Vitaly Baranov
2e0b480044 Improve error handling while serializing protobufs. 2021-12-04 21:42:45 +03:00
Vitaly Baranov
15e3dbe3f2 Fix skipping columns in Nested while writing protobuf. 2021-12-04 18:00:02 +03:00
kssenii
37f482d478 Merge branch 'master' of github.com:ClickHouse/ClickHouse into versioning 2021-11-15 07:31:11 +00:00
Alexey Milovidov
fe6b7c77c7 Rename "common" to "base" 2021-10-02 10:13:14 +03:00
Alexey Milovidov
8adaef7c8e Make text format for Decimal tuneable 2021-08-16 11:03:23 +03:00