Introduce IO format "ProtobufList" with protobuf schema
    // schemafile.proto
    message Envelope {
      message MessageType {
        uint32 colA = 1;
        string colB = 2;
      }
      repeated MessageType mt = 1;
    }
where "Envelope" is a hard-coded/expected top-level message and
"MessageType" is a message with user-provided name containing the table
fields to export/import, e.g.
    SELECT * FROM db1.tab1 FORMAT ProtobufList SETTINGS format_schema =
    'schemafile:MessageType'
As a result, the new format wraps a list of messages (one per row) into
a single, containing message. Compare that to the schema of the existing
IO formats "Protobuf" and "ProtobufSingle":
    message MessageType {
      uint32 colA = 1;
      string colB = 2;
    }
The new format does not save space compared to the existing formats, but
it is conceptually a bit more beautiful and also more convenient.
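To make the size comparison concrete, here is a minimal sketch that
hand-encodes the protobuf wire format for the two layouts. It is not
ClickHouse code: the function names are made up for illustration, and it
assumes that the existing "Protobuf" format writes a varint length before
each row message and that both fields are always written.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_row(col_a: int, col_b: str) -> bytes:
    """Serialize one MessageType { uint32 colA = 1; string colB = 2; }."""
    text = col_b.encode("utf-8")
    return (
        b"\x08" + encode_varint(col_a)               # tag: field 1, varint
        + b"\x12" + encode_varint(len(text)) + text  # tag: field 2, length-delimited
    )


def encode_length_delimited(rows) -> bytes:
    """Assumed layout of the existing format: <varint length><message> per row."""
    out = bytearray()
    for col_a, col_b in rows:
        msg = encode_row(col_a, col_b)
        out += encode_varint(len(msg)) + msg
    return bytes(out)


def encode_envelope(rows) -> bytes:
    """Envelope { repeated MessageType mt = 1; }: each row is field 1 of one message."""
    out = bytearray()
    for col_a, col_b in rows:
        msg = encode_row(col_a, col_b)
        out += b"\x0a" + encode_varint(len(msg)) + msg  # tag: field 1, length-delimited
    return bytes(out)
```

Under these assumptions the envelope costs one extra tag byte per row,
which matches the remark that the new format does not save space.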
Implementation details:
- Created new files ProtobufList(Input|Output)Format which use the
existing ProtobufSerializer mechanism. The goal was to reuse as much
code as possible and avoid copypasta.
- I was torn between inheriting from I(Input|Output)Format vs.
IRow(Input|Output)Format for ProtobufList(Input|Output)Format. The
former is chunk-based, which can be better for performance. Since the
ProtobufSerializer mechanism is row-based but data is generally passed
around in chunks, I decided on the latter to leverage the existing
chunk <--> row mapping code in IRow(Input|Output)Format.
- A new ProtobufSerializer called ProtobufSerializerEnvelope was
introduced (--> ProtobufSerializer.cpp). It represents the top-level
message which encloses the list of inner nested messages, i.e. the
rows.
- With the new format, parsing the schema file and matching the fields in
the schema file to table columns works the same as for the old formats.
The only difference is that parsing starts one level below the "Envelope"
(--> ProtobufSchema.cpp). This is more natural than forcing customers to
have table columns start with "Envelope".
- Creation of the ProtobufSerializer tree also works like before. What
is different is that we finally add a ProtobufSerializerEnvelope as
the new root of the tree. Its only purpose is to write/read the top-level
message when the first/last row is written/read.
Caveats:
- The low-level serialization code in ProtobufWriter uses an internal
buffer which is flushed to the output file only in endMessage().
In the existing "Protobuf" format, this happens once per row. In the
new format, this happens only at the end of the serialization,
since row-level messages now call start/endNestedMessage(). As a
future TODO, the buffer should also be flushed in
start/endNestedMessage() to reduce memory consumption.
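The memory effect of this caveat can be illustrated with a toy model. This
is only a sketch, not the actual ProtobufWriter: the class and function
names are hypothetical, framing bytes are ignored, and the only behavior
modeled is "the internal buffer reaches the sink in end_message() and
nowhere else".

```python
import io


class ToyBufferedWriter:
    """Hypothetical writer that hands data to the sink only in end_message()."""

    def __init__(self, sink: io.BytesIO):
        self.sink = sink
        self.pending = bytearray()  # internal buffer
        self.peak_pending = 0       # largest buffer size observed

    def write(self, data: bytes) -> None:
        self.pending += data
        self.peak_pending = max(self.peak_pending, len(self.pending))

    def end_nested_message(self) -> None:
        pass  # nothing is flushed here; this is the behavior the caveat describes

    def end_message(self) -> None:
        self.sink.write(bytes(self.pending))
        self.pending.clear()


def write_one_message_per_row(rows, sink: io.BytesIO) -> int:
    """One top-level message per row: the buffer is flushed after every row."""
    writer = ToyBufferedWriter(sink)
    for row in rows:
        writer.write(row)
        writer.end_message()
    return writer.peak_pending


def write_rows_in_envelope(rows, sink: io.BytesIO) -> int:
    """Rows are nested messages: the buffer is flushed once, at the very end."""
    writer = ToyBufferedWriter(sink)
    for row in rows:
        writer.write(row)
        writer.end_nested_message()
    writer.end_message()
    return writer.peak_pending
```

In this model the first function never buffers more than one row, while
the second buffers the whole result before anything reaches the sink.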
SELECT from a Buffer table is racy: you can get data from the
underlying table but not from the Buffer itself, since the Buffer can
flush its data to the underlying table in parallel with the SELECT.
This is hard to avoid with the current architecture, since it would
require holding a lock until the data has been read from the Buffer, and
that is not a good alternative.
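The interleaving can be replayed deterministically with plain lists. This
is a sketch, not StorageBuffer code: the function name is hypothetical,
and it assumes the SELECT reads the underlying table first and the Buffer
second, with the background flush landing in between.

```python
def select_from_buffer_table(dest: list, buf: list, flush_between_reads: bool) -> list:
    """Read the underlying table, then the Buffer, as two separate steps."""
    result = list(dest)          # step 1: rows already in the underlying table
    if flush_between_reads:      # background flush runs between the two reads
        dest.extend(buf)
        buf.clear()
    result += list(buf)          # step 2: rows still sitting in the Buffer
    return result
```

With one row in each place, the call without a flush returns both rows;
with the flush in between it returns only the row that was already in the
underlying table, although no data was lost.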
So let's fix the test instead, by not relying on the background flush (TTL
increased).
Here is an example of a test failure [1]:
2022.03.12 20:56:58.141182 [ 678 ] {011e7d25-82a9-4ab6-8cb0-dcbbc84f9581} <Debug> executeQuery: (from [::1]:33324) (comment: 01506_buffer_table_alter_block_structure_2.sql) SELECT * FROM buf ORDER BY timestamp;
2022.03.12 20:56:58.162709 [ 678 ] {011e7d25-82a9-4ab6-8cb0-dcbbc84f9581} <Trace> MergeTreeInOrderSelectProcessor: Reading 1 ranges in order from part 20200101_1_1_0, approx. 1 rows starting from 0
2022.03.12 20:56:59.144663 [ 615 ] {} <Trace> test_bdtzgu.buf_dest (79ba36b2-0e90-4bbb-b55f-a42b605b362b): Renaming temporary part tmp_insert_20200101_2_2_0 to 20200101_2_2_0.
2022.03.12 20:56:59.147550 [ 615 ] {} <Debug> StorageBuffer (test_bdtzgu.buf): Flushing buffer with 1 rows, 18 bytes, age 1 seconds, took 19 ms (bg).
2022.03.12 20:56:59.391774 [ 678 ] {011e7d25-82a9-4ab6-8cb0-dcbbc84f9581} <Information> executeQuery: Read 1 rows, 13.00 B in 1.250102785 sec., 0 rows/sec., 10.40 B/sec.
[1]: https://s3.amazonaws.com/clickhouse-test-reports/0/044cd6b861c1f4f00c6c24c4020799b676de6d34/stateless_tests__memory__actions__[1/3].html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>