ClickHouse/docs/en/formats/capnproto.rst
Marek Vavruša 64a892c0e6 DataStreams: CapnProto uses <format_schema_path> config option
This addresses one of the remarks in the PR.

All format schemas are required to be in the <format_schema_path> directory.
This makes loading schema files less tedious, as the path can be omitted.
2017-11-15 23:17:22 +03:00

30 lines
1.1 KiB
ReStructuredText

CapnProto
---------
Cap'n Proto is a binary message format. Like Protocol Buffers and Thrift (but unlike JSON or MessagePack), Cap'n Proto messages are strongly-typed and not self-describing. Due to this, it requires a ``schema`` setting to specify schema file and the root object. The schema is parsed on runtime and cached for each SQL statement.
.. code-block:: sql
SELECT SearchPhrase, count() AS c FROM test.hits
GROUP BY SearchPhrase FORMAT CapnProto SETTINGS schema = 'schema:Message'
When the `schema.capnp` schema file looks like:
.. code-block:: text
struct Message {
SearchPhrase @0 :Text;
c @1 :Uint64;
}
The schema files are located in the path specified in the configuration file:
.. code-block:: xml
<!-- Directory containing schema files for various input formats. -->
<format_schema_path>format_schemas/</format_schema_path>
Deserialization is almost as efficient as the binary rows format, with typically zero allocation overhead per message.
You can use this format as an efficient exchange message format in your data processing pipeline.