mirror of https://github.com/ClickHouse/ClickHouse.git, synced 2024-11-21 15:12:02 +00:00

Add brief docs about apache arrow

parent ab37791134
commit 95e2f0fa07

@@ -37,6 +37,8 @@ The supported formats are:
 | [Avro](#data-format-avro) | ✔ | ✔ |
 | [AvroConfluent](#data-format-avro-confluent) | ✔ | ✗ |
 | [Parquet](#data-format-parquet) | ✔ | ✔ |
+| [Arrow](#data-format-arrow) | ✔ | ✔ |
+| [ArrowStream](#data-format-arrow-stream) | ✔ | ✔ |
 | [ORC](#data-format-orc) | ✔ | ✗ |
 | [RowBinary](#rowbinary) | ✔ | ✔ |
 | [RowBinaryWithNamesAndTypes](#rowbinarywithnamesandtypes) | ✔ | ✔ |
@@ -985,9 +987,9 @@ See also [how to read/write length-delimited protobuf messages in popular langua

 ## Avro {#data-format-avro}

-[Apache Avro](http://avro.apache.org/) is a row-oriented data serialization framework developed within Apache’s Hadoop project.
+[Apache Avro](https://avro.apache.org/) is a row-oriented data serialization framework developed within Apache’s Hadoop project.

-ClickHouse Avro format supports reading and writing [Avro data files](http://avro.apache.org/docs/current/spec.html#Object+Container+Files).
+ClickHouse Avro format supports reading and writing [Avro data files](https://avro.apache.org/docs/current/spec.html#Object+Container+Files).

 ### Data Types Matching {#data_types-matching}

@@ -1009,7 +1011,7 @@ The table below shows supported data types and how they match ClickHouse [data t
 | `long (timestamp-millis)` \* | [DateTime64(3)](../sql-reference/data-types/datetime.md) | `long (timestamp-millis)` \* |
 | `long (timestamp-micros)` \* | [DateTime64(6)](../sql-reference/data-types/datetime.md) | `long (timestamp-micros)` \* |

-\* [Avro logical types](http://avro.apache.org/docs/current/spec.html#Logical+Types)
+\* [Avro logical types](https://avro.apache.org/docs/current/spec.html#Logical+Types)

 Unsupported Avro data types: `record` (non-root), `map`

@@ -1095,7 +1097,7 @@ SELECT * FROM topic1_stream;

 ## Parquet {#data-format-parquet}

-[Apache Parquet](http://parquet.apache.org/) is a columnar storage format widespread in the Hadoop ecosystem. ClickHouse supports read and write operations for this format.
+[Apache Parquet](https://parquet.apache.org/) is a columnar storage format widespread in the Hadoop ecosystem. ClickHouse supports read and write operations for this format.

 ### Data Types Matching {#data_types-matching-2}

@@ -1141,6 +1143,16 @@ $ clickhouse-client --query="SELECT * FROM {some_table} FORMAT Parquet" > {some_

 To exchange data with Hadoop, you can use [HDFS table engine](../engines/table-engines/integrations/hdfs.md).

+## Arrow {#data-format-arrow}
+
+[Apache Arrow](https://arrow.apache.org/) comes with two built-in columnar storage formats. ClickHouse supports read and write operations for these formats.
+
+`Arrow` is Apache Arrow's "file mode" format. It is designed for in-memory random access.
+
+## ArrowStream {#data-format-arrow-stream}
+
+`ArrowStream` is Apache Arrow's "stream mode" format. It is designed for in-memory stream processing.
+
 ## ORC {#data-format-orc}

 [Apache ORC](https://orc.apache.org/) is a columnar storage format widespread in the Hadoop ecosystem. You can only insert data in this format to ClickHouse.
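
A note on telling the two Arrow formats in this change apart: per the Arrow IPC specification, a "file mode" payload begins and ends with the magic bytes `ARROW1`, while a "stream mode" payload carries no such marker. The sketch below uses that to pick the matching ClickHouse format name. It assumes the input is already known to be Arrow IPC data and does not validate it; the function name `clickhouse_arrow_format` is hypothetical and is not part of ClickHouse or the Arrow libraries.

```python
# Magic bytes that open and close an Arrow IPC file-mode payload.
ARROW_FILE_MAGIC = b"ARROW1"


def clickhouse_arrow_format(data: bytes) -> str:
    """Guess the ClickHouse format name for Arrow IPC bytes.

    Assumption: `data` is valid Arrow IPC data. Anything that does not
    carry the file-mode magic at both ends is treated as stream mode.
    """
    has_magic = (
        len(data) >= 2 * len(ARROW_FILE_MAGIC)
        and data.startswith(ARROW_FILE_MAGIC)
        and data.endswith(ARROW_FILE_MAGIC)
    )
    return "Arrow" if has_magic else "ArrowStream"
```

The returned name could then be placed after `FORMAT` in a ClickHouse query, in the same way the Parquet example in this file uses `FORMAT Parquet`.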