Merge branch 'master' into new-nav

This commit is contained in:
Rich Raposa 2023-03-06 14:09:39 -07:00 committed by GitHub
commit 97447d3ef2
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
25 changed files with 466 additions and 133 deletions

View File

@ -29,7 +29,7 @@ RUN arch=${TARGETARCH:-amd64} \
esac
ARG REPOSITORY="https://s3.amazonaws.com/clickhouse-builds/22.4/31c367d3cd3aefd316778601ff6565119fe36682/package_release"
ARG VERSION="23.2.1.2537"
ARG VERSION="23.2.3.17"
ARG PACKAGES="clickhouse-keeper"
# user/group precreated explicitly with fixed uid/gid on purpose.

View File

@ -33,7 +33,7 @@ RUN arch=${TARGETARCH:-amd64} \
# lts / testing / prestable / etc
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="https://packages.clickhouse.com/tgz/${REPO_CHANNEL}"
ARG VERSION="23.2.1.2537"
ARG VERSION="23.2.3.17"
ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static"
# user/group precreated explicitly with fixed uid/gid on purpose.

View File

@ -22,7 +22,7 @@ RUN sed -i "s|http://archive.ubuntu.com|${apt_archive}|g" /etc/apt/sources.list
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="deb https://packages.clickhouse.com/deb ${REPO_CHANNEL} main"
ARG VERSION="23.2.1.2537"
ARG VERSION="23.2.3.17"
ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static"
# set non-empty deb_location_url url to create a docker image

View File

@ -0,0 +1,23 @@
---
sidebar_position: 1
sidebar_label: 2023
---
# 2023 Changelog
### ClickHouse release v23.2.3.17-stable (dec18bf7281) FIXME as compared to v23.2.2.20-stable (f6c269c8df2)
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Backported in [#46907](https://github.com/ClickHouse/ClickHouse/issues/46907): - Fix incorrect alias recursion in QueryNormalizer. [#46609](https://github.com/ClickHouse/ClickHouse/pull/46609) ([Raúl Marín](https://github.com/Algunenano)).
* Backported in [#47091](https://github.com/ClickHouse/ClickHouse/issues/47091): - Fix arithmetic operations in aggregate optimization with `min` and `max`. [#46705](https://github.com/ClickHouse/ClickHouse/pull/46705) ([Duc Canh Le](https://github.com/canhld94)).
* Backported in [#46885](https://github.com/ClickHouse/ClickHouse/issues/46885): Fix MSan report in the `maxIntersections` function. This closes [#43126](https://github.com/ClickHouse/ClickHouse/issues/43126). [#46847](https://github.com/ClickHouse/ClickHouse/pull/46847) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Backported in [#47067](https://github.com/ClickHouse/ClickHouse/issues/47067): Fix typo in systemd service, which causes the systemd service start to fail. [#47051](https://github.com/ClickHouse/ClickHouse/pull/47051) ([Palash Goel](https://github.com/palash-goel)).
* Backported in [#47259](https://github.com/ClickHouse/ClickHouse/issues/47259): Fix concrete columns PREWHERE support. [#47154](https://github.com/ClickHouse/ClickHouse/pull/47154) ([Azat Khuzhin](https://github.com/azat)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Use /etc/default/clickhouse in systemd too [#47003](https://github.com/ClickHouse/ClickHouse/pull/47003) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* do flushUntrackedMemory when context switches [#47102](https://github.com/ClickHouse/ClickHouse/pull/47102) ([Sema Checherinda](https://github.com/CheSema)).
* Update typing for a new PyGithub version [#47123](https://github.com/ClickHouse/ClickHouse/pull/47123) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).

View File

@ -1232,50 +1232,52 @@ Each row is formatted as a single document and each column is formatted as a sin
For output it uses the following correspondence between ClickHouse types and BSON types:
| ClickHouse type | BSON Type |
|-----------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------|
| [Bool](/docs/en/sql-reference/data-types/boolean.md) | `\x08` boolean |
| [Int8/UInt8](/docs/en/sql-reference/data-types/int-uint.md) | `\x10` int32 |
| [Int16UInt16](/docs/en/sql-reference/data-types/int-uint.md) | `\x10` int32 |
| [Int32](/docs/en/sql-reference/data-types/int-uint.md) | `\x10` int32 |
| [UInt32](/docs/en/sql-reference/data-types/int-uint.md) | `\x12` int64 |
| [Int64/UInt64](/docs/en/sql-reference/data-types/int-uint.md) | `\x12` int64 |
| [Float32/Float64](/docs/en/sql-reference/data-types/float.md) | `\x01` double |
| [Date](/docs/en/sql-reference/data-types/date.md)/[Date32](/docs/en/sql-reference/data-types/date32.md) | `\x10` int32 |
| [DateTime](/docs/en/sql-reference/data-types/datetime.md) | `\x12` int64 |
| [DateTime64](/docs/en/sql-reference/data-types/datetime64.md) | `\x09` datetime |
| [Decimal32](/docs/en/sql-reference/data-types/decimal.md) | `\x10` int32 |
| [Decimal64](/docs/en/sql-reference/data-types/decimal.md) | `\x12` int64 |
| [Decimal128](/docs/en/sql-reference/data-types/decimal.md) | `\x05` binary, `\x00` binary subtype, size = 16 |
| [Decimal256](/docs/en/sql-reference/data-types/decimal.md) | `\x05` binary, `\x00` binary subtype, size = 32 |
| [Int128/UInt128](/docs/en/sql-reference/data-types/int-uint.md) | `\x05` binary, `\x00` binary subtype, size = 16 |
| [Int256/UInt256](/docs/en/sql-reference/data-types/int-uint.md) | `\x05` binary, `\x00` binary subtype, size = 32 |
| ClickHouse type | BSON Type |
|-----------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------|
| [Bool](/docs/en/sql-reference/data-types/boolean.md) | `\x08` boolean |
| [Int8/UInt8](/docs/en/sql-reference/data-types/int-uint.md) | `\x10` int32 |
| [Int16UInt16](/docs/en/sql-reference/data-types/int-uint.md) | `\x10` int32 |
| [Int32](/docs/en/sql-reference/data-types/int-uint.md) | `\x10` int32 |
| [UInt32](/docs/en/sql-reference/data-types/int-uint.md) | `\x12` int64 |
| [Int64/UInt64](/docs/en/sql-reference/data-types/int-uint.md) | `\x12` int64 |
| [Float32/Float64](/docs/en/sql-reference/data-types/float.md) | `\x01` double |
| [Date](/docs/en/sql-reference/data-types/date.md)/[Date32](/docs/en/sql-reference/data-types/date32.md) | `\x10` int32 |
| [DateTime](/docs/en/sql-reference/data-types/datetime.md) | `\x12` int64 |
| [DateTime64](/docs/en/sql-reference/data-types/datetime64.md) | `\x09` datetime |
| [Decimal32](/docs/en/sql-reference/data-types/decimal.md) | `\x10` int32 |
| [Decimal64](/docs/en/sql-reference/data-types/decimal.md) | `\x12` int64 |
| [Decimal128](/docs/en/sql-reference/data-types/decimal.md) | `\x05` binary, `\x00` binary subtype, size = 16 |
| [Decimal256](/docs/en/sql-reference/data-types/decimal.md) | `\x05` binary, `\x00` binary subtype, size = 32 |
| [Int128/UInt128](/docs/en/sql-reference/data-types/int-uint.md) | `\x05` binary, `\x00` binary subtype, size = 16 |
| [Int256/UInt256](/docs/en/sql-reference/data-types/int-uint.md) | `\x05` binary, `\x00` binary subtype, size = 32 |
| [String](/docs/en/sql-reference/data-types/string.md)/[FixedString](/docs/en/sql-reference/data-types/fixedstring.md) | `\x05` binary, `\x00` binary subtype or \x02 string if setting output_format_bson_string_as_string is enabled |
| [UUID](/docs/en/sql-reference/data-types/uuid.md) | `\x05` binary, `\x04` uuid subtype, size = 16 |
| [Array](/docs/en/sql-reference/data-types/array.md) | `\x04` array |
| [Tuple](/docs/en/sql-reference/data-types/tuple.md) | `\x04` array |
| [Named Tuple](/docs/en/sql-reference/data-types/tuple.md) | `\x03` document |
| [Map](/docs/en/sql-reference/data-types/map.md) (with String keys) | `\x03` document |
| [UUID](/docs/en/sql-reference/data-types/uuid.md) | `\x05` binary, `\x04` uuid subtype, size = 16 |
| [Array](/docs/en/sql-reference/data-types/array.md) | `\x04` array |
| [Tuple](/docs/en/sql-reference/data-types/tuple.md) | `\x04` array |
| [Named Tuple](/docs/en/sql-reference/data-types/tuple.md) | `\x03` document |
| [Map](/docs/en/sql-reference/data-types/map.md) (with String keys) | `\x03` document |
| [IPv4](/docs/en/sql-reference/data-types/domains/ipv4.md) | `\x10` int32 |
| [IPv6](/docs/en/sql-reference/data-types/domains/ipv6.md) | `\x05` binary, `\x00` binary subtype |
For input it uses the following correspondence between BSON types and ClickHouse types:
| BSON Type | ClickHouse Type |
|------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `\x01` double | [Float32/Float64](/docs/en/sql-reference/data-types/float.md) |
| `\x02` string | [String](/docs/en/sql-reference/data-types/string.md)/[FixedString](/docs/en/sql-reference/data-types/fixedstring.md) |
| `\x03` document | [Map](/docs/en/sql-reference/data-types/map.md)/[Named Tuple](/docs/en/sql-reference/data-types/tuple.md) |
| `\x04` array | [Array](/docs/en/sql-reference/data-types/array.md)/[Tuple](/docs/en/sql-reference/data-types/tuple.md) |
| `\x05` binary, `\x00` binary subtype | [String](/docs/en/sql-reference/data-types/string.md)/[FixedString](/docs/en/sql-reference/data-types/fixedstring.md) |
| `\x05` binary, `\x02` old binary subtype | [String](/docs/en/sql-reference/data-types/string.md)/[FixedString](/docs/en/sql-reference/data-types/fixedstring.md) |
| `\x05` binary, `\x03` old uuid subtype | [UUID](/docs/en/sql-reference/data-types/uuid.md) |
| `\x05` binary, `\x04` uuid subtype | [UUID](/docs/en/sql-reference/data-types/uuid.md) |
| `\x07` ObjectId | [String](/docs/en/sql-reference/data-types/string.md)/[FixedString](/docs/en/sql-reference/data-types/fixedstring.md) |
| `\x08` boolean | [Bool](/docs/en/sql-reference/data-types/boolean.md) |
| `\x09` datetime | [DateTime64](/docs/en/sql-reference/data-types/datetime64.md) |
| `\x0A` null value | [NULL](/docs/en/sql-reference/data-types/nullable.md) |
| `\x0D` JavaScript code | [String](/docs/en/sql-reference/data-types/string.md)/[FixedString](/docs/en/sql-reference/data-types/fixedstring.md) |
| `\x0E` symbol | [String](/docs/en/sql-reference/data-types/string.md)/[FixedString](/docs/en/sql-reference/data-types/fixedstring.md) |
| `\x10` int32 | [Int32/UInt32](/docs/en/sql-reference/data-types/int-uint.md)/[Decimal32](/docs/en/sql-reference/data-types/decimal.md) |
| BSON Type | ClickHouse Type |
|------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `\x01` double | [Float32/Float64](/docs/en/sql-reference/data-types/float.md) |
| `\x02` string | [String](/docs/en/sql-reference/data-types/string.md)/[FixedString](/docs/en/sql-reference/data-types/fixedstring.md) |
| `\x03` document | [Map](/docs/en/sql-reference/data-types/map.md)/[Named Tuple](/docs/en/sql-reference/data-types/tuple.md) |
| `\x04` array | [Array](/docs/en/sql-reference/data-types/array.md)/[Tuple](/docs/en/sql-reference/data-types/tuple.md) |
| `\x05` binary, `\x00` binary subtype | [String](/docs/en/sql-reference/data-types/string.md)/[FixedString](/docs/en/sql-reference/data-types/fixedstring.md)/[IPv6](/docs/en/sql-reference/data-types/domains/ipv6.md) |
| `\x05` binary, `\x02` old binary subtype | [String](/docs/en/sql-reference/data-types/string.md)/[FixedString](/docs/en/sql-reference/data-types/fixedstring.md) |
| `\x05` binary, `\x03` old uuid subtype | [UUID](/docs/en/sql-reference/data-types/uuid.md) |
| `\x05` binary, `\x04` uuid subtype | [UUID](/docs/en/sql-reference/data-types/uuid.md) |
| `\x07` ObjectId | [String](/docs/en/sql-reference/data-types/string.md)/[FixedString](/docs/en/sql-reference/data-types/fixedstring.md) |
| `\x08` boolean | [Bool](/docs/en/sql-reference/data-types/boolean.md) |
| `\x09` datetime | [DateTime64](/docs/en/sql-reference/data-types/datetime64.md) |
| `\x0A` null value | [NULL](/docs/en/sql-reference/data-types/nullable.md) |
| `\x0D` JavaScript code | [String](/docs/en/sql-reference/data-types/string.md)/[FixedString](/docs/en/sql-reference/data-types/fixedstring.md) |
| `\x0E` symbol | [String](/docs/en/sql-reference/data-types/string.md)/[FixedString](/docs/en/sql-reference/data-types/fixedstring.md) |
| `\x10` int32 | [Int32/UInt32](/docs/en/sql-reference/data-types/int-uint.md)/[Decimal32](/docs/en/sql-reference/data-types/decimal.md)/[IPv4](/docs/en/sql-reference/data-types/domains/ipv4.md) |
| `\x12` int64 | [Int64/UInt64](/docs/en/sql-reference/data-types/int-uint.md)/[Decimal64](/docs/en/sql-reference/data-types/decimal.md)/[DateTime64](/docs/en/sql-reference/data-types/datetime64.md) |
Other BSON types are not supported. Also, it performs conversion between different integer types (for example, you can insert BSON int32 value into ClickHouse UInt8).
@ -1608,23 +1610,25 @@ See also [Format Schema](#formatschema).
The table below shows supported data types and how they match ClickHouse [data types](/docs/en/sql-reference/data-types/index.md) in `INSERT` and `SELECT` queries.
| CapnProto data type (`INSERT`) | ClickHouse data type | CapnProto data type (`SELECT`) |
|--------------------------------|-----------------------------------------------------------|--------------------------------|
| `UINT8`, `BOOL` | [UInt8](/docs/en/sql-reference/data-types/int-uint.md) | `UINT8` |
| `INT8` | [Int8](/docs/en/sql-reference/data-types/int-uint.md) | `INT8` |
| `UINT16` | [UInt16](/docs/en/sql-reference/data-types/int-uint.md), [Date](/docs/en/sql-reference/data-types/date.md) | `UINT16` |
| `INT16` | [Int16](/docs/en/sql-reference/data-types/int-uint.md) | `INT16` |
| `UINT32` | [UInt32](/docs/en/sql-reference/data-types/int-uint.md), [DateTime](/docs/en/sql-reference/data-types/datetime.md) | `UINT32` |
| `INT32` | [Int32](/docs/en/sql-reference/data-types/int-uint.md) | `INT32` |
| `UINT64` | [UInt64](/docs/en/sql-reference/data-types/int-uint.md) | `UINT64` |
| `INT64` | [Int64](/docs/en/sql-reference/data-types/int-uint.md), [DateTime64](/docs/en/sql-reference/data-types/datetime.md) | `INT64` |
| `FLOAT32` | [Float32](/docs/en/sql-reference/data-types/float.md) | `FLOAT32` |
| `FLOAT64` | [Float64](/docs/en/sql-reference/data-types/float.md) | `FLOAT64` |
| `TEXT, DATA` | [String](/docs/en/sql-reference/data-types/string.md), [FixedString](/docs/en/sql-reference/data-types/fixedstring.md) | `TEXT, DATA` |
| `union(T, Void), union(Void, T)` | [Nullable(T)](/docs/en/sql-reference/data-types/date.md) | `union(T, Void), union(Void, T)` |
| `ENUM` | [Enum(8\|16)](/docs/en/sql-reference/data-types/enum.md) | `ENUM` |
| `LIST` | [Array](/docs/en/sql-reference/data-types/array.md) | `LIST` |
| `STRUCT` | [Tuple](/docs/en/sql-reference/data-types/tuple.md) | `STRUCT` |
| CapnProto data type (`INSERT`) | ClickHouse data type | CapnProto data type (`SELECT`) |
|----------------------------------|------------------------------------------------------------------------------------------------------------------------|------------------------------|
| `UINT8`, `BOOL` | [UInt8](/docs/en/sql-reference/data-types/int-uint.md) | `UINT8` |
| `INT8` | [Int8](/docs/en/sql-reference/data-types/int-uint.md) | `INT8` |
| `UINT16` | [UInt16](/docs/en/sql-reference/data-types/int-uint.md), [Date](/docs/en/sql-reference/data-types/date.md) | `UINT16` |
| `INT16` | [Int16](/docs/en/sql-reference/data-types/int-uint.md) | `INT16` |
| `UINT32` | [UInt32](/docs/en/sql-reference/data-types/int-uint.md), [DateTime](/docs/en/sql-reference/data-types/datetime.md) | `UINT32` |
| `INT32` | [Int32](/docs/en/sql-reference/data-types/int-uint.md) | `INT32` |
| `UINT64` | [UInt64](/docs/en/sql-reference/data-types/int-uint.md) | `UINT64` |
| `INT64` | [Int64](/docs/en/sql-reference/data-types/int-uint.md), [DateTime64](/docs/en/sql-reference/data-types/datetime.md) | `INT64` |
| `FLOAT32` | [Float32](/docs/en/sql-reference/data-types/float.md) | `FLOAT32` |
| `FLOAT64` | [Float64](/docs/en/sql-reference/data-types/float.md) | `FLOAT64` |
| `TEXT, DATA` | [String](/docs/en/sql-reference/data-types/string.md), [FixedString](/docs/en/sql-reference/data-types/fixedstring.md) | `TEXT, DATA` |
| `union(T, Void), union(Void, T)` | [Nullable(T)](/docs/en/sql-reference/data-types/date.md) | `union(T, Void), union(Void, T)` |
| `ENUM` | [Enum(8\ |16)](/docs/en/sql-reference/data-types/enum.md) | `ENUM` |
| `LIST` | [Array](/docs/en/sql-reference/data-types/array.md) | `LIST` |
| `STRUCT` | [Tuple](/docs/en/sql-reference/data-types/tuple.md) | `STRUCT` |
| `UINT32` | [IPv4](/docs/en/sql-reference/data-types/domains/ipv4.md) | `UINT32` |
| `DATA` | [IPv6](/docs/en/sql-reference/data-types/domains/ipv6.md) | `DATA` |
For working with `Enum` in CapnProto format use the [format_capn_proto_enum_comparising_mode](/docs/en/operations/settings/settings-formats.md/#format_capn_proto_enum_comparising_mode) setting.
@ -1804,21 +1808,23 @@ ClickHouse Avro format supports reading and writing [Avro data files](https://av
The table below shows supported data types and how they match ClickHouse [data types](/docs/en/sql-reference/data-types/index.md) in `INSERT` and `SELECT` queries.
| Avro data type `INSERT` | ClickHouse data type | Avro data type `SELECT` |
|---------------------------------------------|----------------------------------------------------------------------------------------------------|------------------------------|
| `boolean`, `int`, `long`, `float`, `double` | [Int(8\|16\|32)](/docs/en/sql-reference/data-types/int-uint.md), [UInt(8\|16\|32)](/docs/en/sql-reference/data-types/int-uint.md) | `int` |
| `boolean`, `int`, `long`, `float`, `double` | [Int64](/docs/en/sql-reference/data-types/int-uint.md), [UInt64](/docs/en/sql-reference/data-types/int-uint.md) | `long` |
| `boolean`, `int`, `long`, `float`, `double` | [Float32](/docs/en/sql-reference/data-types/float.md) | `float` |
| `boolean`, `int`, `long`, `float`, `double` | [Float64](/docs/en/sql-reference/data-types/float.md) | `double` |
| `bytes`, `string`, `fixed`, `enum` | [String](/docs/en/sql-reference/data-types/string.md) | `bytes` or `string` \* |
| `bytes`, `string`, `fixed` | [FixedString(N)](/docs/en/sql-reference/data-types/fixedstring.md) | `fixed(N)` |
| `enum` | [Enum(8\|16)](/docs/en/sql-reference/data-types/enum.md) | `enum` |
| `array(T)` | [Array(T)](/docs/en/sql-reference/data-types/array.md) | `array(T)` |
| `union(null, T)`, `union(T, null)` | [Nullable(T)](/docs/en/sql-reference/data-types/date.md) | `union(null, T)` |
| `null` | [Nullable(Nothing)](/docs/en/sql-reference/data-types/special-data-types/nothing.md) | `null` |
| `int (date)` \** | [Date](/docs/en/sql-reference/data-types/date.md) | `int (date)` \** |
| `long (timestamp-millis)` \** | [DateTime64(3)](/docs/en/sql-reference/data-types/datetime.md) | `long (timestamp-millis)` \* |
| `long (timestamp-micros)` \** | [DateTime64(6)](/docs/en/sql-reference/data-types/datetime.md) | `long (timestamp-micros)` \* |
| Avro data type `INSERT` | ClickHouse data type | Avro data type `SELECT` |
|---------------------------------------------|-----------------------------------------------------------------------------------------------------------------|-------------------------------------------------|
| `boolean`, `int`, `long`, `float`, `double` | [Int(8\ | 16\ |32)](/docs/en/sql-reference/data-types/int-uint.md), [UInt(8\|16\|32)](/docs/en/sql-reference/data-types/int-uint.md) | `int` |
| `boolean`, `int`, `long`, `float`, `double` | [Int64](/docs/en/sql-reference/data-types/int-uint.md), [UInt64](/docs/en/sql-reference/data-types/int-uint.md) | `long` |
| `boolean`, `int`, `long`, `float`, `double` | [Float32](/docs/en/sql-reference/data-types/float.md) | `float` |
| `boolean`, `int`, `long`, `float`, `double` | [Float64](/docs/en/sql-reference/data-types/float.md) | `double` |
| `bytes`, `string`, `fixed`, `enum` | [String](/docs/en/sql-reference/data-types/string.md) | `bytes` or `string` \* |
| `bytes`, `string`, `fixed` | [FixedString(N)](/docs/en/sql-reference/data-types/fixedstring.md) | `fixed(N)` |
| `enum` | [Enum(8\ | 16)](/docs/en/sql-reference/data-types/enum.md) | `enum` |
| `array(T)` | [Array(T)](/docs/en/sql-reference/data-types/array.md) | `array(T)` |
| `union(null, T)`, `union(T, null)` | [Nullable(T)](/docs/en/sql-reference/data-types/date.md) | `union(null, T)` |
| `null` | [Nullable(Nothing)](/docs/en/sql-reference/data-types/special-data-types/nothing.md) | `null` |
| `int (date)` \** | [Date](/docs/en/sql-reference/data-types/date.md) | `int (date)` \** |
| `long (timestamp-millis)` \** | [DateTime64(3)](/docs/en/sql-reference/data-types/datetime.md) | `long (timestamp-millis)` \* |
| `long (timestamp-micros)` \** | [DateTime64(6)](/docs/en/sql-reference/data-types/datetime.md) | `long (timestamp-micros)` \* |
| `int` | [IPv4](/docs/en/sql-reference/data-types/domains/ipv4.md) | `int` |
| `fixed(16)` | [IPv6](/docs/en/sql-reference/data-types/domains/ipv6.md) | `fixed(16)` |
\* `bytes` is default, controlled by [output_format_avro_string_column_pattern](/docs/en/operations/settings/settings-formats.md/#output_format_avro_string_column_pattern)
\** [Avro logical types](https://avro.apache.org/docs/current/spec.html#Logical+Types)
@ -1918,28 +1924,30 @@ Setting `format_avro_schema_registry_url` needs to be configured in `users.xml`
The table below shows supported data types and how they match ClickHouse [data types](/docs/en/sql-reference/data-types/index.md) in `INSERT` and `SELECT` queries.
| Parquet data type (`INSERT`) | ClickHouse data type | Parquet data type (`SELECT`) |
|-----------------------------------------------|-----------------------------------------------------------------|------------------------------|
| `BOOL` | [Bool](/docs/en/sql-reference/data-types/boolean.md) | `BOOL` |
| `UINT8`, `BOOL` | [UInt8](/docs/en/sql-reference/data-types/int-uint.md) | `UINT8` |
| `INT8` | [Int8](/docs/en/sql-reference/data-types/int-uint.md) | `INT8` |
| `UINT16` | [UInt16](/docs/en/sql-reference/data-types/int-uint.md) | `UINT16` |
| `INT16` | [Int16](/docs/en/sql-reference/data-types/int-uint.md) | `INT16` |
| `UINT32` | [UInt32](/docs/en/sql-reference/data-types/int-uint.md) | `UINT32` |
| `INT32` | [Int32](/docs/en/sql-reference/data-types/int-uint.md) | `INT32` |
| `UINT64` | [UInt64](/docs/en/sql-reference/data-types/int-uint.md) | `UINT64` |
| `INT64` | [Int64](/docs/en/sql-reference/data-types/int-uint.md) | `INT64` |
| `FLOAT` | [Float32](/docs/en/sql-reference/data-types/float.md) | `FLOAT` |
| `DOUBLE` | [Float64](/docs/en/sql-reference/data-types/float.md) | `DOUBLE` |
| `DATE` | [Date32](/docs/en/sql-reference/data-types/date.md) | `DATE` |
| `TIME (ms)` | [DateTime](/docs/en/sql-reference/data-types/datetime.md) | `UINT32` |
| `TIMESTAMP`, `TIME (us, ns)` | [DateTime64](/docs/en/sql-reference/data-types/datetime64.md) | `TIMESTAMP` |
| `STRING`, `BINARY` | [String](/docs/en/sql-reference/data-types/string.md) | `BINARY` |
| `STRING`, `BINARY`, `FIXED_LENGTH_BYTE_ARRAY` | [FixedString](/docs/en/sql-reference/data-types/fixedstring.md) | `FIXED_LENGTH_BYTE_ARRAY` |
| `DECIMAL` | [Decimal](/docs/en/sql-reference/data-types/decimal.md) | `DECIMAL` |
| `LIST` | [Array](/docs/en/sql-reference/data-types/array.md) | `LIST` |
| `STRUCT` | [Tuple](/docs/en/sql-reference/data-types/tuple.md) | `STRUCT` |
| `MAP` | [Map](/docs/en/sql-reference/data-types/map.md) | `MAP` |
| Parquet data type (`INSERT`) | ClickHouse data type | Parquet data type (`SELECT`) |
|----------------------------------------------------|-----------------------------------------------------------------|------------------------------|
| `BOOL` | [Bool](/docs/en/sql-reference/data-types/boolean.md) | `BOOL` |
| `UINT8`, `BOOL` | [UInt8](/docs/en/sql-reference/data-types/int-uint.md) | `UINT8` |
| `INT8` | [Int8](/docs/en/sql-reference/data-types/int-uint.md) | `INT8` |
| `UINT16` | [UInt16](/docs/en/sql-reference/data-types/int-uint.md) | `UINT16` |
| `INT16` | [Int16](/docs/en/sql-reference/data-types/int-uint.md) | `INT16` |
| `UINT32` | [UInt32](/docs/en/sql-reference/data-types/int-uint.md) | `UINT32` |
| `INT32` | [Int32](/docs/en/sql-reference/data-types/int-uint.md) | `INT32` |
| `UINT64` | [UInt64](/docs/en/sql-reference/data-types/int-uint.md) | `UINT64` |
| `INT64` | [Int64](/docs/en/sql-reference/data-types/int-uint.md) | `INT64` |
| `FLOAT` | [Float32](/docs/en/sql-reference/data-types/float.md) | `FLOAT` |
| `DOUBLE` | [Float64](/docs/en/sql-reference/data-types/float.md) | `DOUBLE` |
| `DATE` | [Date32](/docs/en/sql-reference/data-types/date.md) | `DATE` |
| `TIME (ms)` | [DateTime](/docs/en/sql-reference/data-types/datetime.md) | `UINT32` |
| `TIMESTAMP`, `TIME (us, ns)` | [DateTime64](/docs/en/sql-reference/data-types/datetime64.md) | `TIMESTAMP` |
| `STRING`, `BINARY` | [String](/docs/en/sql-reference/data-types/string.md) | `BINARY` |
| `STRING`, `BINARY`, `FIXED_LENGTH_BYTE_ARRAY` | [FixedString](/docs/en/sql-reference/data-types/fixedstring.md) | `FIXED_LENGTH_BYTE_ARRAY` |
| `DECIMAL` | [Decimal](/docs/en/sql-reference/data-types/decimal.md) | `DECIMAL` |
| `LIST` | [Array](/docs/en/sql-reference/data-types/array.md) | `LIST` |
| `STRUCT` | [Tuple](/docs/en/sql-reference/data-types/tuple.md) | `STRUCT` |
| `MAP` | [Map](/docs/en/sql-reference/data-types/map.md) | `MAP` |
| `UINT32` | [IPv4](/docs/en/sql-reference/data-types/domains/ipv4.md) | `UINT32` |
| `FIXED_LENGTH_BYTE_ARRAY` | [IPv6](/docs/en/sql-reference/data-types/domains/ipv6.md) | `FIXED_LENGTH_BYTE_ARRAY` |
Arrays can be nested and can have a value of the `Nullable` type as an argument. `Tuple` and `Map` types also can be nested.
@ -2007,6 +2015,8 @@ The table below shows supported data types and how they match ClickHouse [data t
| `LIST` | [Array](/docs/en/sql-reference/data-types/array.md) | `LIST` |
| `STRUCT` | [Tuple](/docs/en/sql-reference/data-types/tuple.md) | `STRUCT` |
| `MAP` | [Map](/docs/en/sql-reference/data-types/map.md) | `MAP` |
| `UINT32` | [IPv4](/docs/en/sql-reference/data-types/domains/ipv4.md) | `UINT32` |
| `FIXED_SIZE_BINARY`, `BINARY` | [IPv6](/docs/en/sql-reference/data-types/domains/ipv6.md) | `FIXED_SIZE_BINARY` |
Arrays can be nested and can have a value of the `Nullable` type as an argument. `Tuple` and `Map` types also can be nested.
@ -2054,8 +2064,8 @@ $ clickhouse-client --query="SELECT * FROM {some_table} FORMAT Arrow" > {filenam
The table below shows supported data types and how they match ClickHouse [data types](/docs/en/sql-reference/data-types/index.md) in `INSERT` and `SELECT` queries.
| ORC data type (`INSERT`) | ClickHouse data type | ORC data type (`SELECT`) |
|---------------------------------------|---------------------------------------------------------|--------------------------|
| ORC data type (`INSERT`) | ClickHouse data type | ORC data type (`SELECT`) |
|---------------------------------------|---------------------------------------------------------------|--------------------------|
| `Boolean` | [UInt8](/docs/en/sql-reference/data-types/int-uint.md) | `Boolean` |
| `Tinyint` | [Int8](/docs/en/sql-reference/data-types/int-uint.md) | `Tinyint` |
| `Smallint` | [Int16](/docs/en/sql-reference/data-types/int-uint.md) | `Smallint` |
@ -2070,6 +2080,7 @@ The table below shows supported data types and how they match ClickHouse [data t
| `List` | [Array](/docs/en/sql-reference/data-types/array.md) | `List` |
| `Struct` | [Tuple](/docs/en/sql-reference/data-types/tuple.md) | `Struct` |
| `Map` | [Map](/docs/en/sql-reference/data-types/map.md) | `Map` |
| `-` | [IPv4](/docs/en/sql-reference/data-types/int-uint.md) | `Int` |
Other types are not supported.
@ -2264,8 +2275,8 @@ ClickHouse supports reading and writing [MessagePack](https://msgpack.org/) data
### Data Types Matching {#data-types-matching-msgpack}
| MessagePack data type (`INSERT`) | ClickHouse data type | MessagePack data type (`SELECT`) |
|--------------------------------------------------------------------|-----------------------------------------------------------|------------------------------------|
| MessagePack data type (`INSERT`) | ClickHouse data type | MessagePack data type (`SELECT`) |
|--------------------------------------------------------------------|-----------------------------------------------------------------|------------------------------------|
| `uint N`, `positive fixint` | [UIntN](/docs/en/sql-reference/data-types/int-uint.md) | `uint N` |
| `int N`, `negative fixint` | [IntN](/docs/en/sql-reference/data-types/int-uint.md) | `int N` |
| `bool` | [UInt8](/docs/en/sql-reference/data-types/int-uint.md) | `uint 8` |
@ -2278,6 +2289,8 @@ ClickHouse supports reading and writing [MessagePack](https://msgpack.org/) data
| `uint 64` | [DateTime64](/docs/en/sql-reference/data-types/datetime.md) | `uint 64` |
| `fixarray`, `array 16`, `array 32` | [Array](/docs/en/sql-reference/data-types/array.md) | `fixarray`, `array 16`, `array 32` |
| `fixmap`, `map 16`, `map 32` | [Map](/docs/en/sql-reference/data-types/map.md) | `fixmap`, `map 16`, `map 32` |
| `uint 32` | [IPv4](/docs/en/sql-reference/data-types/domains/ipv4.md) | `uint 32` |
| `bin 8` | [String](/docs/en/sql-reference/data-types/string.md) | `bin 8` |
Example:

View File

@ -317,6 +317,7 @@ static bool checkCapnProtoType(const capnp::Type & capnp_type, const DataTypePtr
case TypeIndex::UInt16:
return capnp_type.isUInt16();
case TypeIndex::DateTime: [[fallthrough]];
case TypeIndex::IPv4: [[fallthrough]];
case TypeIndex::UInt32:
return capnp_type.isUInt32();
case TypeIndex::UInt64:
@ -355,6 +356,7 @@ static bool checkCapnProtoType(const capnp::Type & capnp_type, const DataTypePtr
case TypeIndex::LowCardinality:
return checkCapnProtoType(capnp_type, assert_cast<const DataTypeLowCardinality *>(data_type.get())->getDictionaryType(), mode, error_message, column_name);
case TypeIndex::FixedString: [[fallthrough]];
case TypeIndex::IPv6: [[fallthrough]];
case TypeIndex::String:
return capnp_type.isText() || capnp_type.isData();
default:

View File

@ -131,6 +131,7 @@ namespace
type_indexes.erase(TypeIndex::Date);
type_indexes.erase(TypeIndex::DateTime);
type_indexes.insert(TypeIndex::String);
return;
}

View File

@ -173,19 +173,19 @@ void chooseResultColumnType(
ErrorCodes::TYPE_MISMATCH,
"Automatically defined type {} for column '{}' in row {} differs from type defined by previous rows: {}. "
"You can specify the type for this column using setting schema_inference_hints",
type->getName(),
new_type->getName(),
column_name,
row,
new_type->getName());
type->getName());
else
throw Exception(
ErrorCodes::TYPE_MISMATCH,
"Automatically defined type {} for column '{}' in row {} differs from type defined by previous rows: {}. "
"Column types from setting schema_inference_hints couldn't be parsed because of error: {}",
type->getName(),
new_type->getName(),
column_name,
row,
new_type->getName(),
type->getName(),
hints_parsing_error);
}
}

View File

@ -17,6 +17,7 @@
#include <DataTypes/DataTypeDateTime64.h>
#include <DataTypes/DataTypeNothing.h>
#include <DataTypes/DataTypeFixedString.h>
#include <DataTypes/DataTypeIPv4andIPv6.h>
#include <Common/DateLUTImpl.h>
#include <base/types.h>
#include <Processors/Chunk.h>
@ -528,6 +529,38 @@ static std::shared_ptr<arrow::ChunkedArray> getNestedArrowColumn(std::shared_ptr
return std::make_shared<arrow::ChunkedArray>(array_vector);
}
static ColumnWithTypeAndName readIPv6ColumnFromBinaryData(std::shared_ptr<arrow::ChunkedArray> & arrow_column, const String & column_name)
{
size_t total_size = 0;
for (int chunk_i = 0, num_chunks = arrow_column->num_chunks(); chunk_i < num_chunks; ++chunk_i)
{
auto & chunk = dynamic_cast<arrow::BinaryArray &>(*(arrow_column->chunk(chunk_i)));
const size_t chunk_length = chunk.length();
for (size_t i = 0; i != chunk_length; ++i)
{
/// If at least one value size is not 16 bytes, fallback to reading String column and further cast to IPv6.
if (chunk.value_length(i) != sizeof(IPv6))
return readColumnWithStringData<arrow::BinaryArray>(arrow_column, column_name);
}
total_size += chunk_length;
}
auto internal_type = std::make_shared<DataTypeIPv6>();
auto internal_column = internal_type->createColumn();
auto & data = assert_cast<ColumnIPv6 &>(*internal_column).getData();
data.reserve(total_size * sizeof(IPv6));
for (int chunk_i = 0, num_chunks = arrow_column->num_chunks(); chunk_i < num_chunks; ++chunk_i)
{
auto & chunk = dynamic_cast<arrow::BinaryArray &>(*(arrow_column->chunk(chunk_i)));
std::shared_ptr<arrow::Buffer> buffer = chunk.value_data();
const auto * raw_data = reinterpret_cast<const IPv6 *>(buffer->data());
data.insert_assume_reserved(raw_data, raw_data + chunk.length());
}
return {std::move(internal_column), std::move(internal_type), column_name};
}
static ColumnWithTypeAndName readColumnFromArrowColumn(
std::shared_ptr<arrow::ChunkedArray> & arrow_column,
const std::string & column_name,
@ -559,7 +592,11 @@ static ColumnWithTypeAndName readColumnFromArrowColumn(
{
case arrow::Type::STRING:
case arrow::Type::BINARY:
{
if (type_hint && isIPv6(type_hint))
return readIPv6ColumnFromBinaryData(arrow_column, column_name);
return readColumnWithStringData<arrow::BinaryArray>(arrow_column, column_name);
}
case arrow::Type::FIXED_SIZE_BINARY:
return readColumnWithFixedStringData(arrow_column, column_name);
case arrow::Type::LARGE_BINARY:

View File

@ -145,6 +145,9 @@ static void insertNumber(IColumn & column, WhichDataType type, T value)
case TypeIndex::DateTime64:
assert_cast<ColumnDecimal<DateTime64> &>(column).insertValue(static_cast<Int64>(value));
break;
case TypeIndex::IPv4:
assert_cast<ColumnIPv4 &>(column).insertValue(IPv4(static_cast<UInt32>(value)));
break;
default:
throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Type is not compatible with Avro");
}
@ -424,6 +427,15 @@ AvroDeserializer::DeserializeFn AvroDeserializer::createDeserializeFn(avro::Node
return true;
};
}
else if (target.isIPv6() && fixed_size == sizeof(IPv6))
{
return [tmp_fixed = std::vector<uint8_t>(fixed_size)](IColumn & column, avro::Decoder & decoder) mutable
{
decoder.decodeFixed(tmp_fixed.size(), tmp_fixed);
column.insertData(reinterpret_cast<const char *>(tmp_fixed.data()), tmp_fixed.size());
return true;
};
}
break;
}
case avro::AVRO_SYMBOLIC:

View File

@ -127,6 +127,11 @@ AvroSerializer::SchemaWithSerializeFn AvroSerializer::createSchemaWithSerializeF
{
encoder.encodeInt(assert_cast<const ColumnUInt32 &>(column).getElement(row_num));
}};
case TypeIndex::IPv4:
return {avro::IntSchema(), [](const IColumn & column, size_t row_num, avro::Encoder & encoder)
{
encoder.encodeInt(assert_cast<const ColumnIPv4 &>(column).getElement(row_num));
}};
case TypeIndex::Int32:
return {avro::IntSchema(), [](const IColumn & column, size_t row_num, avro::Encoder & encoder)
{
@ -205,6 +210,15 @@ AvroSerializer::SchemaWithSerializeFn AvroSerializer::createSchemaWithSerializeF
encoder.encodeFixed(reinterpret_cast<const uint8_t *>(s.data()), s.size());
}};
}
case TypeIndex::IPv6:
{
auto schema = avro::FixedSchema(sizeof(IPv6), "ipv6_" + toString(type_name_increment));
return {schema, [](const IColumn & column, size_t row_num, avro::Encoder & encoder)
{
const std::string_view & s = assert_cast<const ColumnIPv6 &>(column).getDataAt(row_num).toView();
encoder.encodeFixed(reinterpret_cast<const uint8_t *>(s.data()), s.size());
}};
}
case TypeIndex::Enum8:
{
auto schema = avro::EnumSchema("enum8_" + toString(type_name_increment)); /// type names must be different for different types.

View File

@ -151,6 +151,17 @@ static void readAndInsertInteger(ReadBuffer & in, IColumn & column, const DataTy
}
}
static void readAndInsertIPv4(ReadBuffer & in, IColumn & column, BSONType bson_type)
{
/// We expect BSON type Int32 as IPv4 value.
if (bson_type != BSONType::INT32)
throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Cannot insert BSON Int32 into column with type IPv4");
UInt32 value;
readBinary(value, in);
assert_cast<ColumnIPv4 &>(column).insertValue(IPv4(value));
}
template <typename T>
static void readAndInsertDouble(ReadBuffer & in, IColumn & column, const DataTypePtr & data_type, BSONType bson_type)
{
@ -296,37 +307,52 @@ static void readAndInsertString(ReadBuffer & in, IColumn & column, BSONType bson
}
}
static void readAndInsertIPv6(ReadBuffer & in, IColumn & column, BSONType bson_type)
{
if (bson_type != BSONType::BINARY)
throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Cannot insert BSON {} into IPv6 column", getBSONTypeName(bson_type));
auto size = readBSONSize(in);
auto subtype = getBSONBinarySubtype(readBSONType(in));
if (subtype != BSONBinarySubtype::BINARY)
throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Cannot insert BSON Binary subtype {} into IPv6 column", getBSONBinarySubtypeName(subtype));
if (size != sizeof(IPv6))
throw Exception(
ErrorCodes::INCORRECT_DATA,
"Cannot parse value of type IPv6, size of binary data is not equal to the binary size of IPv6 value: {} != {}",
size,
sizeof(IPv6));
IPv6 value;
readBinary(value, in);
assert_cast<ColumnIPv6 &>(column).insertValue(value);
}
static void readAndInsertUUID(ReadBuffer & in, IColumn & column, BSONType bson_type)
{
if (bson_type == BSONType::BINARY)
{
auto size = readBSONSize(in);
auto subtype = getBSONBinarySubtype(readBSONType(in));
if (subtype == BSONBinarySubtype::UUID || subtype == BSONBinarySubtype::UUID_OLD)
{
if (size != sizeof(UUID))
throw Exception(
ErrorCodes::INCORRECT_DATA,
"Cannot parse value of type UUID, size of binary data is not equal to the binary size of UUID value: {} != {}",
size,
sizeof(UUID));
UUID value;
readBinary(value, in);
assert_cast<ColumnUUID &>(column).insertValue(value);
}
else
{
throw Exception(
ErrorCodes::ILLEGAL_COLUMN,
"Cannot insert BSON Binary subtype {} into UUID column",
getBSONBinarySubtypeName(subtype));
}
}
else
{
if (bson_type != BSONType::BINARY)
throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Cannot insert BSON {} into UUID column", getBSONTypeName(bson_type));
}
auto size = readBSONSize(in);
auto subtype = getBSONBinarySubtype(readBSONType(in));
if (subtype != BSONBinarySubtype::UUID && subtype != BSONBinarySubtype::UUID_OLD)
throw Exception(
ErrorCodes::ILLEGAL_COLUMN,
"Cannot insert BSON Binary subtype {} into UUID column",
getBSONBinarySubtypeName(subtype));
if (size != sizeof(UUID))
throw Exception(
ErrorCodes::INCORRECT_DATA,
"Cannot parse value of type UUID, size of binary data is not equal to the binary size of UUID value: {} != {}",
size,
sizeof(UUID));
UUID value;
readBinary(value, in);
assert_cast<ColumnUUID &>(column).insertValue(value);
}
void BSONEachRowRowInputFormat::readArray(IColumn & column, const DataTypePtr & data_type, BSONType bson_type)
@ -591,6 +617,16 @@ bool BSONEachRowRowInputFormat::readField(IColumn & column, const DataTypePtr &
readAndInsertString<false>(*in, column, bson_type);
return true;
}
case TypeIndex::IPv4:
{
readAndInsertIPv4(*in, column, bson_type);
return true;
}
case TypeIndex::IPv6:
{
readAndInsertIPv6(*in, column, bson_type);
return true;
}
case TypeIndex::UUID:
{
readAndInsertUUID(*in, column, bson_type);

View File

@ -124,6 +124,7 @@ size_t BSONEachRowRowOutputFormat::countBSONFieldSize(const IColumn & column, co
case TypeIndex::Date: [[fallthrough]];
case TypeIndex::Date32: [[fallthrough]];
case TypeIndex::Decimal32: [[fallthrough]];
case TypeIndex::IPv4: [[fallthrough]];
case TypeIndex::Int32:
{
return size + sizeof(Int32);
@ -168,6 +169,10 @@ size_t BSONEachRowRowOutputFormat::countBSONFieldSize(const IColumn & column, co
const auto & string_column = assert_cast<const ColumnFixedString &>(column);
return size + sizeof(BSONSizeT) + string_column.getN() + 1; // Size of data + data + \0 or BSON subtype (in case of BSON binary)
}
case TypeIndex::IPv6:
{
return size + sizeof(BSONSizeT) + 1 + sizeof(IPv6); // Size of data + BSON binary subtype + 16 bytes of value
}
case TypeIndex::UUID:
{
return size + sizeof(BSONSizeT) + 1 + sizeof(UUID); // Size of data + BSON binary subtype + 16 bytes of value
@ -371,6 +376,19 @@ void BSONEachRowRowOutputFormat::serializeField(const IColumn & column, const Da
writeBSONString<ColumnFixedString>(column, row_num, name, out, settings.bson.output_string_as_string);
break;
}
case TypeIndex::IPv4:
{
writeBSONNumber<ColumnIPv4, Int32>(BSONType::INT32, column, row_num, name, out);
break;
}
case TypeIndex::IPv6:
{
writeBSONTypeAndKeyName(BSONType::BINARY, name, out);
writeBSONSize(sizeof(IPv6), out);
writeBSONType(BSONBinarySubtype::BINARY, out);
writeBinary(assert_cast<const ColumnIPv6 &>(column).getElement(row_num), out);
break;
}
case TypeIndex::UUID:
{
writeBSONTypeAndKeyName(BSONType::BINARY, name, out);

View File

@ -434,6 +434,46 @@ namespace DB
checkStatus(status, write_column->getName(), format_name);
}
static void fillArrowArrayWithIPv6ColumnData(
ColumnPtr write_column,
const PaddedPODArray<UInt8> * null_bytemap,
const String & format_name,
arrow::ArrayBuilder* array_builder,
size_t start,
size_t end)
{
const auto & internal_column = assert_cast<const ColumnIPv6 &>(*write_column);
const auto & internal_data = internal_column.getData();
size_t fixed_length = sizeof(IPv6);
arrow::FixedSizeBinaryBuilder & builder = assert_cast<arrow::FixedSizeBinaryBuilder &>(*array_builder);
arrow::Status status;
PaddedPODArray<UInt8> arrow_null_bytemap = revertNullByteMap(null_bytemap, start, end);
const UInt8 * arrow_null_bytemap_raw_ptr = arrow_null_bytemap.empty() ? nullptr : arrow_null_bytemap.data();
const uint8_t * data_start = reinterpret_cast<const uint8_t *>(internal_data.data()) + start * fixed_length;
status = builder.AppendValues(data_start, end - start, reinterpret_cast<const uint8_t *>(arrow_null_bytemap_raw_ptr));
checkStatus(status, write_column->getName(), format_name);
}
static void fillArrowArrayWithIPv4ColumnData(
ColumnPtr write_column,
const PaddedPODArray<UInt8> * null_bytemap,
const String & format_name,
arrow::ArrayBuilder* array_builder,
size_t start,
size_t end)
{
const auto & internal_data = assert_cast<const ColumnIPv4 &>(*write_column).getData();
auto & builder = assert_cast<arrow::UInt32Builder &>(*array_builder);
arrow::Status status;
PaddedPODArray<UInt8> arrow_null_bytemap = revertNullByteMap(null_bytemap, start, end);
const UInt8 * arrow_null_bytemap_raw_ptr = arrow_null_bytemap.empty() ? nullptr : arrow_null_bytemap.data();
status = builder.AppendValues(&(internal_data.data() + start)->toUnderType(), end - start, reinterpret_cast<const uint8_t *>(arrow_null_bytemap_raw_ptr));
checkStatus(status, write_column->getName(), format_name);
}
static void fillArrowArrayWithDateColumnData(
ColumnPtr write_column,
const PaddedPODArray<UInt8> * null_bytemap,
@ -541,6 +581,14 @@ namespace DB
else
fillArrowArrayWithStringColumnData<ColumnFixedString, arrow::BinaryBuilder>(column, null_bytemap, format_name, array_builder, start, end);
}
else if (isIPv6(column_type))
{
fillArrowArrayWithIPv6ColumnData(column, null_bytemap, format_name, array_builder, start, end);
}
else if (isIPv4(column_type))
{
fillArrowArrayWithIPv4ColumnData(column, null_bytemap, format_name, array_builder, start, end);
}
else if (isDate(column_type))
{
fillArrowArrayWithDateColumnData(column, null_bytemap, format_name, array_builder, start, end);
@ -781,6 +829,12 @@ namespace DB
if (isBool(column_type))
return arrow::boolean();
if (isIPv6(column_type))
return arrow::fixed_size_binary(sizeof(IPv6));
if (isIPv4(column_type))
return arrow::uint32();
const std::string type_name = column_type->getFamilyName();
if (const auto * arrow_type_it = std::find_if(
internal_type_to_arrow_type.begin(),

View File

@ -128,6 +128,9 @@ static void insertUnsignedInteger(IColumn & column, const DataTypePtr & column_t
case TypeIndex::UInt64:
assert_cast<ColumnUInt64 &>(column).insertValue(value);
break;
case TypeIndex::IPv4:
assert_cast<ColumnIPv4 &>(column).insertValue(IPv4(static_cast<UInt32>(value)));
break;
default:
throw Exception(ErrorCodes::LOGICAL_ERROR, "Column type is not an unsigned integer.");
}

View File

@ -111,7 +111,12 @@ static std::optional<capnp::DynamicValue::Reader> convertToDynamicValue(
case capnp::DynamicValue::Type::INT:
return capnp::DynamicValue::Reader(column->getInt(row_num));
case capnp::DynamicValue::Type::UINT:
{
/// IPv4 column doesn't support getUInt method.
if (isIPv4(data_type))
return capnp::DynamicValue::Reader(assert_cast<const ColumnIPv4 *>(column.get())->getElement(row_num));
return capnp::DynamicValue::Reader(column->getUInt(row_num));
}
case capnp::DynamicValue::Type::BOOL:
return capnp::DynamicValue::Reader(column->getBool(row_num));
case capnp::DynamicValue::Type::FLOAT:

View File

@ -162,6 +162,11 @@ static void insertInteger(IColumn & column, DataTypePtr type, UInt64 value)
assert_cast<DataTypeDateTime64::ColumnType &>(column).insertValue(value);
break;
}
case TypeIndex::IPv4:
{
assert_cast<ColumnIPv4 &>(column).insertValue(IPv4(static_cast<UInt32>(value)));
break;
}
default:
throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Cannot insert MessagePack integer into column with type {}.", type->getName());
}
@ -190,6 +195,12 @@ static void insertString(IColumn & column, DataTypePtr type, const char * value,
return;
}
if (isIPv6(type) && bin)
{
assert_cast<ColumnIPv6 &>(column).insertData(value, size);
return;
}
if (!isStringOrFixedString(type))
throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Cannot insert MessagePack string into column with type {}.", type->getName());

View File

@ -56,6 +56,11 @@ void MsgPackRowOutputFormat::serializeField(const IColumn & column, DataTypePtr
packer.pack_uint32(assert_cast<const ColumnUInt32 &>(column).getElement(row_num));
return;
}
case TypeIndex::IPv4:
{
packer.pack_uint32(assert_cast<const ColumnIPv4 &>(column).getElement(row_num));
return;
}
case TypeIndex::UInt64:
{
packer.pack_uint64(assert_cast<const ColumnUInt64 &>(column).getElement(row_num));
@ -110,6 +115,13 @@ void MsgPackRowOutputFormat::serializeField(const IColumn & column, DataTypePtr
packer.pack_bin_body(string.data(), static_cast<unsigned>(string.size()));
return;
}
case TypeIndex::IPv6:
{
const std::string_view & data = assert_cast<const ColumnIPv6 &>(column).getDataAt(row_num).toView();
packer.pack_bin(static_cast<unsigned>(data.size()));
packer.pack_bin_body(data.data(), static_cast<unsigned>(data.size()));
return;
}
case TypeIndex::Array:
{
auto nested_type = assert_cast<const DataTypeArray &>(*data_type).getNestedType();

View File

@ -75,6 +75,7 @@ std::unique_ptr<orc::Type> ORCBlockOutputFormat::getORCType(const DataTypePtr &
return orc::createPrimitiveType(orc::TypeKind::SHORT);
}
case TypeIndex::UInt32: [[fallthrough]];
case TypeIndex::IPv4: [[fallthrough]];
case TypeIndex::Int32:
{
return orc::createPrimitiveType(orc::TypeKind::INT);
@ -109,6 +110,10 @@ std::unique_ptr<orc::Type> ORCBlockOutputFormat::getORCType(const DataTypePtr &
return orc::createPrimitiveType(orc::TypeKind::STRING);
return orc::createPrimitiveType(orc::TypeKind::BINARY);
}
case TypeIndex::IPv6:
{
return orc::createPrimitiveType(orc::TypeKind::BINARY);
}
case TypeIndex::Nullable:
{
return getORCType(removeNullable(type));
@ -309,6 +314,11 @@ void ORCBlockOutputFormat::writeColumn(
writeNumbers<UInt32, orc::LongVectorBatch>(orc_column, column, null_bytemap, [](const UInt32 & value){ return value; });
break;
}
case TypeIndex::IPv4:
{
writeNumbers<IPv4, orc::LongVectorBatch>(orc_column, column, null_bytemap, [](const IPv4 & value){ return value.toUnderType(); });
break;
}
case TypeIndex::Int64:
{
writeNumbers<Int64, orc::LongVectorBatch>(orc_column, column, null_bytemap, [](const Int64 & value){ return value; });
@ -339,6 +349,11 @@ void ORCBlockOutputFormat::writeColumn(
writeStrings<ColumnString>(orc_column, column, null_bytemap);
break;
}
case TypeIndex::IPv6:
{
writeStrings<ColumnIPv6>(orc_column, column, null_bytemap);
break;
}
case TypeIndex::DateTime:
{
writeDateTimes<ColumnUInt32>(

View File

@ -0,0 +1,18 @@
CapnProto
2001:db8:11a3:9d7:1f34:8a2e:7a0:765d 127.0.0.1
Avro
2001:db8:11a3:9d7:1f34:8a2e:7a0:765d 127.0.0.1
Arrow
2001:db8:11a3:9d7:1f34:8a2e:7a0:765d 127.0.0.1
Parquet
ipv6 Nullable(FixedString(16))
ipv4 Nullable(Int64)
2001:db8:11a3:9d7:1f34:8a2e:7a0:765d 127.0.0.1
ORC
ipv6 Nullable(String)
ipv4 Nullable(Int32)
2001:db8:11a3:9d7:1f34:8a2e:7a0:765d 127.0.0.1
BSONEachRow
2001:db8:11a3:9d7:1f34:8a2e:7a0:765d 127.0.0.1
MsgPack
2001:db8:11a3:9d7:1f34:8a2e:7a0:765d 127.0.0.1

View File

@ -0,0 +1,45 @@
#!/usr/bin/env bash
# Tags: no-fasttest, no-parallel
CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
# shellcheck source=../shell_config.sh
. "$CURDIR"/../shell_config.sh
echo "CapnProto"
${CLICKHOUSE_LOCAL} -q "select '2001:db8:11a3:9d7:1f34:8a2e:7a0:765d'::IPv6 as ipv6, '127.0.0.1'::IPv4 as ipv4 format CapnProto settings format_schema='$CURDIR/format_schemas/02566_ipv4_ipv6:Message'" > 02566_ipv4_ipv6_data.capnp
${CLICKHOUSE_LOCAL} -q "select * from file(02566_ipv4_ipv6_data.capnp, auto, 'ipv6 IPv6, ipv4 IPv4') settings format_schema='$CURDIR/format_schemas/02566_ipv4_ipv6:Message'"
rm 02566_ipv4_ipv6_data.capnp
echo "Avro"
${CLICKHOUSE_LOCAL} -q "select '2001:db8:11a3:9d7:1f34:8a2e:7a0:765d'::IPv6 as ipv6, '127.0.0.1'::IPv4 as ipv4 format Avro" > 02566_ipv4_ipv6_data.avro
${CLICKHOUSE_LOCAL} -q "select * from file(02566_ipv4_ipv6_data.avro, auto, 'ipv6 IPv6, ipv4 IPv4')"
rm 02566_ipv4_ipv6_data.avro
echo "Arrow"
${CLICKHOUSE_LOCAL} -q "select '2001:db8:11a3:9d7:1f34:8a2e:7a0:765d'::IPv6 as ipv6, '127.0.0.1'::IPv4 as ipv4 format Arrow" > 02566_ipv4_ipv6_data.arrow
${CLICKHOUSE_LOCAL} -q "select * from file(02566_ipv4_ipv6_data.arrow, auto, 'ipv6 IPv6, ipv4 IPv4')"
rm 02566_ipv4_ipv6_data.arrow
echo "Parquet"
${CLICKHOUSE_LOCAL} -q "select '2001:db8:11a3:9d7:1f34:8a2e:7a0:765d'::IPv6 as ipv6, '127.0.0.1'::IPv4 as ipv4 format Parquet" > 02566_ipv4_ipv6_data.parquet
${CLICKHOUSE_LOCAL} -q "desc file(02566_ipv4_ipv6_data.parquet)"
${CLICKHOUSE_LOCAL} -q "select ipv6, toIPv4(ipv4) from file(02566_ipv4_ipv6_data.parquet, auto, 'ipv6 IPv6, ipv4 UInt32')"
rm 02566_ipv4_ipv6_data.parquet
echo "ORC"
${CLICKHOUSE_LOCAL} -q "select '2001:db8:11a3:9d7:1f34:8a2e:7a0:765d'::IPv6 as ipv6, '127.0.0.1'::IPv4 as ipv4 format ORC" > 02566_ipv4_ipv6_data.orc
${CLICKHOUSE_LOCAL} -q "desc file(02566_ipv4_ipv6_data.orc)"
${CLICKHOUSE_LOCAL} -q "select ipv6, toIPv4(ipv4) from file(02566_ipv4_ipv6_data.orc, auto, 'ipv6 IPv6, ipv4 UInt32')"
rm 02566_ipv4_ipv6_data.orc
echo "BSONEachRow"
${CLICKHOUSE_LOCAL} -q "select '2001:db8:11a3:9d7:1f34:8a2e:7a0:765d'::IPv6 as ipv6, '127.0.0.1'::IPv4 as ipv4 format BSONEachRow" > 02566_ipv4_ipv6_data.bson
${CLICKHOUSE_LOCAL} -q "select * from file(02566_ipv4_ipv6_data.bson, auto, 'ipv6 IPv6, ipv4 IPv4')"
rm 02566_ipv4_ipv6_data.bson
echo "MsgPack"
${CLICKHOUSE_LOCAL} -q "select '2001:db8:11a3:9d7:1f34:8a2e:7a0:765d'::IPv6 as ipv6, '127.0.0.1'::IPv4 as ipv4 format MsgPack" > 02566_ipv4_ipv6_data.msgpack
${CLICKHOUSE_LOCAL} -q "select * from file(02566_ipv4_ipv6_data.msgpack, auto, 'ipv6 IPv6, ipv4 IPv4')"
rm 02566_ipv4_ipv6_data.msgpack

View File

@ -0,0 +1 @@
x Nullable(String)

View File

@ -0,0 +1,2 @@
desc format(JSONEachRow, '{"x" : "2020-01-01"}, {"x" : "1000"}')

View File

@ -0,0 +1,6 @@
@0xb6ecde1cd54a101d;
struct Message {
ipv4 @0 :UInt32;
ipv6 @1 :Data;
}

View File

@ -1,7 +1,11 @@
v23.2.3.17-stable 2023-03-06
v23.2.2.20-stable 2023-03-01
v23.2.1.2537-stable 2023-02-23
v23.1.4.58-stable 2023-03-01
v23.1.3.5-stable 2023-02-03
v23.1.2.9-stable 2023-01-29
v23.1.1.3077-stable 2023-01-25
v22.12.4.76-stable 2023-03-01
v22.12.3.5-stable 2023-01-10
v22.12.2.25-stable 2023-01-06
v22.12.1.1752-stable 2022-12-15
@ -25,6 +29,7 @@ v22.9.4.32-stable 2022-10-26
v22.9.3.18-stable 2022-09-30
v22.9.2.7-stable 2022-09-23
v22.9.1.2603-stable 2022-09-22
v22.8.14.53-lts 2023-02-27
v22.8.13.20-lts 2023-01-29
v22.8.12.45-lts 2023-01-10
v22.8.11.15-lts 2022-12-08

1 v23.2.1.2537-stable v23.2.3.17-stable 2023-02-23 2023-03-06
1 v23.2.3.17-stable 2023-03-06
2 v23.2.2.20-stable 2023-03-01
3 v23.2.1.2537-stable v23.2.1.2537-stable 2023-02-23 2023-02-23
4 v23.1.4.58-stable 2023-03-01
5 v23.1.3.5-stable v23.1.3.5-stable 2023-02-03 2023-02-03
6 v23.1.2.9-stable v23.1.2.9-stable 2023-01-29 2023-01-29
7 v23.1.1.3077-stable v23.1.1.3077-stable 2023-01-25 2023-01-25
8 v22.12.4.76-stable 2023-03-01
9 v22.12.3.5-stable v22.12.3.5-stable 2023-01-10 2023-01-10
10 v22.12.2.25-stable v22.12.2.25-stable 2023-01-06 2023-01-06
11 v22.12.1.1752-stable v22.12.1.1752-stable 2022-12-15 2022-12-15
29 v22.9.3.18-stable v22.9.3.18-stable 2022-09-30 2022-09-30
30 v22.9.2.7-stable v22.9.2.7-stable 2022-09-23 2022-09-23
31 v22.9.1.2603-stable v22.9.1.2603-stable 2022-09-22 2022-09-22
32 v22.8.14.53-lts 2023-02-27
33 v22.8.13.20-lts v22.8.13.20-lts 2023-01-29 2023-01-29
34 v22.8.12.45-lts v22.8.12.45-lts 2023-01-10 2023-01-10
35 v22.8.11.15-lts v22.8.11.15-lts 2022-12-08 2022-12-08