---
slug: /en/sql-reference/table-functions/iceberg
sidebar_position: 90
sidebar_label: iceberg
---
# iceberg Table Function

Provides a read-only table-like interface to Apache [Iceberg](https://iceberg.apache.org/) tables stored in Amazon S3, Azure, HDFS, or on the local file system.

## Syntax
``` sql
icebergS3(url [, NOSIGN | access_key_id, secret_access_key, [session_token]] [,format] [,compression_method])
icebergS3(named_collection[, option=value [,..]])
icebergAzure(connection_string|storage_account_url, container_name, blobpath [,account_name] [,account_key] [,format] [,compression_method])
icebergAzure(named_collection[, option=value [,..]])
icebergHDFS(path_to_table [,format] [,compression_method])
icebergHDFS(named_collection[, option=value [,..]])
icebergLocal(path_to_table [,format] [,compression_method])
icebergLocal(named_collection[, option=value [,..]])
```
## Arguments

The arguments are the same as those of the `s3`, `azureBlobStorage`, `hdfs` and `file` table functions, respectively.
`format` specifies the format of the data files in the Iceberg table.

**Returned value**

A table with the specified structure for reading data from the specified Iceberg table.

**Example**
```sql
SELECT * FROM icebergS3('http://test.s3.amazonaws.com/clickhouse-bucket/test_table', 'test', 'test')
```
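
The other forms listed under Syntax are called the same way. The sketch below shows an unsigned request against a public bucket and a read from the local file system. The bucket URL and the local path are hypothetical placeholders, not real locations.

```sql
-- Public bucket, no credentials needed (hypothetical URL).
SELECT count() FROM icebergS3('https://example-bucket.s3.amazonaws.com/warehouse/events', NOSIGN);

-- Iceberg table stored on the local file system (hypothetical path).
SELECT * FROM icebergLocal('/data/iceberg/events') LIMIT 10;
```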
:::important
ClickHouse currently supports reading v1 and v2 of the Iceberg format via the `icebergS3`, `icebergAzure`, `icebergHDFS` and `icebergLocal` table functions and the `IcebergS3`, `IcebergAzure`, `IcebergHDFS` and `IcebergLocal` table engines.
:::
## Defining a named collection
Here is an example of configuring a named collection for storing the URL and credentials:
```xml
<clickhouse>
    <named_collections>
        <iceberg_conf>
            <url>http://test.s3.amazonaws.com/clickhouse-bucket/</url>
            <access_key_id>test</access_key_id>
            <secret_access_key>test</secret_access_key>
            <format>auto</format>
            <structure>auto</structure>
        </iceberg_conf>
    </named_collections>
</clickhouse>
```
```sql
SELECT * FROM icebergS3(iceberg_conf, filename = 'test_table');
DESCRIBE icebergS3(iceberg_conf, filename = 'test_table');
```
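
The `option=value` part of the syntax suggests that keys stored in the collection can be overridden per query. As a hedged sketch, assuming the `format` key of `iceberg_conf` can be overridden this way and that the table's data files are Parquet:

```sql
-- Assumption: 'format' given here takes precedence over <format>auto</format> in the collection.
SELECT count() FROM icebergS3(iceberg_conf, filename = 'test_table', format = 'Parquet');
```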
**Schema Evolution**

ClickHouse can read Iceberg tables whose schema has changed over time. Reading is currently supported for tables where columns have been added or removed, or where their order has changed. A column can also be changed from required to nullable. In addition, the following type promotions for simple types are supported:

* int -> long
* float -> double
* decimal(P, S) -> decimal(P', S) where P' > P.

Changing nested structures, or the types of elements within arrays and maps, is currently not supported.

**Aliases**

The `iceberg` table function is currently an alias for `icebergS3`.

**See Also**
- [Iceberg engine](/docs/en/engines/table-engines/integrations/iceberg.md)
- [Iceberg cluster table function](/docs/en/sql-reference/table-functions/icebergCluster.md)