Merge pull request #45127 from DanRoscigno/add-deltalake-docs

Add deltalake docs
This commit is contained in:
Dan Roscigno 2023-01-11 08:07:42 -05:00 committed by GitHub
commit 0ad969171e
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 84 additions and 0 deletions

View File

@ -0,0 +1,33 @@
---
slug: /en/engines/table-engines/integrations/deltalake
sidebar_label: DeltaLake
---
# DeltaLake Table Engine
This engine provides a read-only integration with existing [Delta Lake](https://github.com/delta-io/delta) tables in Amazon S3.
## Create Table
Note that the Delta Lake table must already exist in S3, this command does not take DDL parameters to create a new table.
``` sql
CREATE TABLE deltalake
ENGINE = DeltaLake(url, [aws_access_key_id, aws_secret_access_key,])
```
**Engine parameters**
- `url` — Bucket url with path to the existing Delta Lake table.
- `aws_access_key_id`, `aws_secret_access_key` - Long-term credentials for the [AWS](https://aws.amazon.com/) account user. You can use these to authenticate your requests. Parameter is optional. If credentials are not specified, they are used from the configuration file. For more information see [Using S3 for Data Storage](../mergetree-family/mergetree.md#table_engine-mergetree-s3).
**Example**
```sql
CREATE TABLE deltalake ENGINE=DeltaLake('http://mars-doc-test.s3.amazonaws.com/clickhouse-bucket-3/test_table/', 'ABC123', 'Abc+123')
```
## See also
- [deltaLake table function](../../../sql-reference/table-functions/deltalake.md)

View File

@ -0,0 +1,51 @@
---
slug: /en/sql-reference/table-functions/deltalake
sidebar_label: DeltaLake
---
# deltaLake Table Function
Provides a read-only table-like interface to [Delta Lake](https://github.com/delta-io/delta) tables in Amazon S3.
## Syntax
``` sql
deltaLake(url [,aws_access_key_id, aws_secret_access_key] [,format] [,structure] [,compression])
```
## Arguments
- `url` — Bucket url with path to existing Delta Lake table in S3.
- `aws_access_key_id`, `aws_secret_access_key` - Long-term credentials for the [AWS](https://aws.amazon.com/) account user. You can use these to authenticate your requests. These parameters are optional. If credentials are not specified, they are used from the ClickHouse configuration. For more information see [Using S3 for Data Storage](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#table_engine-mergetree-s3).
- `format` — The [format](/docs/en/interfaces/formats.md/#formats) of the file.
- `structure` — Structure of the table. Format `'column1_name column1_type, column2_name column2_type, ...'`.
- `compression` — Parameter is optional. Supported values: `none`, `gzip/gz`, `brotli/br`, `xz/LZMA`, `zstd/zst`. By default, compression will be autodetected by the file extension.
**Returned value**
A table with the specified structure for reading data in the specified Delta Lake table in S3.
**Examples**
Selecting rows from the table in S3 `https://clickhouse-public-datasets.s3.amazonaws.com/delta_lake/hits/`:
``` sql
SELECT
URL,
UserAgent
FROM deltaLake('https://clickhouse-public-datasets.s3.amazonaws.com/delta_lake/hits/')
WHERE URL IS NOT NULL
LIMIT 2
```
``` response
┌─URL───────────────────────────────────────────────────────────────────┬─UserAgent─┐
│ http://auto.ria.ua/search/index.kz/jobinmoscow/detail/55089/hasimages │ 1 │
│ http://auto.ria.ua/search/index.kz/jobinmoscow.ru/gosushi │ 1 │
└───────────────────────────────────────────────────────────────────────┴───────────┘
```
**See Also**
- [DeltaLake engine](/docs/en/engines/table-engines/integrations/deltalake.md)