Merge pull request #45978 from DanRoscigno/add-gcs-syntax-for-s3-function

add notes for GCS to s3 table function
This commit is contained in:
Dan Roscigno 2023-02-02 13:55:08 -05:00 committed by GitHub
commit 440f14260c
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

View File

@ -2,11 +2,12 @@
slug: /en/sql-reference/table-functions/s3 slug: /en/sql-reference/table-functions/s3
sidebar_position: 45 sidebar_position: 45
sidebar_label: s3 sidebar_label: s3
keywords: [s3, gcs, bucket]
--- ---
# s3 Table Function # s3 Table Function
Provides table-like interface to select/insert files in [Amazon S3](https://aws.amazon.com/s3/). This table function is similar to [hdfs](../../sql-reference/table-functions/hdfs.md), but provides S3-specific features. Provides a table-like interface to select/insert files in [Amazon S3](https://aws.amazon.com/s3/) and [Google Cloud Storage](https://cloud.google.com/storage/). This table function is similar to the [hdfs function](../../sql-reference/table-functions/hdfs.md), but provides S3-specific features.
**Syntax** **Syntax**
@ -14,9 +15,24 @@ Provides table-like interface to select/insert files in [Amazon S3](https://aws.
s3(path [,aws_access_key_id, aws_secret_access_key] [,format] [,structure] [,compression]) s3(path [,aws_access_key_id, aws_secret_access_key] [,format] [,structure] [,compression])
``` ```
:::tip GCS
The S3 Table Function integrates with Google Cloud Storage by using the GCS XML API and HMAC keys. See the [Google interoperability docs]( https://cloud.google.com/storage/docs/interoperability) for more details about the endpoint and HMAC.
For GCS, substitute your HMAC key and HMAC secret where you see `aws_access_key_id` and `aws_secret_access_key`.
:::
**Arguments** **Arguments**
- `path` — Bucket url with path to file. Supports following wildcards in readonly mode: `*`, `?`, `{abc,def}` and `{N..M}` where `N`, `M` — numbers, `'abc'`, `'def'` — strings. For more information see [here](../../engines/table-engines/integrations/s3.md#wildcards-in-path). - `path` — Bucket url with path to file. Supports following wildcards in readonly mode: `*`, `?`, `{abc,def}` and `{N..M}` where `N`, `M` — numbers, `'abc'`, `'def'` — strings. For more information see [here](../../engines/table-engines/integrations/s3.md#wildcards-in-path).
:::note GCS
The GCS path is in this format as the endpoint for the Google XML API is different than the JSON API:
```
https://storage.googleapis.com/<bucket>/<folder>/<filename(s)>
```
and not ~~https://storage.cloud.google.com~~.
:::
- `format` — The [format](../../interfaces/formats.md#formats) of the file. - `format` — The [format](../../interfaces/formats.md#formats) of the file.
- `structure` — Structure of the table. Format `'column1_name column1_type, column2_name column2_type, ...'`. - `structure` — Structure of the table. Format `'column1_name column1_type, column2_name column2_type, ...'`.
- `compression` — Parameter is optional. Supported values: `none`, `gzip/gz`, `brotli/br`, `xz/LZMA`, `zstd/zst`. By default, it will autodetect compression by file extension. - `compression` — Parameter is optional. Supported values: `none`, `gzip/gz`, `brotli/br`, `xz/LZMA`, `zstd/zst`. By default, it will autodetect compression by file extension.