Merge pull request #28716 from olgarev/revolg-DOCSUP-13742-partitions_in_s3_table_function

This commit is contained in:
Vladimir C 2021-09-10 17:57:58 +03:00 committed by GitHub
commit 5b967d91ba
4 changed files with 51 additions and 3 deletions


@@ -210,4 +210,4 @@ ENGINE = S3('https://storage.yandexcloud.net/my-test-bucket-768/big_prefix/file-
## See also
-- [S3 table function](../../../sql-reference/table-functions/s3.md)
+- [s3 table function](../../../sql-reference/table-functions/s3.md)


@@ -3,7 +3,7 @@ toc_priority: 45
toc_title: s3
---
-# S3 Table Function {#s3-table-function}
+# s3 Table Function {#s3-table-function}
Provides table-like interface to select/insert files in [Amazon S3](https://aws.amazon.com/s3/). This table function is similar to [hdfs](../../sql-reference/table-functions/hdfs.md), but provides S3-specific features.
@@ -125,6 +125,30 @@ INSERT INTO FUNCTION s3('https://storage.yandexcloud.net/my-test-bucket-768/test
SELECT name, value FROM existing_table;
```
## Partitioned Write {#partitioned-write}

If you specify a `PARTITION BY` expression when inserting data into an `S3` table, a separate file is created for each partition value. Splitting the data into separate files improves the efficiency of read operations.
**Examples**
1. Using partition ID in a key creates separate files:
```sql
INSERT INTO TABLE FUNCTION
s3('http://bucket.amazonaws.com/my_bucket/file_{_partition_id}.csv', 'CSV', 'a String, b UInt32, c UInt32')
PARTITION BY a VALUES ('x', 2, 3), ('x', 4, 5), ('y', 11, 12), ('y', 13, 14), ('z', 21, 22), ('z', 23, 24);
```
As a result, the data is written into three files: `file_x.csv`, `file_y.csv`, and `file_z.csv`.
2. Using partition ID in a bucket name creates files in different buckets:
```sql
INSERT INTO TABLE FUNCTION
s3('http://bucket.amazonaws.com/my_bucket_{_partition_id}/file.csv', 'CSV', 'a UInt32, b UInt32, c UInt32')
PARTITION BY a VALUES (1, 2, 3), (1, 4, 5), (10, 11, 12), (10, 13, 14), (20, 21, 22), (20, 23, 24);
```
As a result, the data is written into three files in different buckets: `my_bucket_1/file.csv`, `my_bucket_10/file.csv`, and `my_bucket_20/file.csv`.
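The file-per-partition mechanics above can be sketched in Python (an illustration only, not ClickHouse's actual implementation; `partitioned_write` is a hypothetical helper that groups rows by the partition column and substitutes each value for `{_partition_id}` in the path template):

```python
import csv
import io
from collections import defaultdict

def partitioned_write(path_template, rows, partition_index=0):
    """Illustrative sketch: group rows by the value of the partition
    column, substitute that value for {_partition_id} in the path
    template, and render one CSV payload per partition."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_index]].append(row)
    files = {}
    for partition_id, part_rows in partitions.items():
        buf = io.StringIO()
        csv.writer(buf).writerows(part_rows)
        path = path_template.replace("{_partition_id}", str(partition_id))
        files[path] = buf.getvalue()
    return files

files = partitioned_write(
    "my_bucket/file_{_partition_id}.csv",
    [("x", 2, 3), ("x", 4, 5), ("y", 11, 12), ("y", 13, 14), ("z", 21, 22), ("z", 23, 24)],
)
print(sorted(files))
# → ['my_bucket/file_x.csv', 'my_bucket/file_y.csv', 'my_bucket/file_z.csv']
```

In the real table function the payloads are uploaded to the S3 paths produced by the template; here they are simply returned so the per-partition grouping is visible.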
**See Also**

- [S3 engine](../../engines/table-engines/integrations/s3.md)


@@ -151,4 +151,4 @@ ENGINE = S3('https://storage.yandexcloud.net/my-test-bucket-768/big_prefix/file-
**See also**
-- [S3 table function](../../../sql-reference/table-functions/s3.md)
+- [s3 table function](../../../sql-reference/table-functions/s3.md)


@@ -133,6 +133,30 @@ INSERT INTO FUNCTION s3('https://storage.yandexcloud.net/my-test-bucket-768/test
SELECT name, value FROM existing_table;
```
## Partitioned Write {#partitioned-write}

If you specify a `PARTITION BY` expression when inserting data into an `S3` table, a separate file is created for each value of the partitioning key. Splitting the data into separate files improves the efficiency of read operations.

**Examples**

1. Using partition ID in a key creates separate files:
```sql
INSERT INTO TABLE FUNCTION
s3('http://bucket.amazonaws.com/my_bucket/file_{_partition_id}.csv', 'CSV', 'a String, b UInt32, c UInt32')
PARTITION BY a VALUES ('x', 2, 3), ('x', 4, 5), ('y', 11, 12), ('y', 13, 14), ('z', 21, 22), ('z', 23, 24);
```
As a result, the data is written into three files: `file_x.csv`, `file_y.csv`, and `file_z.csv`.
2. Using partition ID in a bucket name creates files in different buckets:
```sql
INSERT INTO TABLE FUNCTION
s3('http://bucket.amazonaws.com/my_bucket_{_partition_id}/file.csv', 'CSV', 'a UInt32, b UInt32, c UInt32')
PARTITION BY a VALUES (1, 2, 3), (1, 4, 5), (10, 11, 12), (10, 13, 14), (20, 21, 22), (20, 23, 24);
```
As a result, three files are created in different buckets: `my_bucket_1/file.csv`, `my_bucket_10/file.csv`, and `my_bucket_20/file.csv`.
**See also**

- [S3 engine](../../engines/table-engines/integrations/s3.md)