---
machine_translated: true
machine_translated_rev: 72537a2d527c63c07aa5d2361a8829f3895cf2bd
toc_priority: 45
toc_title: hdfs
---
# hdfs {#hdfs}

Creates a table from files in HDFS. This table function is similar to the [url](url.md) and [file](file.md) table functions.
``` sql
hdfs(URI, format, structure)
```
**Input parameters**

- `URI` — The relative URI to the file in HDFS. The path to the file supports the following globs in read-only mode: `*`, `?`, `{abc,def}` and `{N..M}`, where `N`, `M` are numbers and `'abc', 'def'` are strings.
- `format` — The [format](../../interfaces/formats.md#formats) of the file.
- `structure` — Structure of the table. Format `'column1_name column1_type, column2_name column2_type, ...'`.

**Returned value**

A table with the specified structure for reading or writing data in the specified file.

**Example**

Create a table from `hdfs://hdfs1:9000/test` and select the first two rows from it:
``` sql
SELECT *
FROM hdfs('hdfs://hdfs1:9000/test', 'TSV', 'column1 UInt32, column2 UInt32, column3 UInt32')
LIMIT 2
```
``` text
┌─column1─┬─column2─┬─column3─┐
│ 1 │ 2 │ 3 │
│ 3 │ 2 │ 1 │
└─────────┴─────────┴─────────┘
```
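The same table function can also be used to write data. Below is a minimal sketch of inserting rows through it; the target URI `hdfs://hdfs1:9000/test_out` is hypothetical, and the sketch assumes your ClickHouse version allows inserts into the `hdfs` table function and that the target file can be created:

``` sql
-- Write two rows into a TSV file on HDFS (hypothetical target path)
INSERT INTO TABLE FUNCTION
    hdfs('hdfs://hdfs1:9000/test_out', 'TSV', 'column1 UInt32, column2 UInt32, column3 UInt32')
VALUES (1, 2, 3), (3, 2, 1)
```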
**Globs in path**

Multiple path components can have globs. For a file to be processed, it must exist and match the whole path pattern (not only the suffix or prefix).

- `*` — Substitutes any number of any characters except `/`, including the empty string.
- `?` — Substitutes any single character.
- `{some_string,another_string,yet_another_one}` — Substitutes any of the strings `'some_string', 'another_string', 'yet_another_one'`.
- `{N..M}` — Substitutes any number in the range from N to M, including both borders.

Constructions with `{}` are similar to the [remote](../../sql-reference/table-functions/remote.md) table function.

**Example**

1. Suppose that we have several files with the following URIs on HDFS:

- 'hdfs://hdfs1:9000/some_dir/some_file_1'
- 'hdfs://hdfs1:9000/some_dir/some_file_2'
- 'hdfs://hdfs1:9000/some_dir/some_file_3'
- 'hdfs://hdfs1:9000/another_dir/some_file_1'
- 'hdfs://hdfs1:9000/another_dir/some_file_2'
- 'hdfs://hdfs1:9000/another_dir/some_file_3'

1. Query the amount of rows in these files:
<!-- -->
``` sql
SELECT count(*)
FROM hdfs('hdfs://hdfs1:9000/{some,another}_dir/some_file_{1..3}', 'TSV', 'name String, value UInt32')
```
1. Query the amount of rows in all files of these two directories:
<!-- -->
``` sql
SELECT count(*)
FROM hdfs('hdfs://hdfs1:9000/{some,another}_dir/*', 'TSV', 'name String, value UInt32')
```
!!! warning "Warning"
    If your listing of files contains number ranges with leading zeros, use the construction with braces for each digit separately or use `?`.

**Example**

Query the data from files named `file000`, `file001`, … , `file999`:
``` sql
SELECT count(*)
FROM hdfs('hdfs://hdfs1:9000/big_dir/file{0..9}{0..9}{0..9}', 'CSV', 'name String, value UInt32')
```
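Since each name in this range ends in exactly three characters, the same files could also be matched with the `?` glob. A sketch, assuming no other files in `big_dir` happen to match the pattern:

``` sql
-- Each ? matches exactly one character, so file??? covers file000 … file999
SELECT count(*)
FROM hdfs('hdfs://hdfs1:9000/big_dir/file???', 'CSV', 'name String, value UInt32')
```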
## Virtual Columns {#virtual-columns}
- `_path` — Path to the file.
- `_file` — Name of the file.
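<!-- -->

A minimal sketch of using these columns to see how many rows come from each file; it reuses the glob query from the examples above, and the actual output depends on the files you have:

``` sql
-- Group the rows by the name of the file they were read from
SELECT
    _file,
    count() AS rows
FROM hdfs('hdfs://hdfs1:9000/{some,another}_dir/*', 'TSV', 'name String, value UInt32')
GROUP BY _file
```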
**See Also**

- [Virtual columns](https://clickhouse.tech/docs/en/operations/table_engines/#table_engines-virtual_columns)

[Original article](https://clickhouse.tech/docs/en/query_language/table_functions/hdfs/) <!--hide-->