ClickHouse/docs/en/faq/general.md

# General Questions {#general-questions}

## Why Not Use Something Like MapReduce? {#why-not-use-something-like-mapreduce}

Systems like MapReduce are distributed computing systems in which the reduce operation is based on distributed sorting. The most common open-source solution in this class is [Apache Hadoop](http://hadoop.apache.org). Yandex uses its in-house solution, YT.

These systems aren’t appropriate for online queries because of their high latency. In other words, they can’t be used as the back-end for a web interface. They also aren’t useful for real-time data updates.

Distributed sorting isn’t the best way to perform reduce operations when the result of the operation and all the intermediate results (if there are any) are held in the RAM of a single server, which is usually the case for online queries. In such a case, a hash table is an optimal way to perform reduce operations. A common approach to optimizing map-reduce tasks is pre-aggregation (partial reduce) using a hash table in RAM, but the user has to perform this optimization manually. Distributed sorting is one of the main causes of reduced performance when running simple map-reduce tasks.

Most MapReduce implementations allow you to execute arbitrary code on a cluster. However, a declarative query language is better suited to OLAP because it lets you run experiments quickly. For example, Hadoop has Hive and Pig. Also consider Cloudera Impala, Shark (outdated) for Spark, Spark SQL, Presto, and Apache Drill. Performance when running such tasks is highly sub-optimal compared to specialized systems, and relatively high latency makes it unrealistic to use these systems as the back-end for a web interface.
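
As a rough illustration of the declarative approach, the aggregation that a hand-written map-reduce job would implement can be expressed as a single `GROUP BY` query. The table `hits` and the column `user_id` are hypothetical names used only for this sketch.

``` sql
-- Hypothetical table and column names, for illustration only.
-- Count rows per key; the engine, not the user, decides how to aggregate
-- (for example, with an in-memory hash table).
SELECT
    user_id,
    count() AS views
FROM hits
GROUP BY user_id
ORDER BY views DESC
LIMIT 10
```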

## What If I Have a Problem with Encodings When Using Oracle Through ODBC? {#oracle-odbc-encodings}

If you use Oracle through the ODBC driver as a source of external dictionaries, you need to set the correct value for the `NLS_LANG` environment variable in `/etc/default/clickhouse`. For more information, see the [Oracle NLS\_LANG FAQ](https://www.oracle.com/technetwork/products/globalization/nls-lang-099431.html).

**Example**

``` bash
NLS_LANG=RUSSIAN_RUSSIA.UTF8
```

## How Do I Export Data from ClickHouse to a File? {#how-to-export-to-file}

### Using INTO OUTFILE Clause {#using-into-outfile-clause}

Add an [INTO OUTFILE](../query_language/select/#into-outfile-clause) clause to your query.

For example:

``` sql
SELECT * FROM table INTO OUTFILE 'file'
```

By default, ClickHouse uses the [TabSeparated](../interfaces/formats.md#tabseparated) format for output data. To select the [data format](../interfaces/formats.md), use the [FORMAT clause](../query_language/select/#format-clause).

For example:

``` sql
SELECT * FROM table INTO OUTFILE 'file' FORMAT CSV
```

### Using a File-Engine Table {#using-a-file-engine-table}

See [File](../operations/table_engines/file.md).
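
A minimal sketch of this approach, assuming a hypothetical source table `source_table` with columns `name` and `value`; the table name `export_table` is also chosen only for illustration:

``` sql
-- Hypothetical table and column names, for illustration only.
-- Create a table whose data is stored as a file in CSV format.
CREATE TABLE export_table (name String, value UInt32) ENGINE = File(CSV);

-- Rows inserted into the table are written to that file.
INSERT INTO export_table SELECT name, value FROM source_table;
```

With this setup, the server is expected to keep the data in a file inside the table's directory under its data path. The exact location depends on the server configuration, so check the File engine page before relying on it.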

### Using Command-Line Redirection {#using-command-line-redirection}

``` bash
$ clickhouse-client --query "SELECT * FROM table" --format FormatName > result.txt
```

See [clickhouse-client](../interfaces/cli.md).

{## [Original article](https://clickhouse.tech/docs/en/faq/general/) ##}