Translate Eng docs to Chinese.

Signed-off-by: Waterkin <1055905911@qq.com>
Waterkin 2024-01-17 13:33:02 +08:00
parent 7291674013
commit ef044842c5
5 changed files with 84 additions and 84 deletions


@@ -1,27 +1,27 @@
---
slug: /zh/faq/general/ne-tormozit
title: "What does \u201C\u043D\u0435 \u0442\u043E\u0440\u043C\u043E\u0437\u0438\u0442\
\u201D mean?"
title: "\u201C\u043D\u0435 \u0442\u043E\u0440\u043C\u043E\u0437\u0438\u0442\
\u201D 是什么意思?"
toc_hidden: true
sidebar_position: 11
---
# What Does “Не тормозит” Mean? {#what-does-ne-tormozit-mean}
# “Не тормозит” 是什么意思? {#what-does-ne-tormozit-mean}
This question usually arises when people see official ClickHouse t-shirts. They have large words **“ClickHouse не тормозит”** on the front.
这个问题通常出现在人们看到官方 ClickHouse T恤时。它们的正面印有大字**“ClickHouse Не тормозит”**。
Before ClickHouse became open-source, it was developed as an in-house storage system by the largest Russian IT company, [Yandex](https://yandex.com/company/). That's why it initially got its slogan in Russian, which is “не тормозит” (pronounced as “ne tormozit”). After the open-source release we first produced some of those t-shirts for events in Russia and it was a no-brainer to use the slogan as-is.
在 ClickHouse 开源之前,它作为俄罗斯最大的 IT 公司 [Yandex](https://yandex.com/company/) 的内部存储系统而开发。这就是为什么它最初获得了俄文口号“Не тормозит”(发音为“ne tormozit”)。在开源发布后,我们首先为俄罗斯的活动制作了一些这样的 T恤,使用原汁原味的口号是理所当然的。
One of the following batches of those t-shirts was supposed to be given away at events outside of Russia and we tried to make an English version of the slogan. Unfortunately, the Russian language is kind of elegant in terms of expressing stuff and there was a restriction of limited space on a t-shirt, so we failed to come up with a good enough translation (most options appeared to be either long or inaccurate) and decided to keep the slogan in Russian even on t-shirts produced for international events. It appeared to be a great decision because people all over the world get positively surprised and curious when they see it.
其中一批这样的 T恤原本打算在俄罗斯之外的活动中赠送,我们尝试制作口号的英文版本。不幸的是,俄语在表达事物方面有种优雅,而且 T恤上的空间有限,所以我们未能提出足够好的翻译(大多数选项要么太长,要么不够准确),并决定即使在为国际活动制作的 T恤上也保留俄文口号。这被证明是一个绝妙的决定,因为全世界的人们看到它时都会感到惊喜和好奇。
So, what does it mean? Here are some ways to translate *“не тормозит”*:
那么,它是什么意思呢?以下是翻译“Не тормозит”的一些方式:
- If you translate it literally, it'd be something like *“ClickHouse does not press the brake pedal”*.
- If you'd want to express it as close to how it sounds to a Russian person with IT background, it'd be something like *“If your larger system lags, it's not because it uses ClickHouse”*.
- Shorter, but not so precise versions could be *“ClickHouse is not slow”*, *“ClickHouse does not lag”* or just *“ClickHouse is fast”*.
- 如果你直译,那就是“ClickHouse 不踩刹车”。
- 如果你想尽可能接近一个有 IT 背景的俄罗斯人的听觉感受,那就是“如果你的大型系统延迟,不是因为它使用了 ClickHouse”。
- 更短但不那么精确的版本可能是“ClickHouse 不慢”、“ClickHouse 不卡顿”或仅仅“ClickHouse 很快”。
If you haven't seen one of those t-shirts in person, you can check them out online in many ClickHouse-related videos. For example, this one:
如果您还没有亲眼见过这些 T恤,可以在许多与 ClickHouse 相关的视频中在线查看。例如,这个:
![iframe](https://www.youtube.com/embed/bSyQahMVZ7w)
P.S. These t-shirts are not for sale; they are given away for free at most [ClickHouse Meetups](https://clickhouse.com/#meet), usually for the best questions or other forms of active participation.
附言:这些 T恤不出售,它们在大多数 [ClickHouse 聚会](https://clickhouse.com/#meet)上免费赠送,通常奖励给提出最佳问题或以其他形式积极参与的人。


@@ -1,63 +1,63 @@
---
slug: /zh/faq/general/why-clickhouse-is-so-fast
title: Why is ClickHouse so fast?
title: 为什么 ClickHouse 如此快速?
toc_hidden: true
sidebar_position: 8
---
# Why Is ClickHouse So Fast? {#why-clickhouse-is-so-fast}
# 为什么 ClickHouse 如此快速? {#why-clickhouse-is-so-fast}
It was designed to be fast. Query execution performance has always been a top priority during the development process, but other important characteristics like user-friendliness, scalability, and security were also considered so ClickHouse could become a real production system.
它被设计成一个快速的系统。在开发过程中,查询执行性能一直是首要考虑的优先级,但也考虑了其他重要特性,如用户友好性、可扩展性和安全性,使 ClickHouse 成为一个真正的生产系统。
ClickHouse was initially built as a prototype to do just a single task well: to filter and aggregate data as fast as possible. That's what needs to be done to build a typical analytical report and that's what a typical [GROUP BY](../../sql-reference/statements/select/group-by.md) query does. The ClickHouse team has made several high-level decisions that, combined, made achieving this task possible:
ClickHouse 最初是作为一个原型构建的,它的单一任务就是尽可能快速地过滤和聚合数据。这正是构建典型分析报告所需做的,也是典型 [GROUP BY](../../sql-reference/statements/select/group-by.md) 查询所做的。ClickHouse 团队做出了几个高层次的决策,这些决策组合在一起使得实现这一任务成为可能:
Column-oriented storage
: Source data often contain hundreds or even thousands of columns, while a report can use just a few of them. The system needs to avoid reading unnecessary columns, or most expensive disk read operations would be wasted.
列式存储
: 源数据通常包含数百甚至数千列,而报告可能只使用其中的几列。系统需要避免读取不必要的列,否则大部分昂贵的磁盘读取操作将被浪费。
Indexes
: ClickHouse keeps data structures in memory that allow reading not just the columns a query uses, but only the necessary row ranges of those columns.
索引
: ClickHouse 在内存中保留数据结构,允许不仅读取使用的列,而且只读取这些列的必要行范围。
Data compression
: Storing different values of the same column together often leads to better compression ratios (compared to row-oriented systems) because in real data a column often has the same value, or only a few distinct values, in neighboring rows. In addition to general-purpose compression, ClickHouse supports [specialized codecs](../../sql-reference/statements/create/table.mdx/#create-query-specialized-codecs) that can make data even more compact.
数据压缩
: 将同一列的不同值存储在一起通常会导致更好的压缩比(与行式系统相比),因为在实际数据中,列通常对相邻行有相同或不太多的不同值。除了通用压缩之外,ClickHouse 还支持 [专用编解码器](../../sql-reference/statements/create/table.mdx/#create-query-specialized-codecs),可以使数据更加紧凑。
Vectorized query execution
: ClickHouse not only stores data in columns but also processes data in columns. It leads to better CPU cache utilization and allows for [SIMD](https://en.wikipedia.org/wiki/SIMD) CPU instructions usage.
向量化查询执行
: ClickHouse 不仅以列的形式存储数据,而且以列的形式处理数据。这导致更好的 CPU 缓存利用率,并允许使用 [SIMD](https://en.wikipedia.org/wiki/SIMD) CPU 指令。
Scalability
: ClickHouse can leverage all available CPU cores and disks to execute even a single query. Not only on a single server but all CPU cores and disks of a cluster as well.
可扩展性
: ClickHouse 可以利用所有可用的 CPU 核心和磁盘来执行甚至是单个查询。不仅在单个服务器上,而且在集群的所有 CPU 核心和磁盘上。
But many other database management systems use similar techniques. What really makes ClickHouse stand out is **attention to low-level details**. Most programming languages provide implementations for most common algorithms and data structures, but they tend to be too generic to be effective. Every task can be considered as a landscape with various characteristics, instead of just throwing in a random implementation. For example, if you need a hash table, here are some key questions to consider:
但许多其他数据库管理系统也使用类似的技术。真正使 ClickHouse 脱颖而出的是 **对底层细节的关注**。大多数编程语言为最常见的算法和数据结构提供了实现,但它们往往过于通用而无法高效。每个任务都可以被视为具有各种特征的景观,而不是仅仅随意投入某个实现。例如,如果您需要一个哈希表,这里有一些关键问题需要考虑:
- Which hash function to choose?
- Collision resolution algorithm: [open addressing](https://en.wikipedia.org/wiki/Open_addressing) vs [chaining](https://en.wikipedia.org/wiki/Hash_table#Separate_chaining)?
- Memory layout: one array for keys and values or separate arrays? Will it store small or large values?
- Fill factor: when and how to resize? How to move values around on resize?
- Will values be removed, and if so, which algorithm will work better?
- Will we need fast probing with bitmaps, inline placement of string keys, support for non-movable values, prefetch, and batching?
- 选择哪种哈希函数?
- 冲突解决算法:[开放寻址](https://en.wikipedia.org/wiki/Open_addressing)还是[链接](https://en.wikipedia.org/wiki/Hash_table#Separate_chaining)?
- 内存布局:一个数组用于键和值还是分开的数组?它会存储小值还是大值?
- 填充因子:何时以及如何调整大小?在调整大小时如何移动值?
- 是否会移除值,如果会,哪种算法会更好?
- 我们是否需要使用位图进行快速探测,字符串键的内联放置,对不可移动值的支持,预取和批处理?
The hash table is a key data structure for the `GROUP BY` implementation, and ClickHouse automatically chooses one of [30+ variations](https://github.com/ClickHouse/ClickHouse/blob/master/src/Interpreters/Aggregator.h) for each specific query.
哈希表是 `GROUP BY` 实现的关键数据结构,ClickHouse 会根据每个特定查询自动选择 [30 多种变体](https://github.com/ClickHouse/ClickHouse/blob/master/src/Interpreters/Aggregator.h) 中的一种。
The same goes for algorithms. For example, in sorting you might consider:
算法也是如此,例如,在排序中,您可能会考虑:
- What will be sorted: an array of numbers, tuples, strings, or structures?
- Is all data available completely in RAM?
- Do we need a stable sort?
- Do we need a full sort? Maybe partial sort or n-th element will suffice?
- How to implement comparisons?
- Are we sorting data that has already been partially sorted?
- 将要排序的是数字数组、元组、字符串还是结构?
- 所有数据是否都能完全放入 RAM?
- 我们需要稳定排序吗?
- 我们需要完全排序吗?也许部分排序或第 n 个元素就足够了?
- 如何实现比较?
- 我们正在对已经部分排序的数据进行排序吗?
Algorithms that rely on the characteristics of the data they are working with can often do better than their generic counterparts. If these characteristics are not known in advance, the system can try various implementations and choose the one that works best at runtime. For example, see an [article on how LZ4 decompression is implemented in ClickHouse](https://habr.com/en/company/yandex/blog/457612/).
依赖于所处理数据特性的算法,往往可以比通用算法做得更好。如果事先并不知道这些特性,系统可以尝试各种实现,并在运行时选择最佳的一种。例如,参见这篇关于 [ClickHouse 中 LZ4 解压缩是如何实现的文章](https://habr.com/en/company/yandex/blog/457612/)。
Last but not least, the ClickHouse team always monitors the Internet for people claiming that they came up with the best implementation, algorithm, or data structure to do something, and tries it out. Those claims mostly appear to be false, but from time to time you'll indeed find a gem.
最后但同样重要的是,ClickHouse 团队始终关注互联网上声称自己提出了最佳实现、算法或数据结构的人,并加以尝试。这些说法大多是不实的,但有时你确实会找到一颗宝石。
:::info Tips for building your own high-performance software
- Keep in mind low-level details when designing your system.
- Design based on hardware capabilities.
- Choose data structures and abstractions based on the needs of the task.
- Provide specializations for special cases.
- Try new, “best” algorithms that you read about yesterday.
- Choose an algorithm in runtime based on statistics.
- Benchmark on real datasets.
- Test for performance regressions in CI.
- Measure and observe everything.
:::info 构建高性能软件的提示
- 设计系统时要考虑到底层细节。
- 基于硬件能力进行设计。
- 根据任务的需求选择数据结构和抽象。
- 为特殊情况提供专门化。
- 尝试您昨天才读到的新的“最佳”算法。
- 根据统计数据在运行时选择算法。
- 在真实数据集上进行基准测试。
- 在 CI 中测试性能回归。
- 测量并观察一切。
:::


@@ -1,35 +1,35 @@
---
slug: /zh/faq/integration/json-import
title: How to import JSON into ClickHouse?
title: 如何将 JSON 导入到 ClickHouse?
toc_hidden: true
sidebar_position: 11
---
# How to Import JSON Into ClickHouse? {#how-to-import-json-into-clickhouse}
# 如何将 JSON 导入到 ClickHouse? {#how-to-import-json-into-clickhouse}
ClickHouse supports a wide range of [data formats for input and output](../../interfaces/formats.md). There are multiple JSON variations among them, but the most commonly used for data ingestion is [JSONEachRow](../../interfaces/formats.md#jsoneachrow). It expects one JSON object per row, each object separated by a newline.
ClickHouse 支持多种[输入和输出的数据格式](../../interfaces/formats.md)。其中包括多种 JSON 变体,但最常用于数据导入的是 [JSONEachRow](../../interfaces/formats.md#jsoneachrow)。它期望每行一个 JSON 对象,每个对象由一个新行分隔。
## Examples {#examples}
## 示例 {#examples}
Using [HTTP interface](../../interfaces/http.md):
使用 [HTTP 接口](../../interfaces/http.md):
``` bash
$ echo '{"foo":"bar"}' | curl 'http://localhost:8123/?query=INSERT%20INTO%20test%20FORMAT%20JSONEachRow' --data-binary @-
```
Using [CLI interface](../../interfaces/cli.md):
使用 [CLI 接口](../../interfaces/cli.md):
``` bash
$ echo '{"foo":"bar"}' | clickhouse-client --query="INSERT INTO test FORMAT JSONEachRow"
```
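The one-object-per-line layout that `JSONEachRow` expects can be sketched with a multi-row payload. This is only an illustration: the table `test` and its single column `foo` are assumed from the examples above, and the payload is printed here rather than sent to a server.

``` bash
# Three rows for an assumed table `test` with a String column `foo`.
# JSONEachRow wants one complete JSON object per line, with no enclosing
# array and no commas between objects.
payload=$(printf '{"foo":"%s"}\n' bar baz qux)

# Count the objects: every line should start with an opening brace.
rows=$(printf '%s\n' "$payload" | grep -c '^{')
echo "$rows rows ready"
printf '%s\n' "$payload"

# To insert them, pipe the payload to either interface shown above, e.g.:
#   printf '%s\n' "$payload" | clickhouse-client --query="INSERT INTO test FORMAT JSONEachRow"
```

Building the payload separately from the insert makes it easy to inspect what will be sent before any data reaches the table.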
Instead of inserting data manually, you might consider using one of the [client libraries](../../interfaces/index.md).
除了手动插入数据外,您可能会考虑使用 [客户端库](../../interfaces/index.md) 之一。
## Useful Settings {#useful-settings}
## 实用设置 {#useful-settings}
- `input_format_skip_unknown_fields` allows inserting JSON even if it contains additional fields that are not present in the table schema (they are discarded).
- `input_format_import_nested_json` allows inserting nested JSON objects into columns of [Nested](../../sql-reference/data-types/nested-data-structures/nested.md) type.
- `input_format_skip_unknown_fields` 允许插入 JSON,即使其中存在表结构中没有的额外字段(这些字段会被丢弃)。
- `input_format_import_nested_json` 允许将嵌套 JSON 对象插入到 [Nested](../../sql-reference/data-types/nested-data-structures/nested.md) 类型的列中。
:::note
Settings are specified as `GET` parameters for the HTTP interface or as additional command-line arguments prefixed with `--` for the `CLI` interface.
对于 HTTP 接口,设置作为 `GET` 参数指定;对于 `CLI` 接口,则作为前缀为 `--` 的附加命令行参数。
:::
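As a rough sketch of the note above, the snippet below builds the HTTP URL with `input_format_skip_unknown_fields` passed as a `GET` parameter. The host, port, and table `test` are the ones assumed in the earlier examples, the value `1` and the extra field `baz` are illustrative, and the commands that would contact a server are left as comments.

``` bash
# Endpoint and URL-encoded query reused from the HTTP example above.
host='http://localhost:8123'
query='INSERT%20INTO%20test%20FORMAT%20JSONEachRow'

# The setting travels as an ordinary GET parameter next to the query.
url="${host}/?input_format_skip_unknown_fields=1&query=${query}"
echo "$url"

# Sending a row with an extra field "baz" (hypothetical, not in the table):
#   echo '{"foo":"bar","baz":1}' | curl "$url" --data-binary @-
# CLI equivalent, with the same setting as a "--"-prefixed argument:
#   echo '{"foo":"bar","baz":1}' | clickhouse-client --input_format_skip_unknown_fields=1 --query="INSERT INTO test FORMAT JSONEachRow"
```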


@@ -1,16 +1,16 @@
---
slug: /zh/faq/integration/oracle-odbc
title: What if I have a problem with encodings when using Oracle via ODBC?
title: 使用 Oracle ODBC 时遇到编码问题怎么办?
toc_hidden: true
sidebar_position: 20
---
# What If I Have a Problem with Encodings When Using Oracle Via ODBC? {#oracle-odbc-encodings}
# 使用 Oracle ODBC 时遇到编码问题怎么办? {#oracle-odbc-encodings}
If you use Oracle as a source of ClickHouse external dictionaries via Oracle ODBC driver, you need to set the correct value for the `NLS_LANG` environment variable in `/etc/default/clickhouse`. For more information, see the [Oracle NLS_LANG FAQ](https://www.oracle.com/technetwork/products/globalization/nls-lang-099431.html).
如果您通过 Oracle ODBC 驱动程序使用 Oracle 作为 ClickHouse 外部字典的数据源,您需要在 `/etc/default/clickhouse` 中为 `NLS_LANG` 环境变量设置正确的值。更多信息,请参阅 [Oracle NLS_LANG FAQ](https://www.oracle.com/technetwork/products/globalization/nls-lang-099431.html)。
**Example**
**示例**
``` bash
NLS_LANG=RUSSIAN_RUSSIA.UTF8
```
```


@@ -1,44 +1,44 @@
---
slug: /zh/faq/operations/delete-old-data
title: Is it possible to delete old records from a ClickHouse table?
title: 是否可以从 ClickHouse 表中删除旧记录?
toc_hidden: true
sidebar_position: 20
---
# Is It Possible to Delete Old Records from a ClickHouse Table? {#is-it-possible-to-delete-old-records-from-a-clickhouse-table}
# 是否可以从 ClickHouse 表中删除旧记录? {#is-it-possible-to-delete-old-records-from-a-clickhouse-table}
The short answer is “yes”. ClickHouse has multiple mechanisms that allow freeing up disk space by removing old data. Each mechanism is aimed at different scenarios.
简短的答案是“可以”。ClickHouse 具有多种机制,允许通过删除旧数据来释放磁盘空间。每种机制都针对不同的场景。
## TTL {#ttl}
ClickHouse can automatically drop values when some condition is met. This condition is configured as an expression based on any columns, usually just a static offset from a timestamp column.
ClickHouse允许在某些条件发生时自动删除值。这个条件被配置为基于任何列的表达式,通常只是针对任何时间戳列的静态偏移量。
The key advantage of this approach is that it does not need any external system to trigger it: once TTL is configured, data removal happens automatically in the background.
这种方法的主要优势是它不需要任何外部系统来触发:一旦配置了 TTL,数据删除就会自动在后台发生。
:::note
TTL can also be used to move data not only to [/dev/null](https://en.wikipedia.org/wiki/Null_device), but also between different storage systems, like from SSD to HDD.
TTL 不仅可以把数据移到 [/dev/null](https://en.wikipedia.org/wiki/Null_device),还可以在不同存储系统之间移动数据,例如从 SSD 到 HDD。
:::
More details on [configuring TTL](../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-ttl).
有关 [配置TTL](../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-ttl) 的更多详细信息。
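A minimal sketch of a table-level TTL with a static offset from a timestamp column follows. The table name `events`, its columns, and the 30-day window are hypothetical placeholders; the statement is only printed here, and the comment shows how it could be submitted.

``` bash
# Hypothetical table: `events`, `event_time`, and the 30-day retention
# window are illustrative, not taken from the documentation.
ttl_sql=$(cat <<'EOF'
CREATE TABLE events
(
    event_time DateTime,
    message String
)
ENGINE = MergeTree
ORDER BY event_time
TTL event_time + INTERVAL 30 DAY
EOF
)
echo "$ttl_sql"

# Submit it with: clickhouse-client --query="$ttl_sql"
```

With a definition like this, rows become eligible for removal in the background once `event_time` is more than 30 days in the past, with no external scheduler involved.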
## ALTER DELETE {#alter-delete}
ClickHouse does not have real-time point deletes like in [OLTP](https://en.wikipedia.org/wiki/Online_transaction_processing) databases. The closest things to them are mutations. They are issued as `ALTER ... DELETE` or `ALTER ... UPDATE` queries to distinguish them from normal `DELETE` or `UPDATE`, as they are asynchronous batch operations, not immediate modifications. The rest of the syntax after the `ALTER TABLE` prefix is similar.
ClickHouse没有像[OLTP](https://en.wikipedia.org/wiki/Online_transaction_processing)数据库那样的实时点删除。最接近的东西是突变。它们被发布为`ALTER ... DELETE`或`ALTER ... UPDATE`查询,以区别于普通的`DELETE`或`UPDATE`,因为它们是异步批处理操作,而不是立即修改。`ALTER TABLE`前缀后的其余语法相似。
`ALTER DELETE` can be issued to flexibly remove old data. If you need to do it regularly, the main downside will be the need to have an external system to submit the query. There are also some performance considerations, since mutations rewrite complete parts even if there's only a single row to be deleted.
`ALTER DELETE`可以灵活地用来删除旧数据。如果你需要定期这样做,主要缺点将是需要有一个外部系统来提交查询。还有一些性能方面的考虑,因为即使只有一行要被删除,突变也会重写完整部分。
This is the most common approach to make your system based on ClickHouse [GDPR](https://gdpr-info.eu)-compliant.
这是使基于ClickHouse的系统符合[GDPR](https://gdpr-info.eu)的最常见方法。
More details on [mutations](../../sql-reference/statements/alter.md/#alter-mutations).
有关[突变](../../sql-reference/statements/alter.md/#alter-mutations)的更多详细信息。
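The following sketch shows what such a regularly submitted mutation might look like. The table `events`, the column `event_time`, and the 90-day retention period are hypothetical; the cron line in the comment is just one possible external trigger.

``` bash
# Hypothetical retention policy for an assumed table `events`.
retention_days=90
delete_sql="ALTER TABLE events DELETE WHERE event_time < now() - INTERVAL ${retention_days} DAY"
echo "$delete_sql"

# An external scheduler has to submit the mutation, for example a nightly
# cron entry along these lines (illustrative):
#   0 3 * * * clickhouse-client --query="ALTER TABLE events DELETE WHERE event_time < now() - INTERVAL 90 DAY"
```

Because the mutation runs asynchronously and rewrites whole parts, the deleted rows may remain visible for some time after the query returns.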
## DROP PARTITION {#drop-partition}
`ALTER TABLE ... DROP PARTITION` provides a cost-efficient way to drop a whole partition. It's not that flexible and needs a proper partitioning scheme configured on table creation, but it still covers most common cases. Like mutations, it needs to be executed from an external system for regular use.
`ALTER TABLE ... DROP PARTITION`提供了一种成本效率高的方式来删除整个分区。它不是那么灵活,需要在创建表时配置适当的分区方案,但仍然涵盖了大多数常见情况。像突变一样,需要从外部系统执行以进行常规使用。
More details on [manipulating partitions](../../sql-reference/statements/alter/partition.mdx/#alter_drop-partition).
有关[操作分区](../../sql-reference/statements/alter/partition.mdx/#alter_drop-partition)的更多详细信息。
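As a hedged illustration, the sketch below assumes a hypothetical table `events` created with `PARTITION BY toYYYYMM(event_time)`, so that each partition holds one calendar month and is identified by a value such as `202301`.

``` bash
# Assumes monthly partitions (PARTITION BY toYYYYMM(event_time)) on a
# hypothetical table `events`; 202301 stands for January 2023.
partition=202301
drop_sql="ALTER TABLE events DROP PARTITION ${partition}"
echo "$drop_sql"

# Submit it with: clickhouse-client --query="$drop_sql"
```

Dropping a whole month this way removes the partition's parts directly instead of rewriting them, which is why it is cheaper than a mutation but only works at partition granularity.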
## TRUNCATE {#truncate}
It's rather radical to drop all data from a table, but in some cases it might be exactly what you need.
从表中删除所有数据是相当激进的,但在某些情况下可能正是您所需要的。
More details on [table truncation](../../sql-reference/statements/truncate.md).
有关[表截断](../../sql-reference/statements/truncate.md)的更多详细信息。