normalize markdown

This commit is contained in:
Ivan Blinkov 2020-06-22 16:39:48 +03:00
parent 652882ef4e
commit bd964cb768
10 changed files with 74 additions and 74 deletions

View File

@ -1,11 +1,11 @@
---
toc_priority: 10
toc_hidden: true
toc_priority: 10
---
# What Does "ClickHouse" Mean?
# What Does “ClickHouse” Mean? {#what-does-clickhouse-mean}
It's a combination of "**Click**stream" and "Data ware**house**". It comes from the original use case at Yandex.Metrica, where ClickHouse was supposed to keep records of all clicks by people from all over the Internet and it still does the job. You can read more about this use case on [ClickHouse history](../../introduction/history.md) page.
Its a combination of “**Click**stream” and “Data ware**house**”. It comes from the original use case at Yandex.Metrica, where ClickHouse was supposed to keep records of all clicks by people from all over the Internet and it still does the job. You can read more about this use case on [ClickHouse history](../../introduction/history.md) page.
!!! info "Fun fact"
Many years after ClickHouse got its name, this approach of combining two words that are meaningful on their own has been highlighted as the best way to name a database in a [research by Andy Pavlo](https://www.cs.cmu.edu/~pavlo/blog/2020/03/on-naming-a-database-management-system.html), an Associate Professor of Databases at Carnegie Mellon University. ClickHouse shared his "best database name of all time" award with Postgres.
Many years after ClickHouse got its name, this approach of combining two words that are meaningful on their own has been highlighted as the best way to name a database in a [research by Andy Pavlo](https://www.cs.cmu.edu/~pavlo/blog/2020/03/on-naming-a-database-management-system.html), an Associate Professor of Databases at Carnegie Mellon University. ClickHouse shared his “best database name of all time” award with Postgres.

View File

@ -1,18 +1,18 @@
---
toc_hidden_folder: true
toc_priority: 1
toc_title: General
toc_hidden_folder: true
---
# General Questions about ClickHouse {#general-questions}
# General Questions About ClickHouse {#general-questions}
Questions:
- [What does "ClickHouse" mean?](dbms-naming.md)
- [What does "Не тормозит" mean?](ne-tormozit.md)
- [Why not use something like MapReduce?](mapreduce.md)
- [What does “ClickHouse” mean?](../../faq/general/dbms-naming.md)
- [What does “Не тормозит” mean?](../../faq/general/ne-tormozit.md)
- [Why not use something like MapReduce?](../../faq/general/mapreduce.md)
!!! info "Not see what you were locking for?"
Check out [other F.A.Q. categories](../index.md) or browse around main documentation articles found in the left sidebar.
Check out [other F.A.Q. categories](../../faq/index.md) or browse around main documentation articles found in the left sidebar.
{## [Original article](https://clickhouse.tech/docs/en/faq/general/) ##}

View File

@ -1,9 +1,9 @@
---
toc_priority: 20
toc_hidden: true
toc_priority: 20
---
# Why not Use Something Like MapReduce? {#why-not-use-something-like-mapreduce}
# Why Not Use Something Like MapReduce? {#why-not-use-something-like-mapreduce}
We can refer to systems like MapReduce as distributed computing systems in which the reduce operation is based on distributed sorting. The most common open-source solution in this class is [Apache Hadoop](http://hadoop.apache.org). Yandex uses its in-house solution, YT.

View File

@ -1,23 +1,23 @@
---
toc_priority: 11
toc_hidden: true
toc_priority: 11
---
# What does "Не тормозит" mean?
# What Does “Не тормозит” mean? {#what-does-ne-tormozit-mean}
This question usually arises when people see official ClickHouse t-shirts. They have large words **"ClickHouse не тормозит"** on the front.
This question usually arises when people see official ClickHouse t-shirts. They have large words **“ClickHouse не тормозит”** on the front.
Before ClickHouse became open-source, it has been developed as an in-house storage system by the largest Russian IT company, [Yandex](https://yandex.com/company/). That's why it initially got its slogan in Russian, which is "не тормозит". After the open-source release we first produced some of those t-shirts for events in Russia and it was a no-brainer to use the slogan as-is.
Before ClickHouse became open-source, it has been developed as an in-house storage system by the largest Russian IT company, [Yandex](https://yandex.com/company/). Thats why it initially got its slogan in Russian, which is “не тормозит”. After the open-source release we first produced some of those t-shirts for events in Russia and it was a no-brainer to use the slogan as-is.
One of the following batches of those t-shirts was supposed to be given away on events outside of Russia and we tried to make the English version of the slogan. Unfortunately, the Russian language is kind of elegant in terms of expressing stuff and there was a restriction of limited space on a t-shirt, so we failed to come up with good enough translation (most options appeared to be either long or inaccurate) and decided to keep the slogan in Russian even on t-shirts produced for international events. It appeared to be a great decision because people all over the world get positively surprised and curious when they see it.
So, what does it mean? Here are some ways to translate *"не тормозит"*:
So, what does it mean? Here are some ways to translate *“не тормозит”*:
- If you translate it literally, it'd be something like *"ClickHouse doesn't press the brake pedal"*.
- If you'd want to express it as close to how it sounds to a Russian person with IT background, it'd be something like *"If you larger system lags, it's not because it uses ClickHouse"*.
- Shorter, but not so precise versions could be *"ClickHouse is not slow"*, *"ClickHouse doesn't lag"* or just *"ClickHouse is fast"*.
- If you translate it literally, itd be something like *“ClickHouse doesnt press the brake pedal”*.
- If youd want to express it as close to how it sounds to a Russian person with IT background, itd be something like *“If you larger system lags, its not because it uses ClickHouse”*.
- Shorter, but not so precise versions could be *“ClickHouse is not slow”*, *“ClickHouse doesnt lag”* or just *“ClickHouse is fast”*.
If you haven't seen one of those t-shirts in person, you can check them out online in many ClickHouse-related videos. For example, this one:
If you havent seen one of those t-shirts in person, you can check them out online in many ClickHouse-related videos. For example, this one:
![iframe](https://www.youtube.com/embed/bSyQahMVZ7w)

View File

@ -4,7 +4,7 @@ toc_hidden: true
toc_priority: 76
---
# ClickHouse F.A.Q.
# ClickHouse F.A.Q {#clickhouse-f-a-q}
This section of the documentation is a place to collect answers to ClickHouse-related questions that arise often.

View File

@ -1,6 +1,6 @@
---
toc_priority: 10
toc_hidden: true
toc_priority: 10
---
# How Do I Export Data from ClickHouse to a File? {#how-to-export-to-file}

View File

@ -1,17 +1,17 @@
---
toc_hidden_folder: true
toc_priority: 3
toc_title: Integration
toc_hidden_folder: true
---
# Question about Integrating ClickHouse and Other Systems
# Question About Integrating ClickHouse and Other Systems {#question-about-integrating-clickhouse-and-other-systems}
Questions:
- [How do I export data from ClickHouse to a file?](file-export.md)
- [What if I Have a problem with encodings when connecting to Oracle via ODBC?](oracle-odbc.md)
- [How do I export data from ClickHouse to a file?](../../faq/integration/file-export.md)
- [What if I Have a problem with encodings when connecting to Oracle via ODBC?](../../faq/integration/oracle-odbc.md)
!!! info "Not see what you were locking for?"
Check out [other F.A.Q. categories](../index.md) or browse around main documentation articles found in the left sidebar.
Check out [other F.A.Q. categories](../../faq/index.md) or browse around main documentation articles found in the left sidebar.
{## [Original article](https://clickhouse.tech/docs/en/faq/integration/) ##}

View File

@ -1,9 +1,9 @@
---
toc_priority: 20
toc_hidden: true
toc_priority: 20
---
# What If I Have a Problem with Encodings When Using Oracle via ODBC? {#oracle-odbc-encodings}
# What If I Have a Problem with Encodings When Using Oracle Via ODBC? {#oracle-odbc-encodings}
If you use Oracle as a source of ClickHouse external dictionaries via Oracle ODBC driver, you need to set the correct value for the `NLS_LANG` environment variable in `/etc/default/clickhouse`. For more information, see the [Oracle NLS\_LANG FAQ](https://www.oracle.com/technetwork/products/globalization/nls-lang-099431.html).

View File

@ -1,16 +1,16 @@
---
toc_hidden_folder: true
toc_priority: 2
toc_title: Operations
toc_hidden_folder: true
---
# Question about Operating ClickHouse Servers and Clusters
# Question About Operating ClickHouse Servers and Clusters {#question-about-operating-clickhouse-servers-and-clusters}
Questions:
- [Which ClickHouse version to use in production?](production.md)
- [Which ClickHouse version to use in production?](../../faq/operations/production.md)
!!! info "Not see what you were locking for?"
Check out [other F.A.Q. categories](../index.md) or browse around main documentation articles found in the left sidebar.
Check out [other F.A.Q. categories](../../faq/index.md) or browse around main documentation articles found in the left sidebar.
{## [Original article](https://clickhouse.tech/docs/en/faq/production/) ##}

View File

@ -1,38 +1,38 @@
---
toc_priority: 10
toc_hidden: true
toc_priority: 10
---
# Which ClickHouse Version to Use in Production?
# Which ClickHouse Version to Use in Production? {#which-clickhouse-version-to-use-in-production}
First of all, let's discuss why people ask this question in the first place. There are two key reasons:
First of all, lets discuss why people ask this question in the first place. There are two key reasons:
1. ClickHouse is developed with pretty high velocity and usually, there are 10+ stable releases per year. It makes a wide range of releases to choose from, which is not so trivial choice.
2. Some users want to avoid spending time figuring out which version works best for their use case and just follow someone else's advice.
2. Some users want to avoid spending time figuring out which version works best for their use case and just follow someone elses advice.
The second reason is more fundamental, so we'll start with it and then get back to navigating through various ClickHouse releases.
The second reason is more fundamental, so well start with it and then get back to navigating through various ClickHouse releases.
## Which ClickHouse Version Do You Recommend?
## Which ClickHouse Version Do You Recommend? {#which-clickhouse-version-do-you-recommend}
It's tempting to hire consultants or trust some known experts to get rid of responsibility for your production environment. You install some specific ClickHouse version that someone else recommended, now if there's some issue with it - it's not your fault, it's someone else's. This line of reasoning is a big trap. No external person knows better what's going on in your company's production environment.
Its tempting to hire consultants or trust some known experts to get rid of responsibility for your production environment. You install some specific ClickHouse version that someone else recommended, now if theres some issue with it - its not your fault, its someone elses. This line of reasoning is a big trap. No external person knows better whats going on in your companys production environment.
So how to properly choose which ClickHouse version to upgrade to? Or how to choose your first ClickHouse version? First of all, you need to invest in setting up a **realistic pre-production environment**. In an ideal world, it could be a completely identical shadow copy, but that's usually expensive.
So how to properly choose which ClickHouse version to upgrade to? Or how to choose your first ClickHouse version? First of all, you need to invest in setting up a **realistic pre-production environment**. In an ideal world, it could be a completely identical shadow copy, but thats usually expensive.
Here're some key points to get reasonable fidelity in a pre-production environment with not so high costs:
Herere some key points to get reasonable fidelity in a pre-production environment with not so high costs:
- Pre-production environment needs to run an as close set of queries as you intend to run in production:
- Don't make it read-only with some frozen data.
- Don't make it write-only with just copying data without building some typical reports.
- Don't wipe it clean instead of applying schema migrations.
- Use a sample of real production data and queries. Try to choose a sample that's still representative and makes `SELECT` queries return reasonable results. Use obfuscation if your data is sensitive and internal policies don't allow it to leave the production environment.
- Dont make it read-only with some frozen data.
- Dont make it write-only with just copying data without building some typical reports.
- Dont wipe it clean instead of applying schema migrations.
- Use a sample of real production data and queries. Try to choose a sample thats still representative and makes `SELECT` queries return reasonable results. Use obfuscation if your data is sensitive and internal policies dont allow it to leave the production environment.
- Make sure that pre-production is covered by your monitoring and alerting software the same way as your production environment does.
- If your production spans across multiple datacenters or regions, make your pre-production does the same.
- If your production uses complex features like replication, distributed table, cascading materialize views, make sure they are configured similarly in pre-production.
- There's a trade-off on using the roughly same number of servers or VMs in pre-production as in production, but of smaller size, or much less of them, but of the same size. The first option might catch extra network-related issues, while the latter is easier to manage.
- Theres a trade-off on using the roughly same number of servers or VMs in pre-production as in production, but of smaller size, or much less of them, but of the same size. The first option might catch extra network-related issues, while the latter is easier to manage.
The second area to invest in is **automated testing infrastructure**. Don't assume that if some kind of query has executed successfully once, it'll continue to do so forever. It's ok to have some unit tests where ClickHouse is mocked but make sure your product has a reasonable set of automated tests that are run against real ClickHouse and check that all important use cases are still working as expected.
The second area to invest in is **automated testing infrastructure**. Dont assume that if some kind of query has executed successfully once, itll continue to do so forever. Its ok to have some unit tests where ClickHouse is mocked but make sure your product has a reasonable set of automated tests that are run against real ClickHouse and check that all important use cases are still working as expected.
Extra step forward could be contributing those automated tests to [ClickHouse's open-source test infrastructure](https://github.com/ClickHouse/ClickHouse/tree/master/tests) that's continuously used in its day-to-day development. It definitely will take some additional time and effort to learn [how to run it](../../development/tests.md) and then how to adapt your tests to this framework, but it'll pay off by ensuring that ClickHouse releases are already tested against them when they are announced stable, instead of repeatedly losing time on reporting the issue after the fact and then waiting for a bugfix to be implemented, backported and released. Some companies even have such test contributions to infrastructure by its use as an internal policy, most notably it's called [Beyonce's Rule](https://www.oreilly.com/library/view/software-engineering-at/9781492082781/ch01.html#policies_that_scale_well) at Google.
Extra step forward could be contributing those automated tests to [ClickHouses open-source test infrastructure](https://github.com/ClickHouse/ClickHouse/tree/master/tests) thats continuously used in its day-to-day development. It definitely will take some additional time and effort to learn [how to run it](../../development/tests.md) and then how to adapt your tests to this framework, but itll pay off by ensuring that ClickHouse releases are already tested against them when they are announced stable, instead of repeatedly losing time on reporting the issue after the fact and then waiting for a bugfix to be implemented, backported and released. Some companies even have such test contributions to infrastructure by its use as an internal policy, most notably its called [Beyonces Rule](https://www.oreilly.com/library/view/software-engineering-at/9781492082781/ch01.html#policies_that_scale_well) at Google.
When you have your pre-production environment and testing infrastructure in place, choosing the best version is straightforward:
@ -41,11 +41,11 @@ When you have your pre-production environment and testing infrastructure in plac
3. Report any issues you discovered to [ClickHouse GitHub Issues](https://github.com/ClickHouse/ClickHouse/issues).
4. If there were no major issues, it should be safe to start deploying ClickHouse release to your production environment. Investing in gradual release automation that implements an approach similar to [canary releases](https://martinfowler.com/bliki/CanaryRelease.html) or [green-blue deployments](https://martinfowler.com/bliki/BlueGreenDeployment.html) might further reduce the risk of issues in production.
As you might have noticed, there's nothing specific to ClickHouse in the approach described above, people do that for any piece of infrastructure they rely on if they take their production environment seriously.
As you might have noticed, theres nothing specific to ClickHouse in the approach described above, people do that for any piece of infrastructure they rely on if they take their production environment seriously.
## How to Choose Between ClickHouse Releases?
## How to Choose Between ClickHouse Releases? {#how-to-choose-between-clickhouse-releases}
If you look into contents of ClickHouse package repository, you'll see four kinds of packages:
If you look into contents of ClickHouse package repository, youll see four kinds of packages:
1. `testing`
2. `prestable`
@ -60,10 +60,10 @@ For production use, there are two key options: `stable` and `lts`. Here is some
- `stable` is the kind of package we recommend by default. They are released roughly monthly (and thus provide new features with reasonable delay) and three latest stable releases are supported in terms of diagnostics and backporting of bugfixes.
- `lts` are released twice a year and are supported for a year after their initial release. You might prefer them over `stable` in the following cases:
- Your company has some internal policies that don't allow for frequent upgrades or using non-LTS software.
- You are using ClickHouse in some secondary products that either doesn't require any complex ClickHouse features and don't have enough resources to keep it updated.
- Your company has some internal policies that dont allow for frequent upgrades or using non-LTS software.
- You are using ClickHouse in some secondary products that either doesnt require any complex ClickHouse features and dont have enough resources to keep it updated.
Many teams who initially thought that `lts` is the way to go, often switch to `stable` anyway because of some recent feature that's important for their product.
Many teams who initially thought that `lts` is the way to go, often switch to `stable` anyway because of some recent feature thats important for their product.
!!! warning "Important"
One more thing to keep in mind when upgrading ClickHouse: we're always keeping eye on compatibility across releases, but sometimes it's not reasonable to keep and some minor details might change. So make sure you check the [changelog](../../whats-new/changelog/index.md) before upgrading to see if there are any notes about backward-incompatible changes.
One more thing to keep in mind when upgrading ClickHouse: were always keeping eye on compatibility across releases, but sometimes its not reasonable to keep and some minor details might change. So make sure you check the [changelog](../../whats-new/changelog/index.md) before upgrading to see if there are any notes about backward-incompatible changes.