Merge pull request #37790 from vdimir/doc-aspell

Spellcheck for the docs
Mikhail f. Shiryaev 2022-06-09 12:52:31 +02:00 committed by GitHub
commit 203de0c352
29 changed files with 616 additions and 78 deletions

View File

@ -8,6 +8,7 @@ ARG apt_archive="http://archive.ubuntu.com"
RUN sed -i "s|http://archive.ubuntu.com|$apt_archive|g" /etc/apt/sources.list
RUN apt-get update && env DEBIAN_FRONTEND=noninteractive apt-get install --yes \
aspell \
curl \
git \
libxml2-utils \

View File

@ -18,6 +18,7 @@ def process_result(result_folder):
("typos", "typos_output.txt"),
("whitespaces", "whitespaces_output.txt"),
("workflows", "workflows_output.txt"),
("doc typos", "doc_spell_output.txt"),
)
for name, out_file in checks:

View File

@ -11,6 +11,8 @@ echo "Check python formatting with black" | ts
./check-black -n |& tee /test_output/black_output.txt
echo "Check typos" | ts
./check-typos |& tee /test_output/typos_output.txt
echo "Check docs spelling" | ts
./check-doc-aspell |& tee /test_output/doc_spell_output.txt
echo "Check whitespaces" | ts
./check-whitespaces -n |& tee /test_output/whitespaces_output.txt
echo "Check workflows" | ts

View File

@ -138,7 +138,7 @@ It's important to name tests correctly, so one could turn some tests subset off
| Tester flag| What should be in test name | When flag should be added |
|---|---|---|
| `--[no-]zookeeper`| "zookeeper" or "replica" | Test uses tables from ReplicatedMergeTree family |
| `--[no-]zookeeper`| "zookeeper" or "replica" | Test uses tables from `ReplicatedMergeTree` family |
| `--[no-]shard` | "shard" or "distributed" or "global"| Test using connections to 127.0.0.2 or similar |
| `--[no-]long` | "long" or "deadlock" or "race" | Test runs longer than 60 seconds |
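Assuming the standard runner script at `tests/clickhouse-test` (an assumption based on the repository layout), a sketch of skipping those subsets locally:

```bash
# Skip tests that need ZooKeeper/replication, multiple shards, or long runtimes
tests/clickhouse-test --no-zookeeper --no-shard --no-long
```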

View File

@ -5,7 +5,7 @@ sidebar_position: 62
# Overview of ClickHouse Architecture
ClickHouse is a true column-oriented DBMS. Data is stored by columns and, during query execution, processed by arrays (vectors or chunks of columns).
Whenever possible, operations are dispatched on arrays, rather than on individual values. It is called “vectorized query execution” and it helps lower the cost of actual data processing.
> This idea is nothing new. It dates back to the `APL` (A programming language, 1957) and its descendants: `A +` (APL dialect), `J` (1990), `K` (1993), and `Q` (programming language from Kx Systems, 2003). Array programming is used in scientific data processing. Neither is this idea something new in relational databases: for example, it is used in the `VectorWise` system (also known as Actian Vector Analytic Database by Actian Corporation).
@ -149,13 +149,13 @@ The server implements several different interfaces:
- A TCP interface for the native ClickHouse client and for cross-server communication during distributed query execution.
- An interface for transferring data for replication.
Internally, it is just a primitive multithreaded server without coroutines or fibers. Since the server is not designed to process a high rate of simple queries but to process a relatively low rate of complex queries, each of them can process a vast amount of data for analytics.
Internally, it is just a primitive multithread server without coroutines or fibers. Since the server is not designed to process a high rate of simple queries but to process a relatively low rate of complex queries, each of them can process a vast amount of data for analytics.
The server initializes the `Context` class with the necessary environment for query execution: the list of available databases, users and access rights, settings, clusters, the process list, the query log, and so on. Interpreters use this environment.
We maintain full backward and forward compatibility for the server TCP protocol: old clients can talk to new servers, and new clients can talk to old servers. But we do not want to maintain it eternally, and we are removing support for old versions after about one year.
:::note
For most external applications, we recommend using the HTTP interface because it is simple and easy to use. The TCP protocol is more tightly linked to internal data structures: it uses an internal format for passing blocks of data, and it uses custom framing for compressed data. We haven't released a C library for that protocol because it requires linking most of the ClickHouse codebase, which is not practical.
:::
@ -178,7 +178,7 @@ To execute queries and do side activities ClickHouse allocates threads from one
Server pool is a `Poco::ThreadPool` class instance defined in `Server::main()` method. It can have at most `max_connection` threads. Every thread is dedicated to a single active connection.
Global thread pool is `GlobalThreadPool` singleton class. To allocate thread from it `ThreadFromGlobalPool` is used. It has an interface similar to `std::thread`, but pulls thread from the global pool and does all necessary initializations. It is configured with the following settings:
Global thread pool is `GlobalThreadPool` singleton class. To allocate thread from it `ThreadFromGlobalPool` is used. It has an interface similar to `std::thread`, but pulls thread from the global pool and does all necessary initialization. It is configured with the following settings:
* `max_thread_pool_size` - limit on thread count in pool.
* `max_thread_pool_free_size` - limit on idle thread count waiting for new jobs.
* `thread_pool_queue_size` - limit on scheduled job count.
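A rough sketch of raising those limits via a `config.d` override (file name and values are illustrative, not defaults):

```bash
cat > /etc/clickhouse-server/config.d/thread_pool.xml <<'EOF'
<clickhouse>
    <max_thread_pool_size>12000</max_thread_pool_size>
    <max_thread_pool_free_size>1000</max_thread_pool_free_size>
    <thread_pool_queue_size>12000</thread_pool_queue_size>
</clickhouse>
EOF
```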
@ -189,7 +189,7 @@ IO thread pool is implemented as a plain `ThreadPool` accessible via `IOThreadPo
For periodic task execution there is `BackgroundSchedulePool` class. You can register tasks using `BackgroundSchedulePool::TaskHolder` objects and the pool ensures that no task runs two jobs at the same time. It also allows you to postpone task execution to a specific instant in the future or temporarily deactivate task. Global `Context` provides a few instances of this class for different purposes. For general purpose tasks `Context::getSchedulePool()` is used.
There are also specialized thread pools for preemptable tasks. Such `IExecutableTask` task can be split into ordered sequence of jobs, called steps. To schedule these tasks in a manner allowing short tasks to be prioritied over long ones `MergeTreeBackgroundExecutor` is used. As name suggests it is used for background MergeTree related operations such as merges, mutations, fetches and moves. Pool instances are available using `Context::getCommonExecutor()` and other similar methods.
There are also specialized thread pools for preemptable tasks. Such `IExecutableTask` task can be split into ordered sequence of jobs, called steps. To schedule these tasks in a manner allowing short tasks to be prioritized over long ones `MergeTreeBackgroundExecutor` is used. As name suggests it is used for background MergeTree related operations such as merges, mutations, fetches and moves. Pool instances are available using `Context::getCommonExecutor()` and other similar methods.
No matter what pool is used for a job, at start `ThreadStatus` instance is created for this job. It encapsulates all per-thread information: thread id, query id, performance counters, resource consumption and many other useful data. Job can access it via thread local pointer by `CurrentThread::get()` call, so we do not need to pass it to every function.
@ -201,7 +201,7 @@ Servers in a cluster setup are mostly independent. You can create a `Distributed
Things become more complicated when you have subqueries in IN or JOIN clauses, and each of them uses a `Distributed` table. We have different strategies for the execution of these queries.
There is no global query plan for distributed query execution. Each node has its local query plan for its part of the job. We only have simple one-pass distributed query execution: we send queries for remote nodes and then merge the results. But this is not feasible for complicated queries with high cardinality GROUP BYs or with a large amount of temporary data for JOIN. In such cases, we need to “reshuffle” data between servers, which requires additional coordination. ClickHouse does not support that kind of query execution, and we need to work on it.
There is no global query plan for distributed query execution. Each node has its local query plan for its part of the job. We only have simple one-pass distributed query execution: we send queries for remote nodes and then merge the results. But this is not feasible for complicated queries with high cardinality `GROUP BY`s or with a large amount of temporary data for JOIN. In such cases, we need to “reshuffle” data between servers, which requires additional coordination. ClickHouse does not support that kind of query execution, and we need to work on it.
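For the `IN` case, the `GLOBAL` keyword illustrates one of those strategies: the subquery is evaluated once on the initiator and its result is shipped to every shard instead of being re-executed remotely. A sketch, with hypothetical `Distributed` tables:

```bash
clickhouse-client --query "
    SELECT count()
    FROM distributed_hits
    WHERE user_id GLOBAL IN (SELECT user_id FROM distributed_users WHERE age < 18)"
```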
## Merge Tree {#merge-tree}
@ -231,7 +231,7 @@ Replication is physical: only compressed parts are transferred between nodes, no
Besides, each replica stores its state in ZooKeeper as the set of parts and its checksums. When the state on the local filesystem diverges from the reference state in ZooKeeper, the replica restores its consistency by downloading missing and broken parts from other replicas. When there is some unexpected or broken data in the local filesystem, ClickHouse does not remove it, but moves it to a separate directory and forgets it.
:::note
The ClickHouse cluster consists of independent shards, and each shard consists of replicas. The cluster is **not elastic**, so after adding a new shard, data is not rebalanced between shards automatically. Instead, the cluster load is supposed to be adjusted to be uneven. This implementation gives you more control, and it is ok for relatively small clusters, such as tens of nodes. But for clusters with hundreds of nodes that we are using in production, this approach becomes a significant drawback. We should implement a table engine that spans across the cluster with dynamically replicated regions that could be split and balanced between clusters automatically.
:::

View File

@ -4,7 +4,7 @@ sidebar_label: Build on Mac OS X
description: How to build ClickHouse on Mac OS X
---
# How to Build ClickHouse on Mac OS X
:::info You don't have to build ClickHouse yourself!
You can install pre-built ClickHouse as described in [Quick Start](https://clickhouse.com/#quick-start). Follow **macOS (Intel)** or **macOS (Apple silicon)** installation instructions.
@ -20,9 +20,9 @@ It is also possible to compile with Apple's XCode `apple-clang` or Homebrew's `g
First install [Homebrew](https://brew.sh/)
## For Apple's Clang (discouraged): Install Xcode and Command Line Tools {#install-xcode-and-command-line-tools}
## For Apple's Clang (discouraged): Install XCode and Command Line Tools {#install-xcode-and-command-line-tools}
Install the latest [Xcode](https://apps.apple.com/am/app/xcode/id497799835?mt=12) from App Store.
Install the latest [XCode](https://apps.apple.com/am/app/xcode/id497799835?mt=12) from App Store.
Open it at least once to accept the end-user license agreement and automatically install the required components.
@ -62,7 +62,7 @@ cmake --build build
# The resulting binary will be created at: build/programs/clickhouse
```
To build using Xcode's native AppleClang compiler in Xcode IDE (this option is only for development builds and workflows, and is **not recommended** unless you know what you are doing):
To build using XCode native AppleClang compiler in XCode IDE (this option is only for development builds and workflows, and is **not recommended** unless you know what you are doing):
``` bash
cd ClickHouse
@ -71,7 +71,7 @@ mkdir build
cd build
XCODE_IDE=1 ALLOW_APPLECLANG=1 cmake -G Xcode -DCMAKE_BUILD_TYPE=Debug -DENABLE_JEMALLOC=OFF ..
cmake --open .
# ...then, in Xcode IDE select ALL_BUILD scheme and start the building process.
# ...then, in XCode IDE select ALL_BUILD scheme and start the building process.
# The resulting binary will be created at: ./programs/Debug/clickhouse
```
@ -91,9 +91,9 @@ cmake --build build
## Caveats {#caveats}
If you intend to run `clickhouse-server`, make sure to increase the system's maxfiles variable.
If you intend to run `clickhouse-server`, make sure to increase the system's `maxfiles` variable.
:::note
You'll need to use sudo.
:::

View File

@ -130,7 +130,7 @@ Here is an example of how to install the new `cmake` from the official website:
```
wget https://github.com/Kitware/CMake/releases/download/v3.22.2/cmake-3.22.2-linux-x86_64.sh
chmod +x cmake-3.22.2-linux-x86_64.sh
./cmake-3.22.2-linux-x86_64.sh
export PATH=/home/milovidov/work/cmake-3.22.2-linux-x86_64/bin/:${PATH}
hash cmake
```
@ -163,7 +163,7 @@ ClickHouse is available in pre-built binaries and packages. Binaries are portabl
They are built for stable, prestable and testing releases as well as for every commit to master and for every pull request.
To find the freshest build from `master`, go to [commits page](https://github.com/ClickHouse/ClickHouse/commits/master), click on the first green checkmark or red cross near commit, and click to the “Details” link right after “ClickHouse Build Check”.
To find the freshest build from `master`, go to [commits page](https://github.com/ClickHouse/ClickHouse/commits/master), click on the first green check mark or red cross near commit, and click to the “Details” link right after “ClickHouse Build Check”.
## Faster builds for development: Split build configuration {#split-build}

View File

@ -19,7 +19,7 @@ cmake .. \
## CMake files types
1. ClickHouse's source CMake files (located in the root directory and in /src).
1. ClickHouse source CMake files (located in the root directory and in /src).
2. Arch-dependent CMake files (located in /cmake/*os_name*).
3. Libraries finders (search for contrib libraries, located in /contrib/*/CMakeLists.txt).
4. Contrib build CMake files (used instead of libraries' own CMake files, located in /cmake/modules)
@ -456,7 +456,7 @@ option(ENABLE_TESTS "Provide unit_test_dbms target with Google.test unit tests"
#### If the option's state could produce an unwanted (or unusual) result, explicitly warn the user.
Suppose you have an option that may strip debug symbols from the ClickHouse's part.
Suppose you have an option that may strip debug symbols from the ClickHouse part.
This can speed up the linking process, but produces a binary that cannot be debugged.
In that case, prefer explicitly raising a warning telling the developer that he may be doing something wrong.
Also, such options should be disabled where applicable.

View File

@ -31,7 +31,7 @@ If you are not sure what to do, ask a maintainer for help.
## Merge With Master
Verifies that the PR can be merged to master. If not, it will fail with the
message 'Cannot fetch mergecommit'. To fix this check, resolve the conflict as
message `Cannot fetch mergecommit`. To fix this check, resolve the conflict as
described in the [GitHub
documentation](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/resolving-a-merge-conflict-on-github),
or merge the `master` branch to your pull request branch using git.
@ -57,7 +57,7 @@ You have to specify a changelog category for your change (e.g., Bug Fix), and
write a user-readable message describing the change for [CHANGELOG.md](../whats-new/changelog/)
## Push To Dockerhub
## Push To DockerHub
Builds docker images used for build and tests, then pushes them to DockerHub.
@ -118,7 +118,7 @@ Builds ClickHouse in various configurations for use in further steps. You have t
- **Compiler**: `gcc-9` or `clang-10` (or `clang-10-xx` for other architectures e.g. `clang-10-freebsd`).
- **Build type**: `Debug` or `RelWithDebInfo` (cmake).
- **Sanitizer**: `none` (without sanitizers), `address` (ASan), `memory` (MSan), `undefined` (UBSan), or `thread` (TSan).
- **Splitted** `splitted` is a [split build](../development/build.md#split-build)
- **Split** `splitted` is a [split build](../development/build.md#split-build)
- **Status**: `success` or `fail`
- **Build log**: link to the building and files copying log, useful when build failed.
- **Build time**.

View File

@ -96,9 +96,9 @@ SELECT library_name, license_type, license_path FROM system.licenses ORDER BY li
## Adding new third-party libraries and maintaining patches in third-party libraries {#adding-third-party-libraries}
1. Each third-party libary must reside in a dedicated directory under the `contrib/` directory of the ClickHouse repository. Avoid dumps/copies of external code, instead use Git's submodule feature to pull third-party code from an external upstream repository.
2. Submodules are listed in `.gitmodule`. If the external library can be used as-is, you may reference the upstream repository directly. Otherwise, i.e. the external libary requires patching/customization, create a fork of the official repository in the [Clickhouse organization in GitHub](https://github.com/ClickHouse).
1. Each third-party library must reside in a dedicated directory under the `contrib/` directory of the ClickHouse repository. Avoid dumps/copies of external code, instead use Git submodule feature to pull third-party code from an external upstream repository.
2. Submodules are listed in `.gitmodule`. If the external library can be used as-is, you may reference the upstream repository directly. Otherwise, i.e. if the external library requires patching/customization, create a fork of the official repository in the [ClickHouse organization on GitHub](https://github.com/ClickHouse).
3. In the latter case, create a branch with `clickhouse/` prefix from the branch you want to integrate, e.g. `clickhouse/master` (for `master`) or `clickhouse/release/vX.Y.Z` (for a `release/vX.Y.Z` tag). The purpose of this branch is to isolate customization of the library from upstream work. For example, pulls from the upstream repository into the fork will leave all `clickhouse/` branches unaffected. Submodules in `contrib/` must only track `clickhouse/` branches of forked third-party repositories.
4. To patch a fork of a third-party library, create a dedicated branch with `clickhouse/` prefix in the fork, e.g. `clickhouse/fix-some-disaster`. Finally, merge the patch branch into the custom tracking branch (e.g. `clickhouse/master` or `clickhouse/release/vX.Y.Z`) using a PR.
5. Always create patches of third-party libraries with the official repository in mind. Once a PR of a patch branch to the `clickhouse/` branch in the fork repository is done and the submodule version in ClickHouse's official repository is bumped, consider opening another PR from the patch branch to the upstream library repository. This ensures, that 1) the contribution has more than a single use case and importance, 2) others will also benefit from it, 3) the change will not remain a maintenance burden solely on ClickHouse developers.
5. Always create patches of third-party libraries with the official repository in mind. Once a PR of a patch branch to the `clickhouse/` branch in the fork repository is done and the submodule version in ClickHouse official repository is bumped, consider opening another PR from the patch branch to the upstream library repository. This ensures, that 1) the contribution has more than a single use case and importance, 2) others will also benefit from it, 3) the change will not remain a maintenance burden solely on ClickHouse developers.
9. To update a submodule with changes in the upstream repository, first merge upstream `master` (or a new `versionX.Y.Z` tag) into the `clickhouse`-tracking branch in the fork repository. Conflicts with patches/customization will need to be resolved in this merge (see Step 4.). Once the merge is done, bump the submodule in ClickHouse to point to the new hash in the fork.

View File

@ -70,7 +70,7 @@ You can also clone the repository via https protocol:
This, however, will not let you send your changes to the server. You can still use it temporarily and add the SSH keys later replacing the remote address of the repository with `git remote` command.
You can also add original ClickHouse repos address to your local repository to pull updates from there:
You can also add original ClickHouse repo address to your local repository to pull updates from there:
git remote add upstream git@github.com:ClickHouse/ClickHouse.git
@ -177,7 +177,7 @@ If you require to build all the binaries (utilities and tests), you should run n
Full build requires about 30GB of free disk space or 15GB to build the main binaries.
When a large amount of RAM is available on build machine you should limit the number of build tasks run in parallel with `-j` param:
When a large amount of RAM is available on build machine you should limit the number of build tasks run in parallel with `-j` parameter:
ninja -j 1 clickhouse-server clickhouse-client
@ -269,7 +269,7 @@ Developing ClickHouse often requires loading realistic datasets. It is particula
Navigate to your fork repository in GitHub's UI. If you have been developing in a branch, you need to select that branch. There will be a “Pull request” button located on the screen. In essence, this means “create a request for accepting my changes into the main repository”.
A pull request can be created even if the work is not completed yet. In this case please put the word “WIP” (work in progress) at the beginning of the title, it can be changed later. This is useful for cooperative reviewing and discussion of changes as well as for running all of the available tests. It is important that you provide a brief description of your changes, it will later be used for generating release changelogs.
A pull request can be created even if the work is not completed yet. In this case please put the word “WIP” (work in progress) at the beginning of the title, it can be changed later. This is useful for cooperative reviewing and discussion of changes as well as for running all of the available tests. It is important that you provide a brief description of your changes, it will later be used for generating release changelog.
Testing will commence as soon as ClickHouse employees label your PR with a tag “can be tested”. The results of some first checks (e.g. code style) will come in within several minutes. Build check results will arrive within half an hour. And the main set of tests will report itself within an hour.

View File

@ -2,7 +2,7 @@
Rust library integration will be described based on BLAKE3 hash-function integration.
The first step is forking a library and making neccessary changes for Rust and C/C++ compatibility.
The first step is forking a library and making necessary changes for Rust and C/C++ compatibility.
After forking the library repository you need to change target settings in the Cargo.toml file. Firstly, you need to switch the build to a static library. Secondly, you need to add the cbindgen crate to the crate list. We will use it later to generate the C header automatically.
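A sketch of those two `Cargo.toml` edits (section placement and version number are assumptions):

```bash
cat >> Cargo.toml <<'EOF'
[lib]
crate-type = ["staticlib"]

[build-dependencies]
cbindgen = "0.24"
EOF
```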
@ -51,9 +51,9 @@ pub unsafe extern "C" fn blake3_apply_shim(
}
```
This method gets C-compatible string, its size and output string pointer as input. Then, it converts C-compatible inputs into types that are used by actual library methods and calls them. After that, it should convert library methods' outputs back into C-compatible type. In that particular case library supported direct writing into pointer by method fill(), so the convertion was not needed. The main advice here is to create less methods, so you will need to do less convertions on each method call and won't create much overhead.
This method gets C-compatible string, its size and output string pointer as input. Then, it converts C-compatible inputs into types that are used by actual library methods and calls them. After that, it should convert library methods' outputs back into C-compatible type. In that particular case library supported direct writing into pointer by method fill(), so the conversion was not needed. The main advice here is to create fewer methods, so you will need to do fewer conversions on each method call and won't create much overhead.
Also, you should use attribute #[no_mangle] and extern "C" for every C-compatible attribute. Without it library can compile incorrectly and cbindgen won't launch header autogeneration.
Also, you should use attribute #[no_mangle] and `extern "C"` for every C-compatible attribute. Without it library can compile incorrectly and cbindgen won't launch header autogeneration.
After all these steps you can test your library in a small project to find all problems with compatibility or header generation. If any problems occur during header generation, you can try to configure it with cbindgen.toml file (you can find an example of it in BLAKE3 directory or a template here: [https://github.com/eqrion/cbindgen/blob/master/template.toml](https://github.com/eqrion/cbindgen/blob/master/template.toml)). If everything works correctly, you can finally integrate its methods into ClickHouse.
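If you generate the header from the command line rather than from a build script, the cbindgen CLI covers it; a sketch with an assumed crate name and output path:

```bash
cargo install cbindgen
cbindgen --config cbindgen.toml --crate blake3 --output include/blake3.h
```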

View File

@ -4,7 +4,7 @@ sidebar_label: C++ Guide
description: A list of recommendations regarding coding style, naming convention, formatting and more
---
# How to Write C++ Code
## General Recommendations {#general-recommendations}
@ -196,7 +196,7 @@ std::cerr << static_cast<int>(c) << std::endl;
The same is true for small methods in any classes or structs.
For templated classes and structs, do not separate the method declarations from the implementation (because otherwise they must be defined in the same translation unit).
For template classes and structs, do not separate the method declarations from the implementation (because otherwise they must be defined in the same translation unit).
**31.** You can wrap lines at 140 characters, instead of 80.
@ -285,7 +285,7 @@ Note: You can use Doxygen to generate documentation from these comments. But Dox
/// WHAT THE FAIL???
```
**14.** Do not use comments to make delimeters.
**14.** Do not use comments to make delimiters.
``` cpp
///******************************************************
@ -491,7 +491,7 @@ if (0 != close(fd))
throwFromErrno("Cannot close file " + file_name, ErrorCodes::CANNOT_CLOSE_FILE);
```
You can use assert to check invariants in code.
You can use assert to check invariant in code.
**4.** Exception types.
@ -552,9 +552,9 @@ Do not try to implement lock-free data structures unless it is your primary area
In most cases, prefer references.
**10.** const.
**10.** `const`.
Use constant references, pointers to constants, `const_iterator`, and const methods.
Use constant references, pointers to constants, `const_iterator`, and `const` methods.
Consider `const` to be default and use non-`const` only when necessary.
@ -596,7 +596,7 @@ public:
AggregateFunctionPtr get(const String & name, const DataTypes & argument_types) const;
```
**15.** namespace.
**15.** `namespace`.
There is no need to use a separate `namespace` for application code.
@ -606,7 +606,7 @@ For medium to large libraries, put everything in a `namespace`.
In the library's `.h` file, you can use `namespace detail` to hide implementation details not needed for the application code.
In a `.cpp` file, you can use a `static` or anonymous namespace to hide symbols.
In a `.cpp` file, you can use a `static` or anonymous `namespace` to hide symbols.
Also, a `namespace` can be used for an `enum` to prevent the corresponding names from falling into an external `namespace` (but it's better to use an `enum class`).

View File

@ -4,7 +4,7 @@ sidebar_label: Testing
description: Most of ClickHouse features can be tested with functional tests and they are mandatory to use for every change in ClickHouse code that can be tested that way.
---
# ClickHouse Testing
## Functional Tests
@ -85,7 +85,7 @@ Performance tests allow to measure and compare performance of some isolated part
Each test runs one or multiple queries (possibly with combinations of parameters) in a loop.
If you want to improve performance of ClickHouse in some scenario, and if improvements can be observed on simple queries, it is highly recommended to write a performance test. It always makes sense to use `perf top` or other perf tools during your tests.
If you want to improve performance of ClickHouse in some scenario, and if improvements can be observed on simple queries, it is highly recommended to write a performance test. It always makes sense to use `perf top` or other `perf` tools during your tests.
## Test Tools and Scripts {#test-tools-and-scripts}
@ -228,7 +228,7 @@ Our Security Team did some basic overview of ClickHouse capabilities from the se
We run `clang-tidy` on a per-commit basis. `clang-static-analyzer` checks are also enabled. `clang-tidy` is also used for some style checks.
We have evaluated `clang-tidy`, `Coverity`, `cppcheck`, `PVS-Studio`, `tscancode`, `CodeQL`. You will find instructions for usage in `tests/instructions/` directory.
If you use `CLion` as an IDE, you can leverage some `clang-tidy` checks out of the box.
@ -244,7 +244,7 @@ In debug build we also involve a customization of libc that ensures that no "har
Debug assertions are used extensively.
In debug build, if exception with "logical error" code (implies a bug) is being thrown, the program is terminated prematurally. It allows to use exceptions in release build but make it an assertion in debug build.
In debug build, if exception with "logical error" code (implies a bug) is being thrown, the program is terminated prematurely. It allows to use exceptions in release build but make it an assertion in debug build.
Debug version of jemalloc is used for debug builds.
Debug version of libc++ is used for debug builds.
@ -253,7 +253,7 @@ Debug version of libc++ is used for debug builds.
Data stored on disk is checksummed. Data in MergeTree tables is checksummed in three ways simultaneously* (compressed data blocks, uncompressed data blocks, the total checksum across blocks). Data transferred over network between client and server or between servers is also checksummed. Replication ensures bit-identical data on replicas.
It is required to protect from faulty hardware (bit rot on storage media, bit flips in RAM on server, bit flips in RAM of network controller, bit flips in RAM of network switch, bit flips in RAM of client, bit flips on the wire). Note that bit flips are common and likely to occur even for ECC RAM and in presense of TCP checksums (if you manage to run thousands of servers processing petabytes of data each day). [See the video (russian)](https://www.youtube.com/watch?v=ooBAQIe0KlQ).
It is required to protect from faulty hardware (bit rot on storage media, bit flips in RAM on server, bit flips in RAM of network controller, bit flips in RAM of network switch, bit flips in RAM of client, bit flips on the wire). Note that bit flips are common and likely to occur even for ECC RAM and in presence of TCP checksums (if you manage to run thousands of servers processing petabytes of data each day). [See the video (russian)](https://www.youtube.com/watch?v=ooBAQIe0KlQ).
ClickHouse provides diagnostics that will help ops engineers to find faulty hardware.

View File

@ -12,7 +12,7 @@ The table engine (type of table) determines:
- Which queries are supported, and how.
- Concurrent data access.
- Use of indexes, if present.
- Whether multithreaded request execution is possible.
- Whether multithread request execution is possible.
- Data replication parameters.
## Engine Families {#engine-families}

View File

@ -190,8 +190,7 @@ sudo ./clickhouse install
### From Precompiled Binaries for Non-Standard Environments {#from-binaries-non-linux}
For non-Linux operating systems and for AArch64 CPU arhitecture, ClickHouse builds are provided as a cross-compiled binary from the latest commit of the `master` branch (with a few hours delay).
For non-Linux operating systems and for AArch64 CPU architecture, ClickHouse builds are provided as a cross-compiled binary from the latest commit of the `master` branch (with a few hours delay).
- [MacOS x86_64](https://builds.clickhouse.com/master/macos/clickhouse)
```bash

View File

@ -119,7 +119,7 @@ Dates with times are written in the format `YYYY-MM-DD hh:mm:ss` and parsed in t
This all occurs in the system time zone at the time the client or server starts (depending on which of them formats data). For dates with times, daylight saving time is not specified. So if a dump has times during daylight saving time, the dump does not unequivocally match the data, and parsing will select one of the two times.
During a read operation, incorrect dates and dates with times can be parsed with natural overflow or as null dates and times, without an error message.
As an exception, parsing dates with times is also supported in Unix timestamp format, if it consists of exactly 10 decimal digits. The result is not time zone-dependent. The formats YYYY-MM-DD hh:mm:ss and NNNNNNNNNN are differentiated automatically.
As an exception, parsing dates with times is also supported in Unix timestamp format, if it consists of exactly 10 decimal digits. The result is not time zone-dependent. The formats `YYYY-MM-DD hh:mm:ss` and `NNNNNNNNNN` are differentiated automatically.
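A quick way to see both encodings land in the same `DateTime` column is `clickhouse-local` (a sketch; `1640995200` is `2022-01-01 00:00:00` UTC):

```bash
printf '2022-01-01 00:00:00\n1640995200\n' \
    | clickhouse-local --structure 'd DateTime' --input-format TabSeparated \
        --query 'SELECT d FROM table'
```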
Strings are output with backslash-escaped special characters. The following escape sequences are used for output: `\b`, `\f`, `\r`, `\n`, `\t`, `\0`, `\'`, `\\`. Parsing also supports the sequences `\a`, `\v`, and `\xHH` (hex escape sequences) and any `\c` sequences, where `c` is any character (these sequences are converted to `c`). Thus, reading data supports formats where a line feed can be written as `\n` or `\`, or as a line feed. For example, the string `Hello world` with a line feed between the words instead of space can be parsed in any of the following variations:
@ -816,7 +816,7 @@ Columns that are not present in the block will be filled with default values (yo
## JSONEachRow {#jsoneachrow}
In this format, CliskHouse outputs each row as a separated, newline-delimited JSON Object.
In this format, ClickHouse outputs each row as a separate, newline-delimited JSON object.
Example:
@ -1363,9 +1363,9 @@ Columns `name` ([String](../sql-reference/data-types/string.md)) and `value` (nu
Rows may optionally contain `help` ([String](../sql-reference/data-types/string.md)) and `timestamp` (number).
Column `type` ([String](../sql-reference/data-types/string.md)) is either `counter`, `gauge`, `histogram`, `summary`, `untyped` or empty.
Each metric value may also have some `labels` ([Map(String, String)](../sql-reference/data-types/map.md)).
Several consequent rows may refer to the one metric with different lables. The table should be sorted by metric name (e.g., with `ORDER BY name`).
Several consequent rows may refer to the one metric with different labels. The table should be sorted by metric name (e.g., with `ORDER BY name`).
There's special requirements for labels for `histogram` and `summary`, see [Prometheus doc](https://prometheus.io/docs/instrumenting/exposition_formats/#histograms-and-summaries) for the details. Special rules applied to row with labels `{'count':''}` and `{'sum':''}`, they'll be convered to `<metric_name>_count` and `<metric_name>_sum` respectively.
There are special requirements for labels for `histogram` and `summary`, see [Prometheus doc](https://prometheus.io/docs/instrumenting/exposition_formats/#histograms-and-summaries) for the details. Special rules apply to rows with labels `{'count':''}` and `{'sum':''}`: they'll be converted to `<metric_name>_count` and `<metric_name>_sum` respectively.
**Example:**
@ -1665,7 +1665,7 @@ To exchange data with Hadoop, you can use [HDFS table engine](../engines/table-e
### Parquet format settings {#parquet-format-settings}
- [output_format_parquet_row_group_size](../operations/settings/settings.md#output_format_parquet_row_group_size) - row group size in rows while data output. Default value - `1000000`.
- [output_format_parquet_string_as_string](../operations/settings/settings.md#output_format_parquet_string_as_string) - use Parquet String type instead of Binary for String columns. Default value - `false`.
- [input_format_parquet_import_nested](../operations/settings/settings.md#input_format_parquet_import_nested) - allow inserting array of structs into [Nested](../sql-reference/data-types/nested-data-structures/nested.md) table in Parquet input format. Default value - `false`.
- [input_format_parquet_case_insensitive_column_matching](../operations/settings/settings.md#input_format_parquet_case_insensitive_column_matching) - ignore case when matching Parquet columns with ClickHouse columns. Default value - `false`.
@ -1845,7 +1845,7 @@ When working with the `Regexp` format, you can use the following settings:
- Quoted (similarly to [Values](#data-format-values))
- Raw (extracts subpatterns as a whole, no escaping rules, similarly to [TSVRaw](#tabseparatedraw))
- `format_regexp_skip_unmatched` — [UInt8](../sql-reference/data-types/int-uint.md). Defines the need to throw an exeption in case the `format_regexp` expression does not match the imported data. Can be set to `0` or `1`.
- `format_regexp_skip_unmatched` — [UInt8](../sql-reference/data-types/int-uint.md). Defines the need to throw an exception in case the `format_regexp` expression does not match the imported data. Can be set to `0` or `1`.
**Usage**

View File

@ -422,7 +422,7 @@ Now `rule` can configure `method`, `headers`, `url`, `handler`:
- `query` — use with `predefined_query_handler` type, executes query when the handler is called.
- `query_param_name` — use with `dynamic_query_handler` type, extracts and executes the value corresponding to the `query_param_name` value in HTTP request params.
- `query_param_name` — use with `dynamic_query_handler` type, extracts and executes the value corresponding to the `query_param_name` value in HTTP request parameters.
- `status` — use with `static` type, response status code.
@ -477,9 +477,9 @@ In one `predefined_query_handler` only supports one `query` of an insert type.
### dynamic_query_handler {#dynamic_query_handler}
In `dynamic_query_handler`, the query is written in the form of param of the HTTP request. The difference is that in `predefined_query_handler`, the query is written in the configuration file. You can configure `query_param_name` in `dynamic_query_handler`.
In `dynamic_query_handler`, the query is written in the form of a parameter of the HTTP request. The difference is that in `predefined_query_handler`, the query is written in the configuration file. You can configure `query_param_name` in `dynamic_query_handler`.
ClickHouse extracts and executes the value corresponding to the `query_param_name` value in the URL of the HTTP request. The default value of `query_param_name` is `/query` . It is an optional configuration. If there is no definition in the configuration file, the param is not passed in.
ClickHouse extracts and executes the value corresponding to the `query_param_name` value in the URL of the HTTP request. The default value of `query_param_name` is `/query`. It is an optional configuration. If there is no definition in the configuration file, the parameter is not passed in.
To experiment with this functionality, the example defines the values of [max_threads](../operations/settings/settings.md#settings-max_threads) and `max_final_threads` and queries whether the settings were set successfully.

View File

@ -5,7 +5,7 @@ sidebar_label: PostgreSQL Interface
# PostgreSQL Interface
ClickHouse supports the PostgreSQL wire protocol, which allows you to use Postgres clients to connect to ClickHouse. In a sense, ClickHouse can pretend to be a PostgreSQL instance - allowing you to connect a PostgreSQL client application to ClickHouse that is not already directy supported by ClickHouse (for example, Amazon Redshift).
ClickHouse supports the PostgreSQL wire protocol, which allows you to use Postgres clients to connect to ClickHouse. In a sense, ClickHouse can pretend to be a PostgreSQL instance - allowing you to connect a PostgreSQL client application to ClickHouse that is not already directly supported by ClickHouse (for example, Amazon Redshift).
To enable the PostgreSQL wire protocol, add the [postgresql_port](../operations/server-configuration-parameters/settings#server_configuration_parameters-postgresql_port) setting to your server's configuration file. For example, you could define the port in a new XML file in your `config.d` folder:
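A minimal sketch of such a file (the port number is illustrative):

```bash
cat > /etc/clickhouse-server/config.d/postgresql.xml <<'EOF'
<clickhouse>
    <postgresql_port>9005</postgresql_port>
</clickhouse>
EOF
```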
@ -59,7 +59,7 @@ The PostgreSQL protocol currently only supports plain-text passwords.
## Using SSL
If you have SSL/TLS configured on your ClickHouse instance, then `postgresql_port` will use the same settings (the port is shared for both secure and unsecure clients).
If you have SSL/TLS configured on your ClickHouse instance, then `postgresql_port` will use the same settings (the port is shared for both secure and insecure clients).
Each client has their own method of how to connect using SSL. The following command demonstrates how to pass in the certificates and key to securely connect `psql` to ClickHouse:

View File

@ -53,7 +53,7 @@ Internal coordination settings are located in the `<keeper_server>.<coordination
- `auto_forwarding` — Allow to forward write requests from followers to the leader (default: true).
- `shutdown_timeout` — Wait to finish internal connections and shutdown (ms) (default: 5000).
- `startup_timeout` — If the server doesn't connect to other quorum participants in the specified timeout it will terminate (ms) (default: 30000).
- `four_letter_word_white_list` — White list of 4lw commands (default: "conf,cons,crst,envi,ruok,srst,srvr,stat,wchc,wchs,dirs,mntr,isro").
- `four_letter_word_white_list` — White list of 4lw commands (default: `conf,cons,crst,envi,ruok,srst,srvr,stat,wchc,wchs,dirs,mntr,isro`).
Quorum configuration is located in the `<keeper_server>.<raft_configuration>` section and contains a description of the servers.
@ -122,7 +122,7 @@ clickhouse keeper --config /etc/your_path_to_config/config.xml
ClickHouse Keeper also provides 4lw commands which are almost the same as in ZooKeeper. Each command is composed of four letters such as `mntr`, `stat` etc. There are some more interesting commands: `stat` gives some general information about the server and connected clients, while `srvr` and `cons` give extended details on server and connections respectively.
The 4lw commands has a white list configuration `four_letter_word_white_list` which has default value "conf,cons,crst,envi,ruok,srst,srvr,stat,wchc,wchs,dirs,mntr,isro".
The 4lw commands have a white list configuration `four_letter_word_white_list` which has default value `conf,cons,crst,envi,ruok,srst,srvr,stat,wchc,wchs,dirs,mntr,isro`.
You can issue the commands to ClickHouse Keeper via telnet or nc, at the client port.
@ -132,7 +132,7 @@ echo mntr | nc localhost 9181
Below are the detailed 4lw commands:
- `ruok`: Tests if server is running in a non-error state. The server will respond with imok if it is running. Otherwise it will not respond at all. A response of "imok" does not necessarily indicate that the server has joined the quorum, just that the server process is active and bound to the specified client port. Use "stat" for details on state wrt quorum and client connection information.
- `ruok`: Tests if server is running in a non-error state. The server will respond with `imok` if it is running. Otherwise it will not respond at all. A response of `imok` does not necessarily indicate that the server has joined the quorum, just that the server process is active and bound to the specified client port. Use "stat" for details on state wrt quorum and client connection information.
```
imok
@ -330,9 +330,9 @@ E.g. for a 3-node cluster, it will continue working correctly if only 1 node cra
Cluster configuration can be dynamically configured but there are some limitations. Reconfiguration relies on Raft also
so to add/remove a node from the cluster you need to have a quorum. If you lose too many nodes in your cluster at the same time without any chance
of starting them again, Raft will stop working and not allow you to reconfigure your cluster using the convenvtional way.
of starting them again, Raft will stop working and not allow you to reconfigure your cluster using the conventional way.
Nevertheless, Clickhouse Keeper has a recovery mode which allows you to forcfully reconfigure your cluster with only 1 node.
Nevertheless, ClickHouse Keeper has a recovery mode which allows you to forcefully reconfigure your cluster with only 1 node.
This should be done only as your last resort if you cannot start your nodes again, or start a new instance on the same endpoint.
Important things to note before continuing:

View File

@ -57,7 +57,7 @@ Substitutions can also be performed from ZooKeeper. To do this, specify the attr
The `config.xml` file can specify a separate config with user settings, profiles, and quotas. The relative path to this config is set in the `users_config` element. By default, it is `users.xml`. If `users_config` is omitted, the user settings, profiles, and quotas are specified directly in `config.xml`.
Users configuration can be splitted into separate files similar to `config.xml` and `config.d/`.
Users configuration can be split into separate files similar to `config.xml` and `config.d/`.
Directory name is defined as `users_config` setting without `.xml` postfix concatenated with `.d`.
Directory `users.d` is used by default, as `users_config` defaults to `users.xml`.
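For example, a single profile setting could be overridden from its own file instead of editing `users.xml` (a sketch; the setting and value are illustrative):

```bash
cat > /etc/clickhouse-server/users.d/memory_limit.xml <<'EOF'
<clickhouse>
    <profiles>
        <default>
            <max_memory_usage>10000000000</max_memory_usage>
        </default>
    </profiles>
</clickhouse>
EOF
```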

View File

@ -70,7 +70,7 @@ Regardless of RAID use, always use replication for data security.
Enable NCQ with a long queue. For HDD, choose the CFQ scheduler, and for SSD, choose noop. Don't reduce the readahead setting.
For HDD, enable the write cache.
Make sure that [fstrim](https://en.wikipedia.org/wiki/Trim_(computing)) is enabled for NVME and SSD disks in your OS (usually it's implemented using a cronjob or systemd service).
Make sure that [`fstrim`](https://en.wikipedia.org/wiki/Trim_(computing)) is enabled for NVME and SSD disks in your OS (usually it's implemented using a cronjob or systemd service).
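On systemd-based distributions this is typically a single command (a sketch):

```bash
# Enable a periodic TRIM instead of a hand-rolled cron job
sudo systemctl enable --now fstrim.timer
```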
## File System {#file-system}
@ -94,7 +94,7 @@ Use at least a 10 GB network, if possible. 1 Gb will also work, but it will be m
## Huge Pages {#huge-pages}
If you are using old Linux kernel, disable transparent huge pages. It interferes with memory allocators, which leads to significant performance degradation.
If you are using an old Linux kernel, disable transparent huge pages. It interferes with the memory allocator, which leads to significant performance degradation.
On newer Linux kernels transparent huge pages are alright.
``` bash
@ -107,7 +107,7 @@ If you are using OpenStack, set
```
cpu_mode=host-passthrough
```
in nova.conf.
in `nova.conf`.
If you are using libvirt, set
```
@ -136,7 +136,7 @@ Do not change `minSessionTimeout` setting, large values may affect ClickHouse re
With the default settings, ZooKeeper is a time bomb:
> The ZooKeeper server won't delete files from old snapshots and logs when using the default configuration (see autopurge), and this is the responsibility of the operator.
> The ZooKeeper server won't delete files from old snapshots and logs when using the default configuration (see `autopurge`), and this is the responsibility of the operator.
This bomb must be defused.
@ -241,7 +241,7 @@ JAVA_OPTS="-Xms{{ '{{' }} cluster.get('xms','128M') {{ '}}' }} \
-XX:MaxGCPauseMillis=50"
```
Salt init:
Salt initialization:
``` text
description "zookeeper-{{ '{{' }} cluster['name'] {{ '}}' }} centralized coordination service"

View File

@ -3,7 +3,7 @@ sidebar_position: 46
sidebar_label: Troubleshooting
---
# Troubleshooting
- [Installation](#troubleshooting-installation-errors)
- [Connecting to the server](#troubleshooting-accepts-no-connections)
@ -26,7 +26,7 @@ Possible issues:
### Server Is Not Running {#server-is-not-running}
**Check if server is runnnig**
**Check if server is running**
Command:

View File

@ -4,7 +4,7 @@ sidebar_label: H3 Indexes
# Functions for Working with H3 Indexes
[H3](https://eng.uber.com/h3/) is a geographical indexing system where Earth's surface divided into a grid of even hexagonal cells. This system is hierarchical, i. e. each hexagon on the top level ("parent") can be splitted into seven even but smaller ones ("children"), and so on.
[H3](https://eng.uber.com/h3/) is a geographical indexing system where Earth's surface is divided into a grid of even hexagonal cells. This system is hierarchical, i.e. each hexagon on the top level ("parent") can be split into seven even but smaller ones ("children"), and so on.
The level of the hierarchy is called `resolution` and can receive a value from `0` to `15`, where `0` is the `base` level with the largest and coarsest cells.
@ -1398,4 +1398,4 @@ Result:
│ [(37.42012867767779,-122.03773496427027),(37.33755608435299,-122.090428929044)] │
└─────────────────────────────────────────────────────────────────────────────────┘
```
[Original article](https://clickhouse.com/docs/en/sql-reference/functions/geo/h3) <!--hide-->

View File

@ -32,7 +32,7 @@ Integer value in the `Int8`, `Int16`, `Int32`, `Int64`, `Int128` or `Int256` dat
Functions use [rounding towards zero](https://en.wikipedia.org/wiki/Rounding#Rounding_towards_zero), meaning they truncate fractional digits of numbers.
The behavior of functions for the [NaN and Inf](../../sql-reference/data-types/float.md#data_type-float-nan-inf) arguments is undefined. Remember about [numeric convertions issues](#numeric-conversion-issues), when using the functions.
The behavior of functions for the [NaN and Inf](../../sql-reference/data-types/float.md#data_type-float-nan-inf) arguments is undefined. Remember about [numeric conversions issues](#numeric-conversion-issues), when using the functions.
**Example**
@ -131,7 +131,7 @@ Integer value in the `UInt8`, `UInt16`, `UInt32`, `UInt64` or `UInt256` data typ
Functions use [rounding towards zero](https://en.wikipedia.org/wiki/Rounding#Rounding_towards_zero), meaning they truncate fractional digits of numbers.
The behavior of functions for negative agruments and for the [NaN and Inf](../../sql-reference/data-types/float.md#data_type-float-nan-inf) arguments is undefined. If you pass a string with a negative number, for example `'-32'`, ClickHouse raises an exception. Remember about [numeric convertions issues](#numeric-conversion-issues), when using the functions.
The behavior of functions for negative arguments and for the [NaN and Inf](../../sql-reference/data-types/float.md#data_type-float-nan-inf) arguments is undefined. If you pass a string with a negative number, for example `'-32'`, ClickHouse raises an exception. Remember about [numeric conversions issues](#numeric-conversion-issues), when using the functions.
**Example**
@ -689,7 +689,7 @@ x::t
- Converted value.
:::note
If the input value does not fit the bounds of the target type, the result overflows. For example, `CAST(-1, 'UInt8')` returns `255`.
:::
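The wrap-around is easy to confirm with `clickhouse-local` (a sketch):

```bash
clickhouse-local --query "SELECT CAST(-1, 'UInt8')"   # prints 255
```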
@ -1433,7 +1433,7 @@ Result:
Converts a `DateTime64` to an `Int64` value with fixed sub-second precision. Input value is scaled up or down appropriately depending on its precision.
:::note
The output value is a timestamp in UTC, not in the timezone of `DateTime64`.
:::

View File

@ -0,0 +1,485 @@
personal_ws-1.1 en 484
AArch
ACLs
AMQP
ASLR
ASan
Actian
AddressSanitizer
AppleClang
ArrowStream
AvroConfluent
CCTOOLS
CLion
CMake
CMakeLists
CPUs
CSVWithNames
CSVWithNamesAndTypes
CamelCase
CapnProto
CentOS
ClickHouse
Config
Contrib
Ctrl
CustomSeparated
CustomSeparatedWithNames
CustomSeparatedWithNamesAndTypes
DBMSs
DateTime
DockerHub
Doxygen
Encodings
Enum
Eoan
FixedString
FreeBSD
Fuzzer
Fuzzers
GTest
Gb
Gcc
GoogleTest
HDDs
Heredoc
Homebrew
Homebrew's
Hostname
IPv
IntN
Integrations
JSONAsString
JSONColumns
JSONColumnsWithMetadata
JSONCompact
JSONCompactColumns
JSONCompactEachRow
JSONCompactEachRowWithNames
JSONCompactEachRowWithNamesAndTypes
JSONCompactStrings
JSONCompactStringsEachRow
JSONCompactStringsEachRowWithNames
JSONCompactStringsEachRowWithNamesAndTypes
JSONEachRow
JSONEachRowWithProgress
JSONStrings
JSONStringsEachRow
JSONStringsEachRowWithProgress
JSONs
Jaeger
Jemalloc
Jepsen
KDevelop
LGPL
LOCALTIME
LOCALTIMESTAMP
LibFuzzer
LineAsString
LowCardinality
MEMTABLE
MSan
MacOS
Memcheck
MemorySanitizer
MergeTree
MessagePack
MiB
MsgPack
Multiline
Multithreading
MySQLDump
NEKUDOTAYIM
NULLIF
NVME
NuRaft
Ok
OpenSUSE
OpenStack
OpenTelemetry
PAAMAYIM
Parsers
Postgres
Precompiled
PrettyCompact
PrettyCompactMonoBlock
PrettyCompactNoEscapes
PrettyNoEscapes
PrettySpace
PrettySpaceNoEscapes
Protobuf
ProtobufSingle
QTCreator
RBAC
RawBLOB
RedHat
RowBinary
RowBinaryWithNames
RowBinaryWithNamesAndTypes
Runtime
SATA
SERIALIZABLE
SIMD
SMALLINT
SQLSTATE
SSSE
Schemas
Stateful
Submodules
Subqueries
TSVRaw
TSan
TabSeparated
TabSeparatedRaw
TabSeparatedRawWithNames
TabSeparatedRawWithNamesAndTypes
TabSeparatedWithNames
TabSeparatedWithNamesAndTypes
TargetSpecific
TemplateIgnoreSpaces
Testflows
Tgz
Toolset
Tradeoff
UBSan
UInt
UIntN
UPDATEs
Uint
Updatable
Util
Valgrind
Vectorized
VirtualBox
Werror
Woboq
WriteBuffer
WriteBuffers
XCode
YAML
YYYY
Zipkin
ZooKeeper
ZooKeeper's
aarch
allocator
analytics
anonymized
ansi
async
autogeneration
autostart
avro
avx
aws
backoff
backticks
benchmarking
blake
blockSize
boolean
bools
boringssl
brotli
buildable
camelCase
capn
capnproto
cardinality
cassandra
cbindgen
ccache
cctz
cfg
changelog
charset
charsets
checkouting
checksummed
checksumming
checksums
cityhash
cli
clickhouse
clickstream
cmake
codebase
codec
comparising
config
configs
contrib
coroutines
cpp
cppkafka
cpu
crlf
croaring
cronjob
csv
csvwithnames
csvwithnamesandtypes
customseparated
customseparatedwithnames
customseparatedwithnamesandtypes
cyrus
datacenter
datafiles
dataset
datasets
datetime
dbms
ddl
deallocation
debian
decompressor
denormals
deserialization
deserialized
destructor
destructors
dmesg
dont
dragonbox
durations
endian
enum
fastops
fcoverage
filesystem
filesystems
flatbuffers
fmtlib
formatschema
formatter
fuzzer
fuzzers
gRPC
gcem
github
glibc
googletest
grpc
grpcio
gtest
hardlinks
hdfs
heredoc
heredocs
homebrew
http
https
hyperscan
icudata
instantiation
integrational
integrations
interserver
jdbc
jemalloc
json
jsonasstring
jsoncolumns
jsoncolumnsmonoblock
jsoncompact
jsoncompactcolumns
jsoncompacteachrow
jsoncompacteachrowwithnames
jsoncompacteachrowwithnamesandtypes
jsoncompactstrings
jsoncompactstringseachrow
jsoncompactstringseachrowwithnames
jsoncompactstringseachrowwithnamesandtypes
jsoneachrow
jsoneachrowwithprogress
jsonstrings
jsonstringseachrow
jsonstringseachrowwithprogress
kafka
kafkacat
konsole
latencies
lexicographically
libFuzzer
libc
libcpuid
libcxx
libcxxabi
libdivide
libfarmhash
libfuzzer
libgsasl
libhdfs
libmetrohash
libpq
libpqxx
librdkafka
libs
libunwind
libuv
libvirt
linearizability
linearizable
lineasstring
linefeeds
linux
llvm
localhost
macOS
mariadb
miniselect
msgpack
msgpk
multiline
multithread
murmurhash
mutex
mysql
mysqldump
mysqljs
noop
nullable
num
obfuscator
odbc
ok
openldap
opentelemetry
overcommit
parallelization
parallelize
parallelized
parsers
pclmulqdq
performant
poco
popcnt
postfix
postfixes
postgresql
pre
prebuild
prebuilt
preemptable
preloaded
preprocessed
preprocessor
presentational
prestable
prettycompact
prettycompactmonoblock
prettycompactnoescapes
prettynoescapes
prettyspace
prettyspacenoescapes
prlimit
prometheus
proto
protobuf
protobufsingle
psql
ptrs
py
rapidjson
rawblob
readahead
readline
readme
readonly
rebalanced
replxx
repo
representable
requestor
resultset
rethrow
risc
ro
rocksdb
rowNumberInBlock
rowbinary
rowbinarywithnames
rowbinarywithnamesandtypes
rsync
runningAccumulate
runtime
russian
rw
sasl
schemas
simdjson
skippingerrors
sparsehash
sql
src
stacktraces
statbox
stateful
stderr
stdin
stdout
strtod
strtoll
strtoull
structs
subdirectories
subexpressions
submodule
submodules
subpattern
subpatterns
subqueries
subquery
subseconds
substring
subtree
sudo
symlink
symlinks
syntaxes
systemd
tabseparated
tabseparatedraw
tabseparatedrawwithnames
tabseparatedrawwithnamesandtypes
tabseparatedwithnames
tabseparatedwithnamesandtypes
tcp
templateignorespaces
tgz
th
tmp
tokenization
toml
toolset
tskv
tsv
tui
turbostat
txt
unary
unencrypted
unixodbc
url
userspace
utils
uuid
variadic
varint
vectorized
wchc
wchs
webpage
webserver
wget
whitespace
whitespaces
wrt
xcode
xml
xz
zLib
zkcopy
zlib
znodes
zstd

View File

@ -0,0 +1,49 @@
#!/usr/bin/env bash
# Perform spell checking on the docs
if [[ ${1:-} == "--help" ]] || [[ ${1:-} == "-h" ]]; then
    echo "Usage $0 [--help|-h] [-i]"
    echo " --help|-h: print this help"
    echo " -i: interactive mode"
    exit 0
fi

ROOT_PATH=$(git rev-parse --show-toplevel)
CHECK_LANG=en
ASPELL_IGNORE_PATH="${ROOT_PATH}/utils/check-style/aspell-ignore/${CHECK_LANG}"
STATUS=0

for fname in ${ROOT_PATH}/docs/${CHECK_LANG}/**/*.md; do
    if [[ ${1:-} == "-i" ]]; then
        echo "Checking $fname"
        aspell --personal=aspell-dict.txt --add-sgml-skip=code --encoding=utf-8 --mode=markdown -W 3 --lang=${CHECK_LANG} --home-dir=${ASPELL_IGNORE_PATH} -c "$fname"
        continue
    fi
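    # Non-interactive mode: list misspellings per file. -W 3 tells aspell to
    # ignore words of three characters or fewer, and --home-dir points it at
    # the repo-local personal dictionary (aspell-dict.txt).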
    errors=$(cat "$fname" \
        | aspell list \
            -W 3 \
            --personal=aspell-dict.txt \
            --add-sgml-skip=code \
            --encoding=utf-8 \
            --mode=markdown \
            --lang=${CHECK_LANG} \
            --home-dir=${ASPELL_IGNORE_PATH} \
        | sort | uniq)
    if [ ! -z "$errors" ]; then
        STATUS=1
        echo "====== $fname ======"
        echo "$errors"
    fi
done

if (( STATUS != 0 )); then
    echo "====== Errors found ======"
    echo "To exclude some words add them to the dictionary file \"${ASPELL_IGNORE_PATH}/aspell-dict.txt\""
    echo "You can also run ${0} -i to see the errors interactively and fix them or add to the dictionary file"
fi

exit ${STATUS}

View File

@ -6,3 +6,4 @@ $dir/check-typos
$dir/check-whitespaces -n
$dir/check-duplicate-includes.sh
$dir/shellcheck-run.sh
$dir/check-doc-aspell
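A sketch of running the new check locally (paths assumed relative to the repository root):

```bash
utils/check-style/check-doc-aspell       # list misspellings and exit non-zero on any
utils/check-style/check-doc-aspell -i    # step through them interactively in aspell
```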

View File

@ -5,7 +5,7 @@
ROOT_PATH=$(git rev-parse --show-toplevel)
codespell \
--skip '*generated*,*gperf*,*.bin,*.mrk*,*.idx,checksums.txt,*.dat,*.pyc,*.kate-swp,*obfuscateQueries.cpp' \
--skip "*generated*,*gperf*,*.bin,*.mrk*,*.idx,checksums.txt,*.dat,*.pyc,*.kate-swp,*obfuscateQueries.cpp,${ROOT_PATH}/utils/check-style/aspell-ignore" \
--ignore-words "${ROOT_PATH}/utils/check-style/codespell-ignore-words.list" \
--exclude-file "${ROOT_PATH}/utils/check-style/codespell-ignore-lines.list" \
--quiet-level 2 \