From 9495f0418846d66a6d7a971e69943582ed14dc51 Mon Sep 17 00:00:00 2001 From: Michael Schnerring <3743342+schnerring@users.noreply.github.com> Date: Mon, 4 Jul 2022 00:42:54 +0200 Subject: [PATCH 1/5] Revise Docker README --- docker/server/README.md | 91 ++++++++++++++++++++++------------------- 1 file changed, 49 insertions(+), 42 deletions(-) diff --git a/docker/server/README.md b/docker/server/README.md index c074a1bac00..f331b802c37 100644 --- a/docker/server/README.md +++ b/docker/server/README.md @@ -2,61 +2,64 @@ ## What is ClickHouse? -ClickHouse is an open-source column-oriented database management system that allows generating analytical data reports in real time. +ClickHouse is an open-source column-oriented database management system that allows generating of analytical data reports in real-time. -ClickHouse manages extremely large volumes of data in a stable and sustainable manner. It currently powers [Yandex.Metrica](https://metrica.yandex.com/), world’s [second largest](http://w3techs.com/technologies/overview/traffic_analysis/all) web analytics platform, with over 13 trillion database records and over 20 billion events a day, generating customized reports on-the-fly, directly from non-aggregated data. This system was successfully implemented at [CERN’s LHCb experiment](https://www.yandex.com/company/press_center/press_releases/2012/2012-04-10/) to store and process metadata on 10bn events with over 1000 attributes per event registered in 2011. +ClickHouse manages extremely large volumes of data stably and sustainably. It currently powers [Yandex.Metrica](https://metrica.yandex.com/), the world’s [second-largest](http://w3techs.com/technologies/overview/traffic_analysis/all) web analytics platform, with over 13 trillion database records and over 20 billion events a day, generating customized reports on-the-fly, directly from non-aggregated data. This system was successfully implemented at [CERN’s LHCb experiment](https://www.yandex.com/company/press_center/press_releases/2012/2012-04-10/) to store and process metadata on 10bn events with over 1000 attributes per event registered in 2011. For more information and documentation see https://clickhouse.com/. ## How to use this image ### start server instance + ```bash -$ docker run -d --name some-clickhouse-server --ulimit nofile=262144:262144 clickhouse/clickhouse-server +docker run -d --name some-clickhouse-server --ulimit nofile=262144:262144 clickhouse/clickhouse-server ``` -By default ClickHouse will be accessible only via docker network. See the [networking section below](#networking). +By default, ClickHouse will be accessible only via the Docker network. See the [networking section below](#networking). -By default, starting above server instance will be run as default user without password. +By default, starting above server instance will be run as the `default` user without a password. ### connect to it from a native client + ```bash -$ docker run -it --rm --link some-clickhouse-server:clickhouse-server --entrypoint clickhouse-client clickhouse/clickhouse-server --host clickhouse-server +docker run -it --rm --link some-clickhouse-server:clickhouse-server --entrypoint clickhouse-client clickhouse/clickhouse-server --host clickhouse-server # OR -$ docker exec -it some-clickhouse-server clickhouse-client +docker exec -it some-clickhouse-server clickhouse-client ``` -More information about [ClickHouse client](https://clickhouse.com/docs/en/interfaces/cli/). +More information about the [ClickHouse client](https://clickhouse.com/docs/en/interfaces/cli/). ### connect to it using curl ```bash echo "SELECT 'Hello, ClickHouse!'" | docker run -i --rm --link some-clickhouse-server:clickhouse-server curlimages/curl 'http://clickhouse-server:8123/?query=' -s --data-binary @- ``` + More information about [ClickHouse HTTP Interface](https://clickhouse.com/docs/en/interfaces/http/). -### stopping / removing the containter +### stopping / removing the container ```bash -$ docker stop some-clickhouse-server -$ docker rm some-clickhouse-server +docker stop some-clickhouse-server +docker rm some-clickhouse-server ``` ### networking -You can expose you ClickHouse running in docker by [mapping particular port](https://docs.docker.com/config/containers/container-networking/) from inside container to a host ports: +You can expose your ClickHouse running in docker by [mapping a particular port](https://docs.docker.com/config/containers/container-networking/) from inside the container using host ports: ```bash -$ docker run -d -p 18123:8123 -p19000:9000 --name some-clickhouse-server --ulimit nofile=262144:262144 clickhouse/clickhouse-server -$ echo 'SELECT version()' | curl 'http://localhost:18123/' --data-binary @- +docker run -d -p 18123:8123 -p19000:9000 --name some-clickhouse-server --ulimit nofile=262144:262144 clickhouse/clickhouse-server +echo 'SELECT version()' | curl 'http://localhost:18123/' --data-binary @- 20.12.3.3 ``` -or by allowing container to use [host ports directly](https://docs.docker.com/network/host/) using `--network=host` (also allows archiving better network performance): +or by allowing the container to use [host ports directly](https://docs.docker.com/network/host/) using `--network=host` (also allows archiving better network performance): ```bash -$ docker run -d --network=host --name some-clickhouse-server --ulimit nofile=262144:262144 clickhouse/clickhouse-server -$ echo 'SELECT version()' | curl 'http://localhost:8123/' --data-binary @- +docker run -d --network=host --name some-clickhouse-server --ulimit nofile=262144:262144 clickhouse/clickhouse-server +echo 'SELECT version()' | curl 'http://localhost:8123/' --data-binary @- 20.12.3.3 ``` @@ -65,68 +68,72 @@ $ echo 'SELECT version()' | curl 'http://localhost:8123/' --data-binary @- Typically you may want to mount the following folders inside your container to archieve persistency: * `/var/lib/clickhouse/` - main folder where ClickHouse stores the data -* `/val/log/clickhouse-server/` - logs +* `/var/log/clickhouse-server/` - logs ```bash -$ docker run -d \ - -v $(realpath ./ch_data):/var/lib/clickhouse/ \ - -v $(realpath ./ch_logs):/var/log/clickhouse-server/ \ - --name some-clickhouse-server --ulimit nofile=262144:262144 clickhouse/clickhouse-server +docker run -d \ + -v $(realpath ./ch_data):/var/lib/clickhouse/ \ + -v $(realpath ./ch_logs):/var/log/clickhouse-server/ \ + --name some-clickhouse-server --ulimit nofile=262144:262144 clickhouse/clickhouse-server ``` You may also want to mount: * `/etc/clickhouse-server/config.d/*.xml` - files with server configuration adjustmenets -* `/etc/clickhouse-server/usert.d/*.xml` - files with use settings adjustmenets +* `/etc/clickhouse-server/users.d/*.xml` - files with user settings adjustmenets * `/docker-entrypoint-initdb.d/` - folder with database initialization scripts (see below). ### Linux capabilities -ClickHouse has some advanced functionality which requite enabling several [linux capabilities](https://man7.org/linux/man-pages/man7/capabilities.7.html). +ClickHouse has some advanced functionality, which requires enabling several [Linux capabilities](https://man7.org/linux/man-pages/man7/capabilities.7.html). -It is optional and can be enabled using the following [docker command line agruments](https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities): +It is optional and can be enabled using the following [docker command-line arguments](https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities): ```bash -$ docker run -d \ - --cap-add=SYS_NICE --cap-add=NET_ADMIN --cap-add=IPC_LOCK \ - --name some-clickhouse-server --ulimit nofile=262144:262144 clickhouse/clickhouse-server +docker run -d \ + --cap-add=SYS_NICE --cap-add=NET_ADMIN --cap-add=IPC_LOCK \ + --name some-clickhouse-server --ulimit nofile=262144:262144 clickhouse/clickhouse-server ``` ## Configuration -Container exposes 8123 port for [HTTP interface](https://clickhouse.com/docs/en/interfaces/http_interface/) and 9000 port for [native client](https://clickhouse.com/docs/en/interfaces/tcp/). +The container exposes port 8123 for the [HTTP interface](https://clickhouse.com/docs/en/interfaces/http_interface/) and port 9000 for the [native client](https://clickhouse.com/docs/en/interfaces/tcp/). -ClickHouse configuration represented with a file "config.xml" ([documentation](https://clickhouse.com/docs/en/operations/configuration_files/)) +ClickHouse configuration is represented with a file "config.xml" ([documentation](https://clickhouse.com/docs/en/operations/configuration_files/)) ### Start server instance with custom configuration + ```bash -$ docker run -d --name some-clickhouse-server --ulimit nofile=262144:262144 -v /path/to/your/config.xml:/etc/clickhouse-server/config.xml clickhouse/clickhouse-server +docker run -d --name some-clickhouse-server --ulimit nofile=262144:262144 -v /path/to/your/config.xml:/etc/clickhouse-server/config.xml clickhouse/clickhouse-server ``` ### Start server as custom user -``` + +```bash # $(pwd)/data/clickhouse should exist and be owned by current user -$ docker run --rm --user ${UID}:${GID} --name some-clickhouse-server --ulimit nofile=262144:262144 -v "$(pwd)/logs/clickhouse:/var/log/clickhouse-server" -v "$(pwd)/data/clickhouse:/var/lib/clickhouse" clickhouse/clickhouse-server +docker run --rm --user ${UID}:${GID} --name some-clickhouse-server --ulimit nofile=262144:262144 -v "$(pwd)/logs/clickhouse:/var/log/clickhouse-server" -v "$(pwd)/data/clickhouse:/var/lib/clickhouse" clickhouse/clickhouse-server ``` -When you use the image with mounting local directories inside you probably would like to not mess your directory tree with files owner and permissions. Then you could use `--user` argument. In this case, you should mount every necessary directory (`/var/lib/clickhouse` and `/var/log/clickhouse-server`) inside the container. Otherwise, image will complain and not start. + +When you use the image with mounting local directories inside you probably would like to not mess your directory tree with files owner and permissions. Then you could use `--user` argument. In this case, you should mount every necessary directory (`/var/lib/clickhouse` and `/var/log/clickhouse-server`) inside the container. Otherwise, the image will complain and not start. ### Start server from root (useful in case of userns enabled) -``` -$ docker run --rm -e CLICKHOUSE_UID=0 -e CLICKHOUSE_GID=0 --name clickhouse-server-userns -v "$(pwd)/logs/clickhouse:/var/log/clickhouse-server" -v "$(pwd)/data/clickhouse:/var/lib/clickhouse" clickhouse/clickhouse-server + +```bash +docker run --rm -e CLICKHOUSE_UID=0 -e CLICKHOUSE_GID=0 --name clickhouse-server-userns -v "$(pwd)/logs/clickhouse:/var/log/clickhouse-server" -v "$(pwd)/data/clickhouse:/var/lib/clickhouse" clickhouse/clickhouse-server ``` ### How to create default database and user on starting -Sometimes you may want to create user (user named `default` is used by default) and database on image starting. You can do it using environment variables `CLICKHOUSE_DB`, `CLICKHOUSE_USER`, `CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT` and `CLICKHOUSE_PASSWORD`: +Sometimes you may want to create a user (user named `default` is used by default) and database on image start. You can do it using environment variables `CLICKHOUSE_DB`, `CLICKHOUSE_USER`, `CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT` and `CLICKHOUSE_PASSWORD`: -``` -$ docker run --rm -e CLICKHOUSE_DB=my_database -e CLICKHOUSE_USER=username -e CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1 -e CLICKHOUSE_PASSWORD=password -p 9000:9000/tcp clickhouse/clickhouse-server +```bash +docker run --rm -e CLICKHOUSE_DB=my_database -e CLICKHOUSE_USER=username -e CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1 -e CLICKHOUSE_PASSWORD=password -p 9000:9000/tcp clickhouse/clickhouse-server ``` ## How to extend this image If you would like to do additional initialization in an image derived from this one, add one or more `*.sql`, `*.sql.gz`, or `*.sh` scripts under `/docker-entrypoint-initdb.d`. After the entrypoint calls `initdb` it will run any `*.sql` files, run any executable `*.sh` scripts, and source any non-executable `*.sh` scripts found in that directory to do further initialization before starting the service. -Also you can provide environment variables `CLICKHOUSE_USER` & `CLICKHOUSE_PASSWORD` that will be used for clickhouse-client during initialization. +Also, you can provide environment variables `CLICKHOUSE_USER` & `CLICKHOUSE_PASSWORD` that will be used for clickhouse-client during initialization. For example, to add an additional user and database, add the following to `/docker-entrypoint-initdb.d/init-db.sh`: @@ -135,8 +142,8 @@ For example, to add an additional user and database, add the following to `/dock set -e clickhouse client -n <<-EOSQL - CREATE DATABASE docker; - CREATE TABLE docker.docker (x Int32) ENGINE = Log; + CREATE DATABASE docker; + CREATE TABLE docker.docker (x Int32) ENGINE = Log; EOSQL ``` From 52e2fa158a382670f69a3b3206e33fc1336b8f8d Mon Sep 17 00:00:00 2001 From: DanRoscigno Date: Tue, 5 Jul 2022 10:20:55 -0400 Subject: [PATCH 2/5] small edits --- docker/server/README.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docker/server/README.md b/docker/server/README.md index f331b802c37..66e82d0bba1 100644 --- a/docker/server/README.md +++ b/docker/server/README.md @@ -1,8 +1,8 @@ -# ClickHouse Server Docker Image +# ClickHouse Server Docker Imagex ## What is ClickHouse? -ClickHouse is an open-source column-oriented database management system that allows generating of analytical data reports in real-time. +ClickHouse is an open-source column-oriented database management system that allows the generation of analytical data reports in real-time. ClickHouse manages extremely large volumes of data stably and sustainably. It currently powers [Yandex.Metrica](https://metrica.yandex.com/), the world’s [second-largest](http://w3techs.com/technologies/overview/traffic_analysis/all) web analytics platform, with over 13 trillion database records and over 20 billion events a day, generating customized reports on-the-fly, directly from non-aggregated data. This system was successfully implemented at [CERN’s LHCb experiment](https://www.yandex.com/company/press_center/press_releases/2012/2012-04-10/) to store and process metadata on 10bn events with over 1000 attributes per event registered in 2011. @@ -107,14 +107,14 @@ ClickHouse configuration is represented with a file "config.xml" ([documentation docker run -d --name some-clickhouse-server --ulimit nofile=262144:262144 -v /path/to/your/config.xml:/etc/clickhouse-server/config.xml clickhouse/clickhouse-server ``` -### Start server as custom user +### Start server as a custom user ```bash # $(pwd)/data/clickhouse should exist and be owned by current user docker run --rm --user ${UID}:${GID} --name some-clickhouse-server --ulimit nofile=262144:262144 -v "$(pwd)/logs/clickhouse:/var/log/clickhouse-server" -v "$(pwd)/data/clickhouse:/var/lib/clickhouse" clickhouse/clickhouse-server ``` -When you use the image with mounting local directories inside you probably would like to not mess your directory tree with files owner and permissions. Then you could use `--user` argument. In this case, you should mount every necessary directory (`/var/lib/clickhouse` and `/var/log/clickhouse-server`) inside the container. Otherwise, the image will complain and not start. +When you use the image with local directories mounted, you probably would like to specify the user to maintain the proper file ownership. Use the `--user` argument and mount `/var/lib/clickhouse` and `/var/log/clickhouse-server` inside the container. Otherwise, the image will complain and not start. ### Start server from root (useful in case of userns enabled) From 09ff006ddb75d34ab99c64ff22ad5a9a64017a59 Mon Sep 17 00:00:00 2001 From: DanRoscigno Date: Tue, 5 Jul 2022 11:27:58 -0400 Subject: [PATCH 3/5] small edits --- docker/server/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docker/server/README.md b/docker/server/README.md index 66e82d0bba1..17eaa1a4d8b 100644 --- a/docker/server/README.md +++ b/docker/server/README.md @@ -1,4 +1,4 @@ -# ClickHouse Server Docker Imagex +# ClickHouse Server Docker Image ## What is ClickHouse? From 2f2fe9cffbff4f9676590630611af31c6fb1577e Mon Sep 17 00:00:00 2001 From: DanRoscigno Date: Tue, 5 Jul 2022 11:39:38 -0400 Subject: [PATCH 4/5] small edits --- docker/server/README.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/docker/server/README.md b/docker/server/README.md index 17eaa1a4d8b..3639a432b1e 100644 --- a/docker/server/README.md +++ b/docker/server/README.md @@ -65,7 +65,7 @@ echo 'SELECT version()' | curl 'http://localhost:8123/' --data-binary @- ### Volumes -Typically you may want to mount the following folders inside your container to archieve persistency: +Typically you may want to mount the following folders inside your container to achieve persistency: * `/var/lib/clickhouse/` - main folder where ClickHouse stores the data * `/var/log/clickhouse-server/` - logs @@ -87,7 +87,7 @@ You may also want to mount: ClickHouse has some advanced functionality, which requires enabling several [Linux capabilities](https://man7.org/linux/man-pages/man7/capabilities.7.html). -It is optional and can be enabled using the following [docker command-line arguments](https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities): +These are optional and can be enabled using the following [docker command-line arguments](https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities): ```bash docker run -d \ @@ -114,7 +114,7 @@ docker run -d --name some-clickhouse-server --ulimit nofile=262144:262144 -v /pa docker run --rm --user ${UID}:${GID} --name some-clickhouse-server --ulimit nofile=262144:262144 -v "$(pwd)/logs/clickhouse:/var/log/clickhouse-server" -v "$(pwd)/data/clickhouse:/var/lib/clickhouse" clickhouse/clickhouse-server ``` -When you use the image with local directories mounted, you probably would like to specify the user to maintain the proper file ownership. Use the `--user` argument and mount `/var/lib/clickhouse` and `/var/log/clickhouse-server` inside the container. Otherwise, the image will complain and not start. +When you use the image with local directories mounted, you probably want to specify the user to maintain the proper file ownership. Use the `--user` argument and mount `/var/lib/clickhouse` and `/var/log/clickhouse-server` inside the container. Otherwise, the image will complain and not start. ### Start server from root (useful in case of userns enabled) @@ -132,7 +132,7 @@ docker run --rm -e CLICKHOUSE_DB=my_database -e CLICKHOUSE_USER=username -e CLIC ## How to extend this image -If you would like to do additional initialization in an image derived from this one, add one or more `*.sql`, `*.sql.gz`, or `*.sh` scripts under `/docker-entrypoint-initdb.d`. After the entrypoint calls `initdb` it will run any `*.sql` files, run any executable `*.sh` scripts, and source any non-executable `*.sh` scripts found in that directory to do further initialization before starting the service. +To perform additional initialization in an image derived from this one, add one or more `*.sql`, `*.sql.gz`, or `*.sh` scripts under `/docker-entrypoint-initdb.d`. After the entrypoint calls `initdb`, it will run any `*.sql` files, run any executable `*.sh` scripts, and source any non-executable `*.sh` scripts found in that directory to do further initialization before starting the service. Also, you can provide environment variables `CLICKHOUSE_USER` & `CLICKHOUSE_PASSWORD` that will be used for clickhouse-client during initialization. For example, to add an additional user and database, add the following to `/docker-entrypoint-initdb.d/init-db.sh`: @@ -150,3 +150,4 @@ EOSQL ## License View [license information](https://github.com/ClickHouse/ClickHouse/blob/master/LICENSE) for the software contained in this image. + From 5f42c08561d1056d96f51f4585250d0e45a4d905 Mon Sep 17 00:00:00 2001 From: DanRoscigno Date: Tue, 5 Jul 2022 11:42:42 -0400 Subject: [PATCH 5/5] small edits --- docker/server/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docker/server/README.md b/docker/server/README.md index 3639a432b1e..2ff08620658 100644 --- a/docker/server/README.md +++ b/docker/server/README.md @@ -4,7 +4,7 @@ ClickHouse is an open-source column-oriented database management system that allows the generation of analytical data reports in real-time. -ClickHouse manages extremely large volumes of data stably and sustainably. It currently powers [Yandex.Metrica](https://metrica.yandex.com/), the world’s [second-largest](http://w3techs.com/technologies/overview/traffic_analysis/all) web analytics platform, with over 13 trillion database records and over 20 billion events a day, generating customized reports on-the-fly, directly from non-aggregated data. This system was successfully implemented at [CERN’s LHCb experiment](https://www.yandex.com/company/press_center/press_releases/2012/2012-04-10/) to store and process metadata on 10bn events with over 1000 attributes per event registered in 2011. +ClickHouse manages extremely large volumes of data. It currently powers [Yandex.Metrica](https://metrica.yandex.com/), the world’s [second-largest](http://w3techs.com/technologies/overview/traffic_analysis/all) web analytics platform, with over 13 trillion database records and over 20 billion events a day, generating customized reports on-the-fly, directly from non-aggregated data. This system was successfully implemented at [CERN’s LHCb experiment](https://www.yandex.com/company/press_center/press_releases/2012/2012-04-10/) to store and process metadata on 10bn events with over 1000 attributes per event registered in 2011. For more information and documentation see https://clickhouse.com/.