Mirror of https://github.com/ClickHouse/ClickHouse.git
Synced 2024-11-25 17:12:03 +00:00

Merge branch 'master' into fix-odbc-invalid-cursor

Commit 86a0e3e332

.arcignore (12 lines changed)
@@ -1,12 +0,0 @@
-# .arcignore is the same as .gitignore but for Arc VCS.
-# Arc VCS is a proprietary VCS in Yandex that is very similar to Git
-# from the user perspective but with the following differences:
-# 1. Data is stored in distributed object storage.
-# 2. Local copy works via FUSE without downloading all the objects.
-# For this reason, it is better suited for huge monorepositories that can be found in large companies (e.g. Yandex, Google).
-# As ClickHouse developers, we don't use Arc as a VCS (we use Git).
-# But the ClickHouse source code is also mirrored into internal monorepository and our collegues are using Arc.
-# You can read more about Arc here: https://habr.com/en/company/yandex/blog/482926/
-
-# Repository is synchronized without 3rd-party submodules.
-contrib
.github/PULL_REQUEST_TEMPLATE.md (vendored, 2 lines changed)

@@ -1,5 +1,3 @@
-I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
-
 Changelog category (leave one):
 - New Feature
 - Improvement
CHANGELOG.md

@@ -7,6 +7,7 @@
 * Under clickhouse-local, always treat local addresses with a port as remote. [#26736](https://github.com/ClickHouse/ClickHouse/pull/26736) ([Raúl Marín](https://github.com/Algunenano)).
 * Fix the issue that in case of some sophisticated query with column aliases identical to the names of expressions, bad cast may happen. This fixes [#25447](https://github.com/ClickHouse/ClickHouse/issues/25447). This fixes [#26914](https://github.com/ClickHouse/ClickHouse/issues/26914). This fix may introduce backward incompatibility: if there are different expressions with identical names, exception will be thrown. It may break some rare cases when `enable_optimize_predicate_expression` is set. [#26639](https://github.com/ClickHouse/ClickHouse/pull/26639) ([alexey-milovidov](https://github.com/alexey-milovidov)).
 * Now, scalar subquery always returns `Nullable` result if its type can be `Nullable`. It is needed because in case of empty subquery its result should be `Null`. Previously, it was possible to get error about incompatible types (type deduction does not execute scalar subquery, and it could use not-nullable type). Scalar subquery with empty result which can't be converted to `Nullable` (like `Array` or `Tuple`) now throws error. Fixes [#25411](https://github.com/ClickHouse/ClickHouse/issues/25411). [#26423](https://github.com/ClickHouse/ClickHouse/pull/26423) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Introduce syntax for here documents. Example `SELECT $doc$ VALUE $doc$`. [#26671](https://github.com/ClickHouse/ClickHouse/pull/26671) ([Maksim Kita](https://github.com/kitaisreal)). This change is backward incompatible if in query there are identifiers that contain `$` [#28768](https://github.com/ClickHouse/ClickHouse/issues/28768).

 #### New Feature

@@ -17,7 +18,6 @@
 * Added integration with S2 geometry library. [#24980](https://github.com/ClickHouse/ClickHouse/pull/24980) ([Andr0901](https://github.com/Andr0901)). ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
 * Add SQLite table engine, table function, database engine. [#24194](https://github.com/ClickHouse/ClickHouse/pull/24194) ([Arslan Gumerov](https://github.com/g-arslan)). ([Kseniia Sumarokova](https://github.com/kssenii)).
 * Added support for custom query for `MySQL`, `PostgreSQL`, `ClickHouse`, `JDBC`, `Cassandra` dictionary source. Closes [#1270](https://github.com/ClickHouse/ClickHouse/issues/1270). [#26995](https://github.com/ClickHouse/ClickHouse/pull/26995) ([Maksim Kita](https://github.com/kitaisreal)).
-* Introduce syntax for here documents. Example `SELECT $doc$ VALUE $doc$`. [#26671](https://github.com/ClickHouse/ClickHouse/pull/26671) ([Maksim Kita](https://github.com/kitaisreal)).
 * Add shared (replicated) storage of user, roles, row policies, quotas and settings profiles through ZooKeeper. [#27426](https://github.com/ClickHouse/ClickHouse/pull/27426) ([Kevin Michel](https://github.com/kmichel-aiven)).
 * Add compression for `INTO OUTFILE` that automatically chooses the compression algorithm. Closes [#3473](https://github.com/ClickHouse/ClickHouse/issues/3473). [#27134](https://github.com/ClickHouse/ClickHouse/pull/27134) ([Filatenkov Artur](https://github.com/FArthur-cmd)).
 * Add `INSERT ... FROM INFILE` similarly to `SELECT ... INTO OUTFILE`. [#27655](https://github.com/ClickHouse/ClickHouse/pull/27655) ([Filatenkov Artur](https://github.com/FArthur-cmd)).

@@ -34,7 +34,6 @@
 * New functions `currentProfiles()`, `enabledProfiles()`, `defaultProfiles()`. [#26714](https://github.com/ClickHouse/ClickHouse/pull/26714) ([Vitaly Baranov](https://github.com/vitlibar)).
 * Add functions that return (initial_)query_id of the current query. This closes [#23682](https://github.com/ClickHouse/ClickHouse/issues/23682). [#26410](https://github.com/ClickHouse/ClickHouse/pull/26410) ([Alexey Boykov](https://github.com/mathalex)).
 * Add `REPLACE GRANT` feature. [#26384](https://github.com/ClickHouse/ClickHouse/pull/26384) ([Caspian](https://github.com/Cas-pian)).
 * Implement window function `nth_value(expr, N)` that returns the value of the Nth row of the window frame. [#26334](https://github.com/ClickHouse/ClickHouse/pull/26334) ([Zuo, RuoYu](https://github.com/ryzuo)).
 * `EXPLAIN` query now has `EXPLAIN ESTIMATE ...` mode that will show information about read rows, marks and parts from MergeTree tables. Closes [#23941](https://github.com/ClickHouse/ClickHouse/issues/23941). [#26131](https://github.com/ClickHouse/ClickHouse/pull/26131) ([fastio](https://github.com/fastio)).
 * Added `system.zookeeper_log` table. All actions of ZooKeeper client are logged into this table. Implements [#25449](https://github.com/ClickHouse/ClickHouse/issues/25449). [#26129](https://github.com/ClickHouse/ClickHouse/pull/26129) ([tavplubix](https://github.com/tavplubix)).
 * Zero-copy replication for `ReplicatedMergeTree` over `HDFS` storage. [#25918](https://github.com/ClickHouse/ClickHouse/pull/25918) ([Zhichang Yu](https://github.com/yuzhichang)).
CMakeLists.txt

@@ -1,4 +1,4 @@
-cmake_minimum_required(VERSION 3.3)
+cmake_minimum_required(VERSION 3.14)

 foreach(policy
     CMP0023
CONTRIBUTING.md

@@ -6,38 +6,6 @@ Thank you.

 ## Technical Info

-We have a [developer's guide](https://clickhouse.yandex/docs/en/development/developer_instruction/) for writing code for ClickHouse. Besides this guide, you can find [Overview of ClickHouse Architecture](https://clickhouse.yandex/docs/en/development/architecture/) and instructions on how to build ClickHouse in different environments.
+We have a [developer's guide](https://clickhouse.com/docs/en/development/developer_instruction/) for writing code for ClickHouse. Besides this guide, you can find [Overview of ClickHouse Architecture](https://clickhouse.com/docs/en/development/architecture/) and instructions on how to build ClickHouse in different environments.

 If you want to contribute to documentation, read the [Contributing to ClickHouse Documentation](docs/README.md) guide.

-## Legal Info
-
-In order for us (YANDEX LLC) to accept patches and other contributions from you, you may adopt our Yandex Contributor License Agreement (the "**CLA**"). The current version of the CLA you may find here:
-1) https://yandex.ru/legal/cla/?lang=en (in English) and
-2) https://yandex.ru/legal/cla/?lang=ru (in Russian).
-
-By adopting the CLA, you state the following:
-
-* You obviously wish and are willingly licensing your contributions to us for our open source projects under the terms of the CLA,
-* You have read the terms and conditions of the CLA and agree with them in full,
-* You are legally able to provide and license your contributions as stated,
-* We may use your contributions for our open source projects and for any other our project too,
-* We rely on your assurances concerning the rights of third parties in relation to your contributions.
-
-If you agree with these principles, please read and adopt our CLA. By providing us your contributions, you hereby declare that you have already read and adopt our CLA, and we may freely merge your contributions with our corresponding open source project and use it in further in accordance with terms and conditions of the CLA.
-
-If you have already adopted terms and conditions of the CLA, you are able to provide your contributes. When you submit your pull request, please add the following information into it:
-
-```
-I hereby agree to the terms of the CLA available at: [link].
-```
-
-Replace the bracketed text as follows:
-* [link] is the link at the current version of the CLA (you may add here a link https://yandex.ru/legal/cla/?lang=en (in English) or a link https://yandex.ru/legal/cla/?lang=ru (in Russian).
-
-It is enough to provide us such notification once.
-
-As an alternative, you can provide DCO instead of CLA. You can find the text of DCO here: https://developercertificate.org/
-It is enough to read and copy it verbatim to your pull request.
-
-If you don't agree with the CLA and don't want to provide DCO, you still can open a pull request to provide your contributions.
base/common/ya.make

@@ -1,63 +0,0 @@
-# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
-OWNER(g:clickhouse)
-
-LIBRARY()
-
-ADDINCL(
-    GLOBAL clickhouse/base
-)
-
-CFLAGS (GLOBAL -DARCADIA_BUILD)
-
-CFLAGS (GLOBAL -DUSE_CPUID=1)
-CFLAGS (GLOBAL -DUSE_JEMALLOC=0)
-CFLAGS (GLOBAL -DUSE_RAPIDJSON=1)
-CFLAGS (GLOBAL -DUSE_SSL=1)
-
-IF (OS_DARWIN)
-    CFLAGS (GLOBAL -DOS_DARWIN)
-ELSEIF (OS_FREEBSD)
-    CFLAGS (GLOBAL -DOS_FREEBSD)
-ELSEIF (OS_LINUX)
-    CFLAGS (GLOBAL -DOS_LINUX)
-ENDIF ()
-
-PEERDIR(
-    contrib/libs/cctz
-    contrib/libs/cxxsupp/libcxx-filesystem
-    contrib/libs/poco/Net
-    contrib/libs/poco/Util
-    contrib/libs/poco/NetSSL_OpenSSL
-    contrib/libs/fmt
-    contrib/restricted/boost
-    contrib/restricted/cityhash-1.0.2
-)
-
-CFLAGS(-g0)
-
-SRCS(
-    DateLUT.cpp
-    DateLUTImpl.cpp
-    JSON.cpp
-    LineReader.cpp
-    StringRef.cpp
-    argsToConfig.cpp
-    coverage.cpp
-    demangle.cpp
-    errnoToString.cpp
-    getFQDNOrHostName.cpp
-    getMemoryAmount.cpp
-    getPageSize.cpp
-    getResource.cpp
-    getThreadId.cpp
-    mremap.cpp
-    phdr_cache.cpp
-    preciseExp10.cpp
-    setTerminalEcho.cpp
-    shift10.cpp
-    sleep.cpp
-    terminalColors.cpp
-
-)
-
-END()
base/common/ya.make.in

@@ -1,41 +0,0 @@
-OWNER(g:clickhouse)
-
-LIBRARY()
-
-ADDINCL(
-    GLOBAL clickhouse/base
-)
-
-CFLAGS (GLOBAL -DARCADIA_BUILD)
-
-CFLAGS (GLOBAL -DUSE_CPUID=1)
-CFLAGS (GLOBAL -DUSE_JEMALLOC=0)
-CFLAGS (GLOBAL -DUSE_RAPIDJSON=1)
-CFLAGS (GLOBAL -DUSE_SSL=1)
-
-IF (OS_DARWIN)
-    CFLAGS (GLOBAL -DOS_DARWIN)
-ELSEIF (OS_FREEBSD)
-    CFLAGS (GLOBAL -DOS_FREEBSD)
-ELSEIF (OS_LINUX)
-    CFLAGS (GLOBAL -DOS_LINUX)
-ENDIF ()
-
-PEERDIR(
-    contrib/libs/cctz
-    contrib/libs/cxxsupp/libcxx-filesystem
-    contrib/libs/poco/Net
-    contrib/libs/poco/Util
-    contrib/libs/poco/NetSSL_OpenSSL
-    contrib/libs/fmt
-    contrib/restricted/boost
-    contrib/restricted/cityhash-1.0.2
-)
-
-CFLAGS(-g0)
-
-SRCS(
-    <? find . -name '*.cpp' | grep -v -F tests/ | grep -v -F examples | grep -v -F Replxx | grep -v -F Readline | sed 's/^\.\// /' | sort ?>
-)
-
-END()
base/daemon/ya.make

@@ -1,19 +0,0 @@
-OWNER(g:clickhouse)
-
-LIBRARY()
-
-NO_COMPILER_WARNINGS()
-
-PEERDIR(
-    clickhouse/src/Common
-)
-
-CFLAGS(-g0)
-
-SRCS(
-    BaseDaemon.cpp
-    GraphiteWriter.cpp
-    SentryWriter.cpp
-)
-
-END()
base/loggers/ya.make

@@ -1,19 +0,0 @@
-OWNER(g:clickhouse)
-
-LIBRARY()
-
-PEERDIR(
-    clickhouse/src/Common
-)
-
-CFLAGS(-g0)
-
-SRCS(
-    ExtendedLogChannel.cpp
-    Loggers.cpp
-    OwnFormattingChannel.cpp
-    OwnPatternFormatter.cpp
-    OwnSplitChannel.cpp
-)
-
-END()
base/mysqlxx/ya.make

@@ -1,39 +0,0 @@
-# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
-LIBRARY()
-
-OWNER(g:clickhouse)
-
-CFLAGS(-g0)
-
-PEERDIR(
-    contrib/restricted/boost/libs
-    contrib/libs/libmysql_r
-    contrib/libs/poco/Foundation
-    contrib/libs/poco/Util
-)
-
-ADDINCL(
-    GLOBAL clickhouse/base
-    clickhouse/base
-    contrib/libs/libmysql_r
-)
-
-NO_COMPILER_WARNINGS()
-
-NO_UTIL()
-
-SRCS(
-    Connection.cpp
-    Exception.cpp
-    Pool.cpp
-    PoolFactory.cpp
-    PoolWithFailover.cpp
-    Query.cpp
-    ResultBase.cpp
-    Row.cpp
-    UseQueryResult.cpp
-    Value.cpp
-
-)
-
-END()
base/mysqlxx/ya.make.in

@@ -1,28 +0,0 @@
-LIBRARY()
-
-OWNER(g:clickhouse)
-
-CFLAGS(-g0)
-
-PEERDIR(
-    contrib/restricted/boost/libs
-    contrib/libs/libmysql_r
-    contrib/libs/poco/Foundation
-    contrib/libs/poco/Util
-)
-
-ADDINCL(
-    GLOBAL clickhouse/base
-    clickhouse/base
-    contrib/libs/libmysql_r
-)
-
-NO_COMPILER_WARNINGS()
-
-NO_UTIL()
-
-SRCS(
-    <? find . -name '*.cpp' | grep -v -F tests/ | grep -v -F examples | sed 's/^\.\// /' | sort ?>
-)
-
-END()
base/pcg-random/ya.make

@@ -1,7 +0,0 @@
-OWNER(g:clickhouse)
-
-LIBRARY()
-
-ADDINCL (GLOBAL clickhouse/base/pcg-random)
-
-END()
base/readpassphrase/ya.make

@@ -1,11 +0,0 @@
-OWNER(g:clickhouse)
-
-LIBRARY()
-
-CFLAGS(-g0)
-
-SRCS(
-    readpassphrase.c
-)
-
-END()
base/widechar_width/ya.make

@@ -1,13 +0,0 @@
-OWNER(g:clickhouse)
-
-LIBRARY()
-
-ADDINCL(GLOBAL clickhouse/base/widechar_width)
-
-CFLAGS(-g0)
-
-SRCS(
-    widechar_width.cpp
-)
-
-END()
base/ya.make (11 lines changed)

@@ -1,11 +0,0 @@
-OWNER(g:clickhouse)
-
-RECURSE(
-    common
-    daemon
-    loggers
-    mysqlxx
-    pcg-random
-    widechar_width
-    readpassphrase
-)
@@ -1,25 +0,0 @@
-INCLUDE(${ARCADIA_ROOT}/clickhouse/cmake/autogenerated_versions.txt)
-
-# TODO: not sure if this is customizable per-binary
-SET(VERSION_NAME "ClickHouse")
-
-# TODO: not quite sure how to replace dash with space in ya.make
-SET(VERSION_FULL "${VERSION_NAME}-${VERSION_STRING}")
-
-CFLAGS (GLOBAL -DDBMS_NAME=\"ClickHouse\")
-CFLAGS (GLOBAL -DDBMS_VERSION_MAJOR=${VERSION_MAJOR})
-CFLAGS (GLOBAL -DDBMS_VERSION_MINOR=${VERSION_MINOR})
-CFLAGS (GLOBAL -DDBMS_VERSION_PATCH=${VERSION_PATCH})
-CFLAGS (GLOBAL -DVERSION_FULL=\"\\\"${VERSION_FULL}\\\"\")
-CFLAGS (GLOBAL -DVERSION_MAJOR=${VERSION_MAJOR})
-CFLAGS (GLOBAL -DVERSION_MINOR=${VERSION_MINOR})
-CFLAGS (GLOBAL -DVERSION_PATCH=${VERSION_PATCH})
-
-# TODO: not supported yet, not sure if ya.make supports arithmetic.
-CFLAGS (GLOBAL -DVERSION_INTEGER=0)
-
-CFLAGS (GLOBAL -DVERSION_NAME=\"\\\"${VERSION_NAME}\\\"\")
-CFLAGS (GLOBAL -DVERSION_OFFICIAL=\"-arcadia\")
-CFLAGS (GLOBAL -DVERSION_REVISION=${VERSION_REVISION})
-CFLAGS (GLOBAL -DVERSION_STRING=\"\\\"${VERSION_STRING}\\\"\")
contrib/rocksdb (vendored submodule, 2 lines changed)

@@ -1 +1 @@
-Subproject commit 5ea892c8673e6c5a052887653673b967d44cc59b
+Subproject commit 296c1b8b95fd448b8097a1b2cc9f704ff4a73a2c
debian/control (vendored, 1 line changed)

@@ -7,6 +7,7 @@ Build-Depends: debhelper (>= 9),
     ninja-build,
    clang-13,
+    llvm-13,
     lld-13,
     libc6-dev,
     tzdata
 Standards-Version: 3.9.8
@@ -92,7 +92,7 @@ if __name__ == "__main__":
         logging.info("Some exception occured %s", str(ex))
         raise
     finally:
-        logging.info("Will remove dowloaded file %s from filesystem if it exists", temp_archive_path)
+        logging.info("Will remove downloaded file %s from filesystem if it exists", temp_archive_path)
         if os.path.exists(temp_archive_path):
             os.remove(temp_archive_path)
         logging.info("Processing of %s finished", dataset)
@@ -92,7 +92,7 @@ if __name__ == "__main__":
         logging.info("Some exception occured %s", str(ex))
         raise
     finally:
-        logging.info("Will remove dowloaded file %s from filesystem if it exists", temp_archive_path)
+        logging.info("Will remove downloaded file %s from filesystem if it exists", temp_archive_path)
         if os.path.exists(temp_archive_path):
             os.remove(temp_archive_path)
         logging.info("Processing of %s finished", dataset)
@@ -1,7 +1,7 @@
 sudo apt-get install apt-transport-https ca-certificates dirmngr
 sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv E0C56BD4

-echo "deb https://repo.clickhouse.tech/deb/stable/ main/" | sudo tee \
+echo "deb https://repo.clickhouse.com/deb/stable/ main/" | sudo tee \
     /etc/apt/sources.list.d/clickhouse.list
 sudo apt-get update
@@ -1,6 +1,6 @@
 sudo yum install yum-utils
-sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
-sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/clickhouse.repo
+sudo rpm --import https://repo.clickhouse.com/CLICKHOUSE-KEY.GPG
+sudo yum-config-manager --add-repo https://repo.clickhouse.com/rpm/clickhouse.repo
 sudo yum install clickhouse-server clickhouse-client

 sudo /etc/init.d/clickhouse-server start
@@ -1,9 +1,9 @@
-export LATEST_VERSION=$(curl -s https://repo.clickhouse.tech/tgz/stable/ | \
+export LATEST_VERSION=$(curl -s https://repo.clickhouse.com/tgz/stable/ | \
     grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | sort -V -r | head -n 1)
-curl -O https://repo.clickhouse.tech/tgz/stable/clickhouse-common-static-$LATEST_VERSION.tgz
-curl -O https://repo.clickhouse.tech/tgz/stable/clickhouse-common-static-dbg-$LATEST_VERSION.tgz
-curl -O https://repo.clickhouse.tech/tgz/stable/clickhouse-server-$LATEST_VERSION.tgz
-curl -O https://repo.clickhouse.tech/tgz/stable/clickhouse-client-$LATEST_VERSION.tgz
+curl -O https://repo.clickhouse.com/tgz/stable/clickhouse-common-static-$LATEST_VERSION.tgz
+curl -O https://repo.clickhouse.com/tgz/stable/clickhouse-common-static-dbg-$LATEST_VERSION.tgz
+curl -O https://repo.clickhouse.com/tgz/stable/clickhouse-server-$LATEST_VERSION.tgz
+curl -O https://repo.clickhouse.com/tgz/stable/clickhouse-client-$LATEST_VERSION.tgz

 tar -xzvf clickhouse-common-static-$LATEST_VERSION.tgz
 sudo clickhouse-common-static-$LATEST_VERSION/install/doinst.sh
@@ -13,16 +13,16 @@ The list of documented datasets:
 - [GitHub Events](../../getting-started/example-datasets/github-events.md)
 - [Anonymized Yandex.Metrica Dataset](../../getting-started/example-datasets/metrica.md)
 - [Recipes](../../getting-started/example-datasets/recipes.md)
-- [OnTime](../../getting-started/example-datasets/ontime.md)
-- [OpenSky](../../getting-started/example-datasets/opensky.md)
-- [New York Taxi Data](../../getting-started/example-datasets/nyc-taxi.md)
-- [UK Property Price Paid](../../getting-started/example-datasets/uk-price-paid.md)
-- [What's on the Menu?](../../getting-started/example-datasets/menus.md)
 - [Star Schema Benchmark](../../getting-started/example-datasets/star-schema.md)
 - [WikiStat](../../getting-started/example-datasets/wikistat.md)
 - [Terabyte of Click Logs from Criteo](../../getting-started/example-datasets/criteo.md)
 - [AMPLab Big Data Benchmark](../../getting-started/example-datasets/amplab-benchmark.md)
 - [Brown University Benchmark](../../getting-started/example-datasets/brown-benchmark.md)
+- [New York Taxi Data](../../getting-started/example-datasets/nyc-taxi.md)
+- [OpenSky](../../getting-started/example-datasets/opensky.md)
+- [UK Property Price Paid](../../getting-started/example-datasets/uk-price-paid.md)
 - [Cell Towers](../../getting-started/example-datasets/cell-towers.md)
+- [What's on the Menu?](../../getting-started/example-datasets/menus.md)
+- [OnTime](../../getting-started/example-datasets/ontime.md)

 [Original article](https://clickhouse.com/docs/en/getting_started/example_datasets) <!--hide-->
docs/en/getting-started/example-datasets/menus.md

@@ -3,7 +3,7 @@ toc_priority: 21
 toc_title: Menus
 ---

-# New York Public Library "What's on the Menu?" Dataset
+# New York Public Library "What's on the Menu?" Dataset {#menus-dataset}

 The dataset is created by the New York Public Library. It contains historical data on the menus of hotels, restaurants and cafes with the dishes along with their prices.

@@ -11,34 +11,38 @@ Source: http://menus.nypl.org/data
 The data is in public domain.

 The data is from the library's archive and it may be incomplete and difficult for statistical analysis. Nevertheless it is also very yummy.
-The size is just 1.3 million records about dishes in the menus (a very small data volume for ClickHouse, but it's still a good example).
+The size is just 1.3 million records about dishes in the menus — it's a very small data volume for ClickHouse, but it's still a good example.

-## Download the Dataset
+## Download the Dataset {#download-dataset}

+Run the command:
+
-```
+```bash
 wget https://s3.amazonaws.com/menusdata.nypl.org/gzips/2021_08_01_07_01_17_data.tgz
 ```

 Replace the link with an up-to-date link from http://menus.nypl.org/data if needed.
 Download size is about 35 MB.

-## Unpack the Dataset
+## Unpack the Dataset {#unpack-dataset}

-```
+```bash
 tar xvf 2021_08_01_07_01_17_data.tgz
 ```

 Uncompressed size is about 150 MB.

 The data is normalized and consists of four tables:
-- Menu: information about menus: the name of the restaurant, the date when menu was seen, etc;
-- Dish: information about dishes: the name of the dish along with some characteristic;
-- MenuPage: information about the pages in the menus; every page belongs to some menu;
-- MenuItem: an item of the menu - a dish along with its price on some menu page: links to dish and menu page.
+- `Menu` — Information about menus: the name of the restaurant, the date when menu was seen, etc.
+- `Dish` — Information about dishes: the name of the dish along with some characteristic.
+- `MenuPage` — Information about the pages in the menus, because every page belongs to some menu.
+- `MenuItem` — An item of the menu. A dish along with its price on some menu page: links to dish and menu page.

-## Create the Tables
+## Create the Tables {#create-tables}

+We use [Decimal](../../sql-reference/data-types/decimal.md) data type to store prices.
+
-```
+```sql
 CREATE TABLE dish
 (
     id UInt32,

@@ -101,35 +105,33 @@ CREATE TABLE menu_item
 ) ENGINE = MergeTree ORDER BY id;
 ```

-We use `Decimal` data type to store prices. Everything else is quite straightforward.
-
-## Import Data
+## Import the Data {#import-data}

-Upload data into ClickHouse:
+Upload data into ClickHouse, run:

-```
+```bash
 clickhouse-client --format_csv_allow_single_quotes 0 --input_format_null_as_default 0 --query "INSERT INTO dish FORMAT CSVWithNames" < Dish.csv
 clickhouse-client --format_csv_allow_single_quotes 0 --input_format_null_as_default 0 --query "INSERT INTO menu FORMAT CSVWithNames" < Menu.csv
 clickhouse-client --format_csv_allow_single_quotes 0 --input_format_null_as_default 0 --query "INSERT INTO menu_page FORMAT CSVWithNames" < MenuPage.csv
 clickhouse-client --format_csv_allow_single_quotes 0 --input_format_null_as_default 0 --date_time_input_format best_effort --query "INSERT INTO menu_item FORMAT CSVWithNames" < MenuItem.csv
 ```

-We use `CSVWithNames` format as the data is represented by CSV with header.
+We use [CSVWithNames](../../interfaces/formats.md#csvwithnames) format as the data is represented by CSV with header.

 We disable `format_csv_allow_single_quotes` as only double quotes are used for data fields and single quotes can be inside the values and should not confuse the CSV parser.

-We disable `input_format_null_as_default` as our data does not have NULLs. Otherwise ClickHouse will try to parse `\N` sequences and can be confused with `\` in data.
+We disable [input_format_null_as_default](../../operations/settings/settings.md#settings-input-format-null-as-default) as our data does not have [NULL](../../sql-reference/syntax.md#null-literal). Otherwise ClickHouse will try to parse `\N` sequences and can be confused with `\` in data.

-The setting `--date_time_input_format best_effort` allows to parse `DateTime` fields in wide variety of formats. For example, ISO-8601 without seconds like '2000-01-01 01:02' will be recognized. Without this setting only fixed DateTime format is allowed.
+The setting [date_time_input_format best_effort](../../operations/settings/settings.md#settings-date_time_input_format) allows parsing [DateTime](../../sql-reference/data-types/datetime.md) fields in a wide variety of formats. For example, ISO-8601 without seconds like '2000-01-01 01:02' will be recognized. Without this setting only the fixed DateTime format is allowed.
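As a quick aside that is not part of the original guide: the same relaxed parsing rules are exposed by the `parseDateTimeBestEffort` function, so you can probe interactively what the `best_effort` mode would accept. The inputs below are our own illustrative examples:

```sql
-- Illustrative only: these inputs are made up, but parseDateTimeBestEffort()
-- applies the same relaxed rules that --date_time_input_format best_effort enables.
SELECT parseDateTimeBestEffort('2000-01-01 01:02');         -- ISO-8601 without seconds
SELECT parseDateTimeBestEffort('2021-07-04T12:30:00+0300'); -- ISO-8601 with a timezone offset
```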
-## Denormalize the Data
+## Denormalize the Data {#denormalize-data}

-Data is presented in multiple tables in normalized form. It means you have to perform JOINs if you want to query, e.g. dish names from menu items.
-For typical analytical tasks it is way more efficient to deal with pre-JOINed data to avoid doing JOIN every time. It is called "denormalized" data.
+Data is presented in multiple tables in [normalized form](https://en.wikipedia.org/wiki/Database_normalization#Normal_forms). It means you have to perform [JOIN](../../sql-reference/statements/select/join.md#select-join) if you want to query, e.g. dish names from menu items.
+For typical analytical tasks it is way more efficient to deal with pre-JOINed data to avoid doing `JOIN` every time. It is called "denormalized" data.

-We will create a table that will contain all the data JOINed together:
+We will create a table `menu_item_denorm` which will contain all the data JOINed together:

-```
+```sql
 CREATE TABLE menu_item_denorm
 ENGINE = MergeTree ORDER BY (dish_name, created_at)
 AS SELECT

@@ -171,21 +173,32 @@ AS SELECT
 FROM menu_item
 JOIN dish ON menu_item.dish_id = dish.id
 JOIN menu_page ON menu_item.menu_page_id = menu_page.id
-JOIN menu ON menu_page.menu_id = menu.id
+JOIN menu ON menu_page.menu_id = menu.id;
 ```

-## Validate the Data
+## Validate the Data {#validate-data}

-```
-SELECT count() FROM menu_item_denorm
-1329175
-```
+Query:
+
+```sql
+SELECT count() FROM menu_item_denorm;
+```

-## Run Some Queries
-
-Averaged historical prices of dishes:
+Result:
+
+```text
+┌─count()─┐
+│ 1329175 │
+└─────────┘
+```
+
+## Run Some Queries {#run-queries}
+
+### Averaged historical prices of dishes {#query-averaged-historical-prices}
+
+Query:

-```
+```sql
 SELECT
     round(toUInt32OrZero(extract(menu_date, '^\\d{4}')), -1) AS d,
     count(),

@@ -194,8 +207,12 @@ SELECT
 FROM menu_item_denorm
 WHERE (menu_currency = 'Dollars') AND (d > 0) AND (d < 2022)
 GROUP BY d
-ORDER BY d ASC
+ORDER BY d ASC;
 ```
+
+Result:
+
+```text
 ┌────d─┬─count()─┬─round(avg(price), 2)─┬─bar(avg(price), 0, 100, 100)─┐
 │ 1850 │ 618 │ 1.5 │ █▍ │
 │ 1860 │ 1634 │ 1.29 │ █▎ │

@@ -215,15 +232,15 @@
 │ 2000 │ 2467 │ 11.85 │ ███████████▋ │
 │ 2010 │ 597 │ 25.66 │ █████████████████████████▋ │
 └──────┴─────────┴──────────────────────┴──────────────────────────────┘

 17 rows in set. Elapsed: 0.044 sec. Processed 1.33 million rows, 54.62 MB (30.00 million rows/s., 1.23 GB/s.)
 ```

 Take it with a grain of salt.

-### Burger Prices:
+### Burger Prices {#query-burger-prices}

+Query:
+
-```
+```sql
 SELECT
     round(toUInt32OrZero(extract(menu_date, '^\\d{4}')), -1) AS d,
     count(),

@@ -232,8 +249,12 @@ SELECT
 FROM menu_item_denorm
 WHERE (menu_currency = 'Dollars') AND (d > 0) AND (d < 2022) AND (dish_name ILIKE '%burger%')
 GROUP BY d
-ORDER BY d ASC
+ORDER BY d ASC;
 ```
+
+Result:
+
+```text
 ┌────d─┬─count()─┬─round(avg(price), 2)─┬─bar(avg(price), 0, 50, 100)───────────┐
 │ 1880 │ 2 │ 0.42 │ ▋ │
 │ 1890 │ 7 │ 0.85 │ █▋ │

@@ -250,13 +271,13 @@
 │ 2000 │ 21 │ 7.14 │ ██████████████▎ │
 │ 2010 │ 6 │ 18.42 │ ████████████████████████████████████▋ │
 └──────┴─────────┴──────────────────────┴───────────────────────────────────────┘

 14 rows in set. Elapsed: 0.052 sec. Processed 1.33 million rows, 94.15 MB (25.48 million rows/s., 1.80 GB/s.)
 ```

-### Vodka:
+### Vodka {#query-vodka}

+Query:
+
-```
+```sql
 SELECT
     round(toUInt32OrZero(extract(menu_date, '^\\d{4}')), -1) AS d,
     count(),

@@ -265,8 +286,12 @@ SELECT
 FROM menu_item_denorm
 WHERE (menu_currency IN ('Dollars', '')) AND (d > 0) AND (d < 2022) AND (dish_name ILIKE '%vodka%')
 GROUP BY d
-ORDER BY d ASC
+ORDER BY d ASC;
 ```
+
+Result:
+
+```text
 ┌────d─┬─count()─┬─round(avg(price), 2)─┬─bar(avg(price), 0, 50, 100)─┐
 │ 1910 │ 2 │ 0 │ │
 │ 1920 │ 1 │ 0.3 │ ▌ │

@@ -282,11 +307,13 @@ ORDER BY d ASC

 To get vodka we have to write `ILIKE '%vodka%'` and this definitely makes a statement.
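Since dish names in the archive are capitalized inconsistently, the case-insensitive `ILIKE` operator is what actually does the matching work here. A tiny self-contained check (our own example, not from the dataset) shows the difference from plain `LIKE`:

```sql
-- Illustrative only: ILIKE matches case-insensitively, LIKE does not.
SELECT
    'Imported VODKA' ILIKE '%vodka%' AS ilike_match, -- 1
    'Imported VODKA' LIKE '%vodka%' AS like_match;   -- 0
```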
-### Caviar:
+### Caviar {#query-caviar}

 Let's print caviar prices. Also let's print a name of any dish with caviar.

-```
+Query:
+
+```sql
 SELECT
     round(toUInt32OrZero(extract(menu_date, '^\\d{4}')), -1) AS d,
     count(),

@@ -296,8 +323,12 @@ SELECT
 FROM menu_item_denorm
 WHERE (menu_currency IN ('Dollars', '')) AND (d > 0) AND (d < 2022) AND (dish_name ILIKE '%caviar%')
 GROUP BY d
-ORDER BY d ASC
+ORDER BY d ASC;
 ```
+
+Result:
+
+```text
 ┌────d─┬─count()─┬─round(avg(price), 2)─┬─bar(avg(price), 0, 50, 100)──────┬─any(dish_name)──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
 │ 1090 │ 1 │ 0 │ │ Caviar │
 │ 1880 │ 3 │ 0 │ │ Caviar │

@@ -319,6 +350,6 @@

 At least they have caviar with vodka. Very nice.

-### Test it in Playground
+## Online Playground {#playground}

 The data is uploaded to ClickHouse Playground, [example](https://gh-api.clickhouse.tech/play?user=play#U0VMRUNUCiAgICByb3VuZCh0b1VJbnQzMk9yWmVybyhleHRyYWN0KG1lbnVfZGF0ZSwgJ15cXGR7NH0nKSksIC0xKSBBUyBkLAogICAgY291bnQoKSwKICAgIHJvdW5kKGF2ZyhwcmljZSksIDIpLAogICAgYmFyKGF2ZyhwcmljZSksIDAsIDUwLCAxMDApLAogICAgYW55KGRpc2hfbmFtZSkKRlJPTSBtZW51X2l0ZW1fZGVub3JtCldIRVJFIChtZW51X2N1cnJlbmN5IElOICgnRG9sbGFycycsICcnKSkgQU5EIChkID4gMCkgQU5EIChkIDwgMjAyMikgQU5EIChkaXNoX25hbWUgSUxJS0UgJyVjYXZpYXIlJykKR1JPVVAgQlkgZApPUkRFUiBCWSBkIEFTQw==).
@@ -42,7 +42,11 @@ md5sum hits_v1.tsv
 # Checksum should be equal to: f3631b6295bf06989c1437491f7592cb
 # now create table
 clickhouse-client --query "CREATE DATABASE IF NOT EXISTS datasets"
+# for hits_v1
 clickhouse-client --query "CREATE TABLE datasets.hits_v1 ( WatchID UInt64, JavaEnable UInt8, Title String, GoodEvent Int16, EventTime DateTime, EventDate Date, CounterID UInt32, ClientIP UInt32, ClientIP6 FixedString(16), RegionID UInt32, UserID UInt64, CounterClass Int8, OS UInt8, UserAgent UInt8, URL String, Referer String, URLDomain String, RefererDomain String, Refresh UInt8, IsRobot UInt8, RefererCategories Array(UInt16), URLCategories Array(UInt16), URLRegions Array(UInt32), RefererRegions Array(UInt32), ResolutionWidth UInt16, ResolutionHeight UInt16, ResolutionDepth UInt8, FlashMajor UInt8, FlashMinor UInt8, FlashMinor2 String, NetMajor UInt8, NetMinor UInt8, UserAgentMajor UInt16, UserAgentMinor FixedString(2), CookieEnable UInt8, JavascriptEnable UInt8, IsMobile UInt8, MobilePhone UInt8, MobilePhoneModel String, Params String, IPNetworkID UInt32, TraficSourceID Int8, SearchEngineID UInt16, SearchPhrase String, AdvEngineID UInt8, IsArtifical UInt8, WindowClientWidth UInt16, WindowClientHeight UInt16, ClientTimeZone Int16, ClientEventTime DateTime, SilverlightVersion1 UInt8, SilverlightVersion2 UInt8, SilverlightVersion3 UInt32, SilverlightVersion4 UInt16, PageCharset String, CodeVersion UInt32, IsLink UInt8, IsDownload UInt8, IsNotBounce UInt8, FUniqID UInt64, HID UInt32, IsOldCounter UInt8, IsEvent UInt8, IsParameter UInt8, DontCountHits UInt8, WithHash UInt8, HitColor FixedString(1), UTCEventTime DateTime, Age UInt8, Sex UInt8, Income UInt8, Interests UInt16, Robotness UInt8, GeneralInterests Array(UInt16), RemoteIP UInt32, RemoteIP6 FixedString(16), WindowName Int32, OpenerName Int32, HistoryLength Int16, BrowserLanguage FixedString(2), BrowserCountry FixedString(2), SocialNetwork String, SocialAction String, HTTPError UInt16, SendTiming Int32, DNSTiming Int32, ConnectTiming Int32, ResponseStartTiming Int32, ResponseEndTiming Int32, FetchTiming Int32, RedirectTiming Int32, DOMInteractiveTiming Int32, DOMContentLoadedTiming Int32, DOMCompleteTiming Int32, LoadEventStartTiming Int32, LoadEventEndTiming Int32, NSToDOMContentLoadedTiming Int32, FirstPaintTiming Int32, RedirectCount Int8, SocialSourceNetworkID UInt8, SocialSourcePage String, ParamPrice Int64, ParamOrderID String, ParamCurrency FixedString(3), ParamCurrencyID UInt16, GoalsReached Array(UInt32), OpenstatServiceName String, OpenstatCampaignID String, OpenstatAdID String, OpenstatSourceID String, UTMSource String, UTMMedium String, UTMCampaign String, UTMContent String, UTMTerm String, FromTag String, HasGCLID UInt8, RefererHash UInt64, URLHash UInt64, CLID UInt32, YCLID UInt64, ShareService String, ShareURL String, ShareTitle String, ParsedParams Nested(Key1 String, Key2 String, Key3 String, Key4 String, Key5 String, ValueDouble Float64), IslandID FixedString(16), RequestNum UInt32, RequestTry UInt8) ENGINE = MergeTree() PARTITION BY toYYYYMM(EventDate) ORDER BY (CounterID, EventDate, intHash32(UserID)) SAMPLE BY intHash32(UserID) SETTINGS index_granularity = 8192"
+# for hits_100m_obfuscated
+clickhouse-client --query="CREATE TABLE hits_100m_obfuscated (WatchID UInt64, JavaEnable UInt8, Title String, GoodEvent Int16, EventTime DateTime, EventDate Date, CounterID UInt32, ClientIP UInt32, RegionID UInt32, UserID UInt64, CounterClass Int8, OS UInt8, UserAgent UInt8, URL String, Referer String, Refresh UInt8, RefererCategoryID UInt16, RefererRegionID UInt32, URLCategoryID UInt16, URLRegionID UInt32, ResolutionWidth UInt16, ResolutionHeight UInt16, ResolutionDepth UInt8, FlashMajor UInt8, FlashMinor UInt8, FlashMinor2 String, NetMajor UInt8, NetMinor UInt8, UserAgentMajor UInt16, UserAgentMinor FixedString(2), CookieEnable UInt8, JavascriptEnable UInt8, IsMobile UInt8, MobilePhone UInt8, MobilePhoneModel String, Params String, IPNetworkID UInt32, TraficSourceID Int8, SearchEngineID UInt16, SearchPhrase String, AdvEngineID UInt8, IsArtifical UInt8, WindowClientWidth UInt16, WindowClientHeight UInt16, ClientTimeZone Int16, ClientEventTime DateTime, SilverlightVersion1 UInt8, SilverlightVersion2 UInt8, SilverlightVersion3 UInt32, SilverlightVersion4 UInt16, PageCharset String, CodeVersion UInt32, IsLink UInt8, IsDownload UInt8, IsNotBounce UInt8, FUniqID UInt64, OriginalURL String, HID UInt32, IsOldCounter UInt8, IsEvent UInt8, IsParameter UInt8, DontCountHits UInt8, WithHash UInt8, HitColor FixedString(1), LocalEventTime DateTime, Age UInt8, Sex UInt8, Income UInt8, Interests UInt16, Robotness UInt8, RemoteIP UInt32, WindowName Int32, OpenerName Int32, HistoryLength Int16, BrowserLanguage FixedString(2), BrowserCountry FixedString(2), SocialNetwork String, SocialAction String, HTTPError UInt16, SendTiming UInt32, DNSTiming UInt32, ConnectTiming UInt32, ResponseStartTiming UInt32, ResponseEndTiming UInt32, FetchTiming UInt32, SocialSourceNetworkID UInt8, SocialSourcePage String, ParamPrice Int64, ParamOrderID String, ParamCurrency FixedString(3), ParamCurrencyID UInt16, OpenstatServiceName String, OpenstatCampaignID String, OpenstatAdID String, OpenstatSourceID String, UTMSource String, UTMMedium String, UTMCampaign String, UTMContent String, UTMTerm String, FromTag String, HasGCLID UInt8, RefererHash UInt64, URLHash UInt64, CLID UInt32) ENGINE = MergeTree() PARTITION BY toYYYYMM(EventDate) ORDER BY (CounterID, EventDate, intHash32(UserID)) SAMPLE BY intHash32(UserID) SETTINGS index_granularity = 8192"
+
 # import data
 cat hits_v1.tsv | clickhouse-client --query "INSERT INTO datasets.hits_v1 FORMAT TSV" --max_insert_block_size=100000
 # optionally you can optimize table
docs/en/getting-started/example-datasets/opensky.md

@@ -3,7 +3,7 @@ toc_priority: 20
 toc_title: OpenSky
 ---

-# Crowdsourced air traffic data from The OpenSky Network 2020
+# Crowdsourced air traffic data from The OpenSky Network 2020 {#opensky}

 "The data in this dataset is derived and cleaned from the full OpenSky dataset to illustrate the development of air traffic during the COVID-19 pandemic. It spans all flights seen by the network's more than 2500 members since 1 January 2019. More data will be periodically included in the dataset until the end of the COVID-19 pandemic".

@@ -14,17 +14,19 @@ Martin Strohmeier, Xavier Olive, Jannis Lübbe, Matthias Schäfer, and Vincent L
 Earth System Science Data 13(2), 2021
 https://doi.org/10.5194/essd-13-357-2021

-## Download the Dataset
+## Download the Dataset {#download-dataset}

+Run the command:
+
-```
+```bash
 wget -O- https://zenodo.org/record/5092942 | grep -oP 'https://zenodo.org/record/5092942/files/flightlist_\d+_\d+\.csv\.gz' | xargs wget
 ```

 Download will take about 2 minutes with good internet connection. There are 30 files with total size of 4.3 GB.

-## Create the Table
+## Create the Table {#create-table}

-```
+```sql
 CREATE TABLE opensky
 (
     callsign String,

@@ -46,69 +48,101 @@ CREATE TABLE opensky
 ) ENGINE = MergeTree ORDER BY (origin, destination, callsign);
 ```

-## Import Data
+## Import Data {#import-data}

 Upload data into ClickHouse in parallel:

-```
-ls -1 flightlist_*.csv.gz | xargs -P100 -I{} bash -c '
-    gzip -c -d "{}" | clickhouse-client --date_time_input_format best_effort --query "INSERT INTO opensky FORMAT CSVWithNames"'
+```bash
+ls -1 flightlist_*.csv.gz | xargs -P100 -I{} bash -c 'gzip -c -d "{}" | clickhouse-client --date_time_input_format best_effort --query "INSERT INTO opensky FORMAT CSVWithNames"'
 ```

-Here we pass the list of files (`ls -1 flightlist_*.csv.gz`) to `xargs` for parallel processing.
-`xargs -P100` specifies to use up to 100 parallel workers but as we only have 30 files, the number of workers will be only 30.
-
-For every file, `xargs` will run a script with `bash -c`. The script has substitution in form of `{}` and the `xargs` command will substitute the filename to it (we have asked it for xargs with `-I{}`).
-
-The script will decompress the file (`gzip -c -d "{}"`) to standard output (`-c` parameter) and the output is redirected to `clickhouse-client`.
-
-Finally, `clickhouse-client` will do insertion. It will read input data in `CSVWithNames` format. We also asked to parse DateTime fields with extended parser (`--date_time_input_format best_effort`) to recognize ISO-8601 format with timezone offsets.
+- Here we pass the list of files (`ls -1 flightlist_*.csv.gz`) to `xargs` for parallel processing. `xargs -P100` specifies to use up to 100 parallel workers but as we only have 30 files, the number of workers will be only 30.
+- For every file, `xargs` will run a script with `bash -c`. The script has substitution in form of `{}` and the `xargs` command will substitute the filename to it (we have asked it for `xargs` with `-I{}`).
+- The script will decompress the file (`gzip -c -d "{}"`) to standard output (`-c` parameter) and the output is redirected to `clickhouse-client`.
+- We also asked to parse [DateTime](../../sql-reference/data-types/datetime.md) fields with extended parser ([--date_time_input_format best_effort](../../operations/settings/settings.md#settings-date_time_input_format)) to recognize ISO-8601 format with timezone offsets.
+
+Finally, `clickhouse-client` will do insertion. It will read input data in [CSVWithNames](../../interfaces/formats.md#csvwithnames) format.

 Parallel upload takes 24 seconds.

 If you don't like parallel upload, here is sequential variant:

-```
+```bash
 for file in flightlist_*.csv.gz; do gzip -c -d "$file" | clickhouse-client --date_time_input_format best_effort --query "INSERT INTO opensky FORMAT CSVWithNames"; done
 ```

-## Validate the Data
+## Validate the Data {#validate-data}

-```
-SELECT count() FROM opensky
-66010819
-```
+Query:
+
+```sql
+SELECT count() FROM opensky;
+```

-The size of dataset in ClickHouse is just 2.64 GiB:
-
-```
-SELECT formatReadableSize(total_bytes) FROM system.tables WHERE name = 'opensky'
-2.64 GiB
-```
+Result:
+
+```text
+┌──count()─┐
+│ 66010819 │
+└──────────┘
+```
+
+The size of the dataset in ClickHouse is just 2.66 GiB; check it.
+
+Query:
+
+```sql
+SELECT formatReadableSize(total_bytes) FROM system.tables WHERE name = 'opensky';
+```
+
+Result:
+
+```text
+┌─formatReadableSize(total_bytes)─┐
+│ 2.66 GiB │
+└─────────────────────────────────┘
+```

-## Run Some Queries
+## Run Some Queries {#run-queries}

-Total distance travelled is 68 billion kilometers:
+Total distance travelled is 68 billion kilometers.

-```
-SELECT formatReadableQuantity(sum(geoDistance(longitude_1, latitude_1, longitude_2, latitude_2)) / 1000) FROM opensky
+Query:
+
+```sql
+SELECT formatReadableQuantity(sum(geoDistance(longitude_1, latitude_1, longitude_2, latitude_2)) / 1000) FROM opensky;
 ```

+Result:
+
 ```text
 ┌─formatReadableQuantity(divide(sum(geoDistance(longitude_1, latitude_1, longitude_2, latitude_2)), 1000))─┐
 │ 68.72 billion │
 └──────────────────────────────────────────────────────────────────────────────────────────────────────────┘
 ```

 Average flight distance is around 1000 km.
-```
-SELECT avg(geoDistance(longitude_1, latitude_1, longitude_2, latitude_2)) FROM opensky
+
+Query:
+
+```sql
+SELECT avg(geoDistance(longitude_1, latitude_1, longitude_2, latitude_2)) FROM opensky;
 ```

+Result:
+
 ```text
 ┌─avg(geoDistance(longitude_1, latitude_1, longitude_2, latitude_2))─┐
 │ 1041090.6465708319 │
 └────────────────────────────────────────────────────────────────────┘
 ```

-### Most busy origin airports and the average distance seen:
+### Most busy origin airports and the average distance seen {#busy-airports-average-distance}

-```
+Query:
+
+```sql
 SELECT
     origin,
     count(),

@@ -118,10 +152,12 @@ FROM opensky
 WHERE origin != ''
 GROUP BY origin
 ORDER BY count() DESC
-LIMIT 100
+LIMIT 100;
 ```

-Query id: f9010ea5-97d0-45a3-a5bd-9657906cd105
+Result:
+
+```text
 ┌─origin─┬─count()─┬─distance─┬─bar────────────────────────────────────┐
 1. │ KORD │ 745007 │ 1546108 │ ███████████████▍ │
 2. │ KDFW │ 696702 │ 1358721 │ █████████████▌ │

@@ -224,13 +260,13 @@
 99. │ EDDT │ 115122 │ 941740 │ █████████▍ │
 100. │ EFHK │ 114860 │ 1629143 │ ████████████████▎ │
 └────────┴─────────┴──────────┴────────────────────────────────────────┘

 100 rows in set. Elapsed: 0.186 sec. Processed 48.31 million rows, 2.17 GB (259.27 million rows/s., 11.67 GB/s.)
 ```

-### Number of flights from three major Moscow airports, weekly:
+### Number of flights from three major Moscow airports, weekly {#flights-from-moscow}

-```
+Query:
+
+```sql
 SELECT
     toMonday(day) AS k,
     count() AS c,

@@ -238,10 +274,12 @@ SELECT
 FROM opensky
 WHERE origin IN ('UUEE', 'UUDD', 'UUWW')
 GROUP BY k
-ORDER BY k ASC
+ORDER BY k ASC;
 ```

-Query id: 1b446157-9519-4cc4-a1cb-178dfcc15a8e
+Result:
+
+```text
 ┌──────────k─┬────c─┬─bar──────────────────────────────────────────────────────────────────────────┐
 1. │ 2018-12-31 │ 5248 │ ████████████████████████████████████████████████████▍ │
 2. │ 2019-01-07 │ 6302 │ ███████████████████████████████████████████████████████████████ │

@@ -375,10 +413,8 @@
 130. │ 2021-06-21 │ 6061 │ ████████████████████████████████████████████████████████████▌ │
 131. │ 2021-06-28 │ 2554 │ █████████████████████████▌ │
 └────────────┴──────┴──────────────────────────────────────────────────────────────────────────────┘
-
-131 rows in set. Elapsed: 0.014 sec. Processed 655.36 thousand rows, 11.14 MB (47.56 million rows/s., 808.48 MB/s.)
 ```

-### Test it in Playground
+### Online Playground {#playground}

-The data is uploaded to ClickHouse Playground, [example](https://gh-api.clickhouse.tech/play?user=play#U0VMRUNUCiAgICBvcmlnaW4sCiAgICBjb3VudCgpLAogICAgcm91bmQoYXZnKGdlb0Rpc3RhbmNlKGxvbmdpdHVkZV8xLCBsYXRpdHVkZV8xLCBsb25naXR1ZGVfMiwgbGF0aXR1ZGVfMikpKSBBUyBkaXN0YW5jZSwKICAgIGJhcihkaXN0YW5jZSwgMCwgMTAwMDAwMDAsIDEwMCkgQVMgYmFyCkZST00gb3BlbnNreQpXSEVSRSBvcmlnaW4gIT0gJycKR1JPVVAgQlkgb3JpZ2luCk9SREVSIEJZIGNvdW50KCkgREVTQwpMSU1JVCAxMDA=).
+You can test other queries to this data set using the interactive resource [Online Playground](https://gh-api.clickhouse.tech/play?user=play). For example, [like this](https://gh-api.clickhouse.tech/play?user=play#U0VMRUNUCiAgICBvcmlnaW4sCiAgICBjb3VudCgpLAogICAgcm91bmQoYXZnKGdlb0Rpc3RhbmNlKGxvbmdpdHVkZV8xLCBsYXRpdHVkZV8xLCBsb25naXR1ZGVfMiwgbGF0aXR1ZGVfMikpKSBBUyBkaXN0YW5jZSwKICAgIGJhcihkaXN0YW5jZSwgMCwgMTAwMDAwMDAsIDEwMCkgQVMgYmFyCkZST00gb3BlbnNreQpXSEVSRSBvcmlnaW4gIT0gJycKR1JPVVAgQlkgb3JpZ2luCk9SREVSIEJZIGNvdW50KCkgREVTQwpMSU1JVCAxMDA=). However, please note that you cannot create temporary tables here.
docs/en/getting-started/example-datasets/uk-price-paid.md

@@ -3,27 +3,29 @@ toc_priority: 20
 toc_title: UK Property Price Paid
 ---

-# UK Property Price Paid
+# UK Property Price Paid {#uk-property-price-paid}

 The dataset contains data about prices paid for real-estate property in England and Wales. The data is available since year 1995.
-The size of the dataset in uncompressed form is about 4 GiB and it will take about 226 MiB in ClickHouse.
+The size of the dataset in uncompressed form is about 4 GiB and it will take about 278 MiB in ClickHouse.

 Source: https://www.gov.uk/government/statistical-data-sets/price-paid-data-downloads
 Description of the fields: https://www.gov.uk/guidance/about-the-price-paid-data

 Contains HM Land Registry data © Crown copyright and database right 2021. This data is licensed under the Open Government Licence v3.0.

-## Download the Dataset
+## Download the Dataset {#download-dataset}

+Run the command:
+
-```
+```bash
 wget http://prod.publicdata.landregistry.gov.uk.s3-website-eu-west-1.amazonaws.com/pp-complete.csv
 ```

 Download will take about 2 minutes with good internet connection.

-## Create the Table
+## Create the Table {#create-table}

-```
+```sql
 CREATE TABLE uk_price_paid
 (
     price UInt32,

@@ -44,7 +46,7 @@ CREATE TABLE uk_price_paid
 ) ENGINE = MergeTree ORDER BY (postcode1, postcode2, addr1, addr2);
 ```

-## Preprocess and Import Data
+## Preprocess and Import Data {#preprocess-import-data}

 We will use the `clickhouse-local` tool for data preprocessing and `clickhouse-client` to upload it.

@@ -53,13 +55,13 @@ In this example, we define the structure of source data from the CSV file and sp
 The preprocessing is:
 - splitting the postcode to two different columns `postcode1` and `postcode2` that is better for storage and queries;
 - converting the `time` field to date as it only contains 00:00 time;
-- ignoring the `uuid` field because we don't need it for analysis;
-- transforming `type` and `duration` to more readable Enum fields with function `transform`;
-- transforming `is_new` and `category` fields from single-character string (`Y`/`N` and `A`/`B`) to UInt8 field with 0 and 1.
+- ignoring the [UUid](../../sql-reference/data-types/uuid.md) field because we don't need it for analysis;
+- transforming `type` and `duration` to more readable Enum fields with function [transform](../../sql-reference/functions/other-functions.md#transform);
+- transforming `is_new` and `category` fields from single-character string (`Y`/`N` and `A`/`B`) to [UInt8](../../sql-reference/data-types/int-uint.md#uint8-uint16-uint32-uint64-uint256-int8-int16-int32-int64-int128-int256) field with 0 and 1.
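The `transform` mapping mentioned in the list above can be sketched as follows. This is our own minimal illustration, not the exact expression from the preprocessing command (most of which is elided in this diff); the letter codes and the readable names are assumptions based on the field descriptions:

```sql
-- Illustrative sketch only: map single-letter codes to readable names and
-- turn a 'Y'/'N' flag into UInt8; the concrete code lists are assumed here.
SELECT
    transform('D',
              ['T', 'S', 'D', 'F', 'O'],
              ['terraced', 'semi-detached', 'detached', 'flat', 'other']) AS type,
    'Y' = 'Y' AS is_new; -- a string comparison yields UInt8 0 or 1
```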
Preprocessed data is piped directly to `clickhouse-client` to be inserted into ClickHouse table in streaming fashion.
|
||||
|
||||
```
|
||||
```bash
|
||||
clickhouse-local --input-format CSV --structure '
|
||||
uuid String,
|
||||
price UInt32,
|
||||
@ -100,103 +102,131 @@ clickhouse-local --input-format CSV --structure '
|
||||
|
||||
It will take about 40 seconds.
|
||||
|
||||
## Validate the Data
|
||||
## Validate the Data {#validate-data}
|
||||
|
||||
```
|
||||
SELECT count() FROM uk_price_paid
|
||||
26248711
|
||||
Query:
|
||||
|
||||
```sql
|
||||
SELECT count() FROM uk_price_paid;
|
||||
```
|
||||
|
||||
The size of dataset in ClickHouse is just 226 MiB:
|
||||
Result:
|
||||
|
||||
```
|
||||
SELECT formatReadableSize(total_bytes) FROM system.tables WHERE name = 'uk_price_paid'
|
||||
226.40 MiB
|
||||
```text
|
||||
┌──count()─┐
|
||||
│ 26321785 │
|
||||
└──────────┘
|
||||
```
|
||||
|
||||
## Run Some Queries
|
||||
The size of dataset in ClickHouse is just 278 MiB, check it.
|
||||
|
||||
### Average price per year:
|
||||
Query:
|
||||
|
||||
```sql
|
||||
SELECT formatReadableSize(total_bytes) FROM system.tables WHERE name = 'uk_price_paid';
|
||||
```
|
||||
SELECT toYear(date) AS year, round(avg(price)) AS price, bar(price, 0, 1000000, 80) FROM uk_price_paid GROUP BY year ORDER BY year
|
||||
|
||||
Result:
|
||||
|
||||
```text
|
||||
┌─formatReadableSize(total_bytes)─┐
|
||||
│ 278.80 MiB │
|
||||
└─────────────────────────────────┘
|
||||
```
|
||||
|
||||
## Run Some Queries {#run-queries}
|
||||
|
||||
### Query 1. Average Price Per Year {#average-price}
|
||||
|
||||
Query:
|
||||
|
||||
```sql
|
||||
SELECT toYear(date) AS year, round(avg(price)) AS price, bar(price, 0, 1000000, 80) FROM uk_price_paid GROUP BY year ORDER BY year;
|
||||
```
|
||||
|
||||
Result:
|
||||
|
||||
```text
|
||||
┌─year─┬──price─┬─bar(round(avg(price)), 0, 1000000, 80)─┐
|
||||
│ 1995 │ 67932 │ █████▍ │
|
||||
│ 1996 │ 71505 │ █████▋ │
|
||||
│ 1997 │ 78532 │ ██████▎ │
|
||||
│ 1998 │ 85435 │ ██████▋ │
|
||||
│ 1999 │ 96036 │ ███████▋ │
|
||||
│ 2000 │ 107478 │ ████████▌ │
|
||||
│ 2001 │ 118886 │ █████████▌ │
|
||||
│ 2002 │ 137940 │ ███████████ │
|
||||
│ 2003 │ 155888 │ ████████████▍ │
|
||||
│ 1998 │ 85436 │ ██████▋ │
|
||||
│ 1999 │ 96037 │ ███████▋ │
|
||||
│ 2000 │ 107479 │ ████████▌ │
|
||||
│ 2001 │ 118885 │ █████████▌ │
|
||||
│ 2002 │ 137941 │ ███████████ │
|
||||
│ 2003 │ 155889 │ ████████████▍ │
|
||||
│ 2004 │ 178885 │ ██████████████▎ │
|
||||
│ 2005 │ 189350 │ ███████████████▏ │
|
||||
│ 2005 │ 189351 │ ███████████████▏ │
|
||||
│ 2006 │ 203528 │ ████████████████▎ │
|
||||
│ 2007 │ 219377 │ █████████████████▌ │
|
||||
│ 2007 │ 219378 │ █████████████████▌ │
|
||||
│ 2008 │ 217056 │ █████████████████▎ │
|
||||
│ 2009 │ 213419 │ █████████████████ │
|
||||
│ 2010 │ 236110 │ ██████████████████▊ │
|
||||
│ 2011 │ 232804 │ ██████████████████▌ │
|
||||
│ 2012 │ 238366 │ ███████████████████ │
|
||||
│ 2010 │ 236109 │ ██████████████████▊ │
|
||||
│ 2011 │ 232805 │ ██████████████████▌ │
|
||||
│ 2012 │ 238367 │ ███████████████████ │
|
||||
│ 2013 │ 256931 │ ████████████████████▌ │
|
||||
│ 2014 │ 279917 │ ██████████████████████▍ │
|
||||
│ 2015 │ 297264 │ ███████████████████████▋ │
|
||||
│ 2016 │ 313197 │ █████████████████████████ │
|
||||
│ 2017 │ 346070 │ ███████████████████████████▋ │
|
||||
│ 2018 │ 350117 │ ████████████████████████████ │
|
||||
│ 2019 │ 351010 │ ████████████████████████████ │
|
||||
│ 2020 │ 368974 │ █████████████████████████████▌ │
|
||||
│ 2021 │ 384351 │ ██████████████████████████████▋ │
|
||||
│ 2014 │ 279915 │ ██████████████████████▍ │
|
||||
│ 2015 │ 297266 │ ███████████████████████▋ │
|
||||
│ 2016 │ 313201 │ █████████████████████████ │
|
||||
│ 2017 │ 346097 │ ███████████████████████████▋ │
|
||||
│ 2018 │ 350116 │ ████████████████████████████ │
|
||||
│ 2019 │ 351013 │ ████████████████████████████ │
|
||||
│ 2020 │ 369420 │ █████████████████████████████▌ │
|
||||
│ 2021 │ 386903 │ ██████████████████████████████▊ │
|
||||
└──────┴────────┴────────────────────────────────────────┘
|
||||
|
||||
27 rows in set. Elapsed: 0.027 sec. Processed 26.25 million rows, 157.49 MB (955.96 million rows/s., 5.74 GB/s.)
|
||||
```
|
||||
|
||||
### Query 2. Average Price per Year in London {#average-price-london}

Query:

```sql
SELECT toYear(date) AS year, round(avg(price)) AS price, bar(price, 0, 2000000, 100) FROM uk_price_paid WHERE town = 'LONDON' GROUP BY year ORDER BY year;
```

Result:

```text
┌─year─┬───price─┬─bar(round(avg(price)), 0, 2000000, 100)───────────────┐
│ 1995 │  109116 │ █████▍                                                │
│ 1996 │  118667 │ █████▊                                                │
│ 1997 │  136518 │ ██████▋                                               │
│ 1998 │  152983 │ ███████▋                                              │
│ 1999 │  180637 │ █████████                                             │
│ 2000 │  215838 │ ██████████▋                                           │
│ 2001 │  232994 │ ███████████▋                                          │
│ 2002 │  263670 │ █████████████▏                                        │
│ 2003 │  278394 │ █████████████▊                                        │
│ 2004 │  304666 │ ███████████████▏                                      │
│ 2005 │  322875 │ ████████████████▏                                     │
│ 2006 │  356191 │ █████████████████▋                                    │
│ 2007 │  404054 │ ████████████████████▏                                 │
│ 2008 │  420741 │ █████████████████████                                 │
│ 2009 │  427753 │ █████████████████████▍                                │
│ 2010 │  480306 │ ████████████████████████                              │
│ 2011 │  496274 │ ████████████████████████▋                             │
│ 2012 │  519442 │ █████████████████████████▊                            │
│ 2013 │  616212 │ ██████████████████████████████▋                       │
│ 2014 │  724154 │ ████████████████████████████████████▏                 │
│ 2015 │  792129 │ ███████████████████████████████████████▌              │
│ 2016 │  843655 │ ██████████████████████████████████████████▏           │
│ 2017 │  982642 │ █████████████████████████████████████████████████▏    │
│ 2018 │ 1016835 │ ██████████████████████████████████████████████████▋   │
│ 2019 │ 1042849 │ ████████████████████████████████████████████████████▏ │
│ 2020 │ 1011889 │ ██████████████████████████████████████████████████▌   │
│ 2021 │  960343 │ ████████████████████████████████████████████████      │
└──────┴─────────┴───────────────────────────────────────────────────────┘
```

Something happened in 2013. I don't have a clue. Maybe you have a clue what happened in 2020?

### Query 3. The Most Expensive Neighborhoods {#most-expensive-neighborhoods}

Query:

```sql
SELECT
    town,
    district,
    count() AS c,
    round(avg(price)) AS price,
    bar(price, 0, 5000000, 100)
FROM uk_price_paid
WHERE date >= '2020-01-01'
GROUP BY
    town,
    district
HAVING c >= 100
ORDER BY price DESC
LIMIT 100;
```

Result:

```text
┌─town─────────────────┬─district───────────────┬────c─┬───price─┬─bar(round(avg(price)), 0, 5000000, 100)────────────────────────────┐
│ LONDON               │ CITY OF WESTMINSTER    │ 3606 │ 3280239 │ █████████████████████████████████████████████████████████████████▌ │
│ LONDON               │ CITY OF LONDON         │  274 │ 3160502 │ ███████████████████████████████████████████████████████████████▏   │
│ LONDON               │ KENSINGTON AND CHELSEA │ 2550 │ 2308478 │ ██████████████████████████████████████████████▏                    │
│ LEATHERHEAD          │ ELMBRIDGE              │  114 │ 1897407 │ █████████████████████████████████████▊                             │
│ LONDON               │ CAMDEN                 │ 3033 │ 1805404 │ ████████████████████████████████████                                │
│ VIRGINIA WATER       │ RUNNYMEDE              │  156 │ 1753247 │ ███████████████████████████████████                                 │
│ WINDLESHAM           │ SURREY HEATH           │  108 │ 1677613 │ █████████████████████████████████▌                                  │
│ THORNTON HEATH       │ CROYDON                │  546 │ 1671721 │ █████████████████████████████████▍                                  │
│ BARNET               │ ENFIELD                │  124 │ 1505840 │ ██████████████████████████████                                      │
│ COBHAM               │ ELMBRIDGE              │  387 │ 1237250 │ ████████████████████████▋                                           │
│ LONDON               │ ISLINGTON              │ 2668 │ 1236980 │ ████████████████████████▋                                           │
│ OXFORD               │ SOUTH OXFORDSHIRE      │  321 │ 1220907 │ ████████████████████████▍                                           │
│ LONDON               │ RICHMOND UPON THAMES   │  704 │ 1215551 │ ████████████████████████▎                                           │
│ LONDON               │ HOUNSLOW               │  671 │ 1207493 │ ████████████████████████▏                                           │
│ ASCOT                │ WINDSOR AND MAIDENHEAD │  407 │ 1183299 │ ███████████████████████▋                                            │
│ BEACONSFIELD         │ BUCKINGHAMSHIRE        │  330 │ 1175615 │ ███████████████████████▌                                            │
│ RICHMOND             │ RICHMOND UPON THAMES   │  874 │ 1110444 │ ██████████████████████▏                                             │
│ LONDON               │ HAMMERSMITH AND FULHAM │ 3086 │ 1053983 │ █████████████████████                                               │
│ SURBITON             │ ELMBRIDGE              │  100 │ 1011800 │ ████████████████████▏                                               │
│ RADLETT              │ HERTSMERE              │  283 │ 1011712 │ ████████████████████▏                                               │
│ SALCOMBE             │ SOUTH HAMS             │  127 │ 1011624 │ ████████████████████▏                                               │
│ WEYBRIDGE            │ ELMBRIDGE              │  655 │ 1007265 │ ████████████████████▏                                               │
│ ESHER                │ ELMBRIDGE              │  485 │  986581 │ ███████████████████▋                                                │
│ LEATHERHEAD          │ GUILDFORD              │  202 │  977320 │ ███████████████████▌                                                │
│ BURFORD              │ WEST OXFORDSHIRE       │  111 │  966893 │ ███████████████████▎                                                │
│ BROCKENHURST         │ NEW FOREST             │  129 │  956675 │ ███████████████████▏                                                │
│ HINDHEAD             │ WAVERLEY               │  137 │  953753 │ ███████████████████                                                 │
│ GERRARDS CROSS       │ BUCKINGHAMSHIRE        │  419 │  951121 │ ███████████████████                                                 │
│ EAST MOLESEY         │ ELMBRIDGE              │  192 │  936769 │ ██████████████████▋                                                 │
│ CHALFONT ST GILES    │ BUCKINGHAMSHIRE        │  146 │  925515 │ ██████████████████▌                                                 │
│ LONDON               │ TOWER HAMLETS          │ 4388 │  918304 │ ██████████████████▎                                                 │
│ OLNEY                │ MILTON KEYNES          │  235 │  910646 │ ██████████████████▏                                                 │
│ HENLEY-ON-THAMES     │ SOUTH OXFORDSHIRE      │  540 │  902418 │ ██████████████████                                                  │
│ LONDON               │ SOUTHWARK              │ 3885 │  892997 │ █████████████████▋                                                  │
│ KINGSTON UPON THAMES │ KINGSTON UPON THAMES   │  960 │  885969 │ █████████████████▋                                                  │
│ LONDON               │ EALING                 │ 2658 │  871755 │ █████████████████▍                                                  │
│ CRANBROOK            │ TUNBRIDGE WELLS        │  431 │  862348 │ █████████████████▏                                                  │
│ LONDON               │ MERTON                 │ 2099 │  859118 │ █████████████████▏                                                  │
│ BELVEDERE            │ BEXLEY                 │  346 │  842423 │ ████████████████▋                                                   │
│ GUILDFORD            │ WAVERLEY               │  143 │  841277 │ ████████████████▋                                                   │
│ HARPENDEN            │ ST ALBANS              │  657 │  841216 │ ████████████████▋                                                   │
│ LONDON               │ HACKNEY                │ 3307 │  837090 │ ████████████████▋                                                   │
│ LONDON               │ WANDSWORTH             │ 6566 │  832663 │ ████████████████▋                                                   │
│ MAIDENHEAD           │ BUCKINGHAMSHIRE        │  123 │  824299 │ ████████████████▍                                                   │
│ KINGS LANGLEY        │ DACORUM                │  145 │  821331 │ ████████████████▍                                                   │
│ BERKHAMSTED          │ DACORUM                │  543 │  818415 │ ████████████████▎                                                   │
│ GREAT MISSENDEN      │ BUCKINGHAMSHIRE        │  226 │  802807 │ ████████████████                                                    │
│ BILLINGSHURST        │ CHICHESTER             │  144 │  797829 │ ███████████████▊                                                    │
│ WOKING               │ GUILDFORD              │  176 │  793494 │ ███████████████▋                                                    │
│ STOCKBRIDGE          │ TEST VALLEY            │  178 │  793269 │ ███████████████▋                                                    │
│ EPSOM                │ REIGATE AND BANSTEAD   │  172 │  791862 │ ███████████████▋                                                    │
│ TONBRIDGE            │ TUNBRIDGE WELLS        │  360 │  787876 │ ███████████████▋                                                    │
│ TEDDINGTON           │ RICHMOND UPON THAMES   │  595 │  786492 │ ███████████████▋                                                    │
│ TWICKENHAM           │ RICHMOND UPON THAMES   │ 1155 │  786193 │ ███████████████▋                                                    │
│ LYNDHURST            │ NEW FOREST             │  102 │  785593 │ ███████████████▋                                                    │
│ LONDON               │ LAMBETH                │ 5228 │  774574 │ ███████████████▍                                                    │
│ LONDON               │ BARNET                 │ 3955 │  773259 │ ███████████████▍                                                    │
│ OXFORD               │ VALE OF WHITE HORSE    │  353 │  772088 │ ███████████████▍                                                    │
│ TONBRIDGE            │ MAIDSTONE              │  305 │  770740 │ ███████████████▍                                                    │
│ LUTTERWORTH          │ HARBOROUGH             │  538 │  768634 │ ███████████████▎                                                    │
│ WOODSTOCK            │ WEST OXFORDSHIRE       │  140 │  766037 │ ███████████████▎                                                    │
│ MIDHURST             │ CHICHESTER             │  257 │  764815 │ ███████████████▎                                                    │
│ MARLOW               │ BUCKINGHAMSHIRE        │  327 │  761876 │ ███████████████▏                                                    │
│ LONDON               │ NEWHAM                 │ 3237 │  761784 │ ███████████████▏                                                    │
│ ALDERLEY EDGE        │ CHESHIRE EAST          │  178 │  757318 │ ███████████████▏                                                    │
│ LUTON                │ CENTRAL BEDFORDSHIRE   │  212 │  754283 │ ███████████████                                                     │
│ PETWORTH             │ CHICHESTER             │  154 │  754220 │ ███████████████                                                     │
│ ALRESFORD            │ WINCHESTER             │  219 │  752718 │ ███████████████                                                     │
│ POTTERS BAR          │ WELWYN HATFIELD        │  174 │  748465 │ ██████████████▊                                                     │
│ HASLEMERE            │ CHICHESTER             │  128 │  746907 │ ██████████████▊                                                     │
│ TADWORTH             │ REIGATE AND BANSTEAD   │  502 │  743252 │ ██████████████▋                                                     │
│ THAMES DITTON        │ ELMBRIDGE              │  244 │  741913 │ ██████████████▋                                                     │
│ REIGATE              │ REIGATE AND BANSTEAD   │  581 │  738198 │ ██████████████▋                                                     │
│ BOURNE END           │ BUCKINGHAMSHIRE        │  138 │  735190 │ ██████████████▋                                                     │
│ SEVENOAKS            │ SEVENOAKS              │ 1156 │  730018 │ ██████████████▌                                                     │
│ OXTED                │ TANDRIDGE              │  336 │  729123 │ ██████████████▌                                                     │
│ INGATESTONE          │ BRENTWOOD              │  166 │  728103 │ ██████████████▌                                                     │
│ LONDON               │ BRENT                  │ 2079 │  720605 │ ██████████████▍                                                     │
│ LONDON               │ HARINGEY               │ 3216 │  717780 │ ██████████████▎                                                     │
│ PURLEY               │ CROYDON                │  575 │  716108 │ ██████████████▎                                                     │
│ WELWYN               │ WELWYN HATFIELD        │  222 │  710603 │ ██████████████▏                                                     │
│ RICKMANSWORTH        │ THREE RIVERS           │  798 │  704571 │ ██████████████                                                      │
│ BANSTEAD             │ REIGATE AND BANSTEAD   │  401 │  701293 │ ██████████████                                                      │
│ CHIGWELL             │ EPPING FOREST          │  261 │  701203 │ ██████████████                                                      │
│ PINNER               │ HARROW                 │  528 │  698885 │ █████████████▊                                                      │
│ HASLEMERE            │ WAVERLEY               │  280 │  696659 │ █████████████▊                                                      │
│ SLOUGH               │ BUCKINGHAMSHIRE        │  396 │  694917 │ █████████████▊                                                      │
│ WALTON-ON-THAMES     │ ELMBRIDGE              │  946 │  692395 │ █████████████▋                                                      │
│ READING              │ SOUTH OXFORDSHIRE      │  318 │  691988 │ █████████████▋                                                      │
│ NORTHWOOD            │ HILLINGDON             │  271 │  690643 │ █████████████▋                                                      │
│ FELTHAM              │ HOUNSLOW               │  763 │  688595 │ █████████████▋                                                      │
│ ASHTEAD              │ MOLE VALLEY            │  303 │  687923 │ █████████████▋                                                      │
│ BARNET               │ BARNET                 │  975 │  686980 │ █████████████▋                                                      │
│ WOKING               │ SURREY HEATH           │  283 │  686669 │ █████████████▋                                                      │
│ MALMESBURY           │ WILTSHIRE              │  323 │  683324 │ █████████████▋                                                      │
│ AMERSHAM             │ BUCKINGHAMSHIRE        │  496 │  680962 │ █████████████▌                                                      │
│ CHISLEHURST          │ BROMLEY                │  430 │  680209 │ █████████████▌                                                      │
│ HYTHE                │ FOLKESTONE AND HYTHE   │  490 │  676908 │ █████████████▌                                                      │
│ MAYFIELD             │ WEALDEN                │  101 │  676210 │ █████████████▌                                                      │
│ ASCOT                │ BRACKNELL FOREST       │  168 │  676004 │ █████████████▌                                                      │
└──────────────────────┴────────────────────────┴──────┴─────────┴────────────────────────────────────────────────────────────────────┘
```

## Let's Speed Up Queries Using Projections {#speedup-with-projections}

[Projections](../../sql-reference/statements/alter/projection.md) allow you to improve query speed by storing pre-aggregated data.

### Build a Projection {#build-projection}

Create an aggregate projection by the dimensions `toYear(date)`, `district`, `town`:

```sql
ALTER TABLE uk_price_paid
    ADD PROJECTION projection_by_year_district_town
    (
        SELECT
            toYear(date),
            district,
            town,
            avg(price),
            sum(price),
            count()
        GROUP BY
            toYear(date),
            district,
            town
    );
```

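To confirm that the projection is now part of the table definition, you can inspect the table's DDL; this quick check is not part of the original walkthrough:

```sql
SHOW CREATE TABLE uk_price_paid;
-- the CREATE TABLE statement now includes the
-- PROJECTION projection_by_year_district_town (...) clause
```
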
Populate the projection for existing data (without it, the projection will be created only for newly inserted data):

```sql
ALTER TABLE uk_price_paid
    MATERIALIZE PROJECTION projection_by_year_district_town
SETTINGS mutations_sync = 1;
```

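`mutations_sync = 1` makes the statement wait until the materialization finishes. If you prefer to run it asynchronously (without that setting), a sketch like the following lets you watch the mutation's progress:

```sql
-- check whether the projection materialization is still running
SELECT mutation_id, command, parts_to_do, is_done
FROM system.mutations
WHERE table = 'uk_price_paid' AND NOT is_done;
```
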
## Test Performance {#test-performance}

Let's run the same 3 queries.

[Enable](../../operations/settings/settings.md#allow-experimental-projection-optimization) projections for selects:

```sql
SET allow_experimental_projection_optimization = 1;
```

### Query 1. Average Price Per Year {#average-price-projections}

Query:

```sql
SELECT
    toYear(date) AS year,
    round(avg(price)) AS price,
    bar(price, 0, 1000000, 80)
FROM uk_price_paid
GROUP BY year
ORDER BY year ASC;
```

Result:

```text
┌─year─┬──price─┬─bar(round(avg(price)), 0, 1000000, 80)─┐
│ 1995 │  67932 │ █████▍                                 │
│ 1996 │  71505 │ █████▋                                 │
│ 1997 │  78532 │ ██████▎                                │
│ 1998 │  85436 │ ██████▋                                │
│ 1999 │  96037 │ ███████▋                               │
│ 2000 │ 107479 │ ████████▌                              │
│ 2001 │ 118885 │ █████████▌                             │
│ 2002 │ 137941 │ ███████████                            │
│ 2003 │ 155889 │ ████████████▍                          │
│ 2004 │ 178885 │ ██████████████▎                        │
│ 2005 │ 189351 │ ███████████████▏                       │
│ 2006 │ 203528 │ ████████████████▎                      │
│ 2007 │ 219378 │ █████████████████▌                     │
│ 2008 │ 217056 │ █████████████████▎                     │
│ 2009 │ 213419 │ █████████████████                      │
│ 2010 │ 236109 │ ██████████████████▊                    │
│ 2011 │ 232805 │ ██████████████████▌                    │
│ 2012 │ 238367 │ ███████████████████                    │
│ 2013 │ 256931 │ ████████████████████▌                  │
│ 2014 │ 279915 │ ██████████████████████▍                │
│ 2015 │ 297266 │ ███████████████████████▋               │
│ 2016 │ 313201 │ █████████████████████████              │
│ 2017 │ 346097 │ ███████████████████████████▋           │
│ 2018 │ 350116 │ ████████████████████████████           │
│ 2019 │ 351013 │ ████████████████████████████           │
│ 2020 │ 369420 │ █████████████████████████████▌         │
│ 2021 │ 386903 │ ██████████████████████████████▊        │
└──────┴────────┴────────────────────────────────────────┘
```

### Query 2. Average Price Per Year in London {#average-price-london-projections}

Query:

```sql
SELECT
    toYear(date) AS year,
    round(avg(price)) AS price,
    bar(price, 0, 2000000, 100)
FROM uk_price_paid
WHERE town = 'LONDON'
GROUP BY year
ORDER BY year ASC;
```

Result:

```text
┌─year─┬───price─┬─bar(round(avg(price)), 0, 2000000, 100)───────────────┐
│ 1995 │  109116 │ █████▍                                                │
│ 1996 │  118667 │ █████▊                                                │
│ 1997 │  136518 │ ██████▋                                               │
│ 1998 │  152983 │ ███████▋                                              │
│ 1999 │  180637 │ █████████                                             │
│ 2000 │  215838 │ ██████████▋                                           │
│ 2001 │  232994 │ ███████████▋                                          │
│ 2002 │  263670 │ █████████████▏                                        │
│ 2003 │  278394 │ █████████████▊                                        │
│ 2004 │  304666 │ ███████████████▏                                      │
│ 2005 │  322875 │ ████████████████▏                                     │
│ 2006 │  356191 │ █████████████████▋                                    │
│ 2007 │  404054 │ ████████████████████▏                                 │
│ 2008 │  420741 │ █████████████████████                                 │
│ 2009 │  427753 │ █████████████████████▍                                │
│ 2010 │  480306 │ ████████████████████████                              │
│ 2011 │  496274 │ ████████████████████████▋                             │
│ 2012 │  519442 │ █████████████████████████▊                            │
│ 2013 │  616212 │ ██████████████████████████████▋                       │
│ 2014 │  724154 │ ████████████████████████████████████▏                 │
│ 2015 │  792129 │ ███████████████████████████████████████▌              │
│ 2016 │  843655 │ ██████████████████████████████████████████▏           │
│ 2017 │  982642 │ █████████████████████████████████████████████████▏    │
│ 2018 │ 1016835 │ ██████████████████████████████████████████████████▋   │
│ 2019 │ 1042849 │ ████████████████████████████████████████████████████▏ │
│ 2020 │ 1011889 │ ██████████████████████████████████████████████████▌   │
│ 2021 │  960343 │ ████████████████████████████████████████████████      │
└──────┴─────────┴───────────────────────────────────────────────────────┘
```

### Query 3. The Most Expensive Neighborhoods {#most-expensive-neighborhoods-projections}

The condition `date >= '2020-01-01'` needs to be modified so that it matches the projection dimension: `toYear(date) >= 2020`.

Query:

```sql
SELECT
    town,
    district,
    count() AS c,
    round(avg(price)) AS price,
    bar(price, 0, 5000000, 100)
FROM uk_price_paid
WHERE toYear(date) >= 2020
GROUP BY
    town,
    district
HAVING c >= 100
ORDER BY price DESC
LIMIT 100;
```

Result:

```text
┌─town─────────────────┬─district───────────────┬────c─┬───price─┬─bar(round(avg(price)), 0, 5000000, 100)────────────────────────────┐
│ LONDON               │ CITY OF WESTMINSTER    │ 3606 │ 3280239 │ █████████████████████████████████████████████████████████████████▌ │
│ LONDON               │ CITY OF LONDON         │  274 │ 3160502 │ ███████████████████████████████████████████████████████████████▏   │
│ LONDON               │ KENSINGTON AND CHELSEA │ 2550 │ 2308478 │ ██████████████████████████████████████████████▏                    │
│ LEATHERHEAD          │ ELMBRIDGE              │  114 │ 1897407 │ █████████████████████████████████████▊                             │
│ LONDON               │ CAMDEN                 │ 3033 │ 1805404 │ ████████████████████████████████████                                │
│ VIRGINIA WATER       │ RUNNYMEDE              │  156 │ 1753247 │ ███████████████████████████████████                                 │
│ WINDLESHAM           │ SURREY HEATH           │  108 │ 1677613 │ █████████████████████████████████▌                                  │
│ THORNTON HEATH       │ CROYDON                │  546 │ 1671721 │ █████████████████████████████████▍                                  │
│ BARNET               │ ENFIELD                │  124 │ 1505840 │ ██████████████████████████████                                      │
│ COBHAM               │ ELMBRIDGE              │  387 │ 1237250 │ ████████████████████████▋                                           │
│ LONDON               │ ISLINGTON              │ 2668 │ 1236980 │ ████████████████████████▋                                           │
│ OXFORD               │ SOUTH OXFORDSHIRE      │  321 │ 1220907 │ ████████████████████████▍                                           │
│ LONDON               │ RICHMOND UPON THAMES   │  704 │ 1215551 │ ████████████████████████▎                                           │
│ LONDON               │ HOUNSLOW               │  671 │ 1207493 │ ████████████████████████▏                                           │
│ ASCOT                │ WINDSOR AND MAIDENHEAD │  407 │ 1183299 │ ███████████████████████▋                                            │
│ BEACONSFIELD         │ BUCKINGHAMSHIRE        │  330 │ 1175615 │ ███████████████████████▌                                            │
│ RICHMOND             │ RICHMOND UPON THAMES   │  874 │ 1110444 │ ██████████████████████▏                                             │
│ LONDON               │ HAMMERSMITH AND FULHAM │ 3086 │ 1053983 │ █████████████████████                                               │
│ SURBITON             │ ELMBRIDGE              │  100 │ 1011800 │ ████████████████████▏                                               │
│ RADLETT              │ HERTSMERE              │  283 │ 1011712 │ ████████████████████▏                                               │
│ SALCOMBE             │ SOUTH HAMS             │  127 │ 1011624 │ ████████████████████▏                                               │
│ WEYBRIDGE            │ ELMBRIDGE              │  655 │ 1007265 │ ████████████████████▏                                               │
│ ESHER                │ ELMBRIDGE              │  485 │  986581 │ ███████████████████▋                                                │
│ LEATHERHEAD          │ GUILDFORD              │  202 │  977320 │ ███████████████████▌                                                │
│ BURFORD              │ WEST OXFORDSHIRE       │  111 │  966893 │ ███████████████████▎                                                │
│ BROCKENHURST         │ NEW FOREST             │  129 │  956675 │ ███████████████████▏                                                │
│ HINDHEAD             │ WAVERLEY               │  137 │  953753 │ ███████████████████                                                 │
│ GERRARDS CROSS       │ BUCKINGHAMSHIRE        │  419 │  951121 │ ███████████████████                                                 │
│ EAST MOLESEY         │ ELMBRIDGE              │  192 │  936769 │ ██████████████████▋                                                 │
│ CHALFONT ST GILES    │ BUCKINGHAMSHIRE        │  146 │  925515 │ ██████████████████▌                                                 │
│ LONDON               │ TOWER HAMLETS          │ 4388 │  918304 │ ██████████████████▎                                                 │
│ OLNEY                │ MILTON KEYNES          │  235 │  910646 │ ██████████████████▏                                                 │
│ HENLEY-ON-THAMES     │ SOUTH OXFORDSHIRE      │  540 │  902418 │ ██████████████████                                                  │
│ LONDON               │ SOUTHWARK              │ 3885 │  892997 │ █████████████████▋                                                  │
│ KINGSTON UPON THAMES │ KINGSTON UPON THAMES   │  960 │  885969 │ █████████████████▋                                                  │
│ LONDON               │ EALING                 │ 2658 │  871755 │ █████████████████▍                                                  │
│ CRANBROOK            │ TUNBRIDGE WELLS        │  431 │  862348 │ █████████████████▏                                                  │
│ LONDON               │ MERTON                 │ 2099 │  859118 │ █████████████████▏                                                  │
│ BELVEDERE            │ BEXLEY                 │  346 │  842423 │ ████████████████▋                                                   │
│ GUILDFORD            │ WAVERLEY               │  143 │  841277 │ ████████████████▋                                                   │
│ HARPENDEN            │ ST ALBANS              │  657 │  841216 │ ████████████████▋                                                   │
│ LONDON               │ HACKNEY                │ 3307 │  837090 │ ████████████████▋                                                   │
│ LONDON               │ WANDSWORTH             │ 6566 │  832663 │ ████████████████▋                                                   │
│ MAIDENHEAD           │ BUCKINGHAMSHIRE        │  123 │  824299 │ ████████████████▍                                                   │
│ KINGS LANGLEY        │ DACORUM                │  145 │  821331 │ ████████████████▍                                                   │
│ BERKHAMSTED          │ DACORUM                │  543 │  818415 │ ████████████████▎                                                   │
│ GREAT MISSENDEN      │ BUCKINGHAMSHIRE        │  226 │  802807 │ ████████████████                                                    │
│ BILLINGSHURST        │ CHICHESTER             │  144 │  797829 │ ███████████████▊                                                    │
│ WOKING               │ GUILDFORD              │  176 │  793494 │ ███████████████▋                                                    │
│ STOCKBRIDGE          │ TEST VALLEY            │  178 │  793269 │ ███████████████▋                                                    │
│ EPSOM                │ REIGATE AND BANSTEAD   │  172 │  791862 │ ███████████████▋                                                    │
│ TONBRIDGE            │ TUNBRIDGE WELLS        │  360 │  787876 │ ███████████████▋                                                    │
│ TEDDINGTON           │ RICHMOND UPON THAMES   │  595 │  786492 │ ███████████████▋                                                    │
│ TWICKENHAM           │ RICHMOND UPON THAMES   │ 1155 │  786193 │ ███████████████▋                                                    │
│ LYNDHURST            │ NEW FOREST             │  102 │  785593 │ ███████████████▋                                                    │
│ LONDON               │ LAMBETH                │ 5228 │  774574 │ ███████████████▍                                                    │
│ LONDON               │ BARNET                 │ 3955 │  773259 │ ███████████████▍                                                    │
│ OXFORD               │ VALE OF WHITE HORSE    │  353 │  772088 │ ███████████████▍                                                    │
│ TONBRIDGE            │ MAIDSTONE              │  305 │  770740 │ ███████████████▍                                                    │
│ LUTTERWORTH          │ HARBOROUGH             │  538 │  768634 │ ███████████████▎                                                    │
│ WOODSTOCK            │ WEST OXFORDSHIRE       │  140 │  766037 │ ███████████████▎                                                    │
│ MIDHURST             │ CHICHESTER             │  257 │  764815 │ ███████████████▎                                                    │
│ MARLOW               │ BUCKINGHAMSHIRE        │  327 │  761876 │ ███████████████▏                                                    │
│ LONDON               │ NEWHAM                 │ 3237 │  761784 │ ███████████████▏                                                    │
│ ALDERLEY EDGE        │ CHESHIRE EAST          │  178 │  757318 │ ███████████████▏                                                    │
│ LUTON                │ CENTRAL BEDFORDSHIRE   │  212 │  754283 │ ███████████████                                                     │
│ PETWORTH             │ CHICHESTER             │  154 │  754220 │ ███████████████                                                     │
│ ALRESFORD            │ WINCHESTER             │  219 │  752718 │ ███████████████                                                     │
│ POTTERS BAR          │ WELWYN HATFIELD        │  174 │  748465 │ ██████████████▊                                                     │
│ HASLEMERE            │ CHICHESTER             │  128 │  746907 │ ██████████████▊                                                     │
│ TADWORTH             │ REIGATE AND BANSTEAD   │  502 │  743252 │ ██████████████▋                                                     │
│ THAMES DITTON        │ ELMBRIDGE              │  244 │  741913 │ ██████████████▋                                                     │
│ REIGATE              │ REIGATE AND BANSTEAD   │  581 │  738198 │ ██████████████▋                                                     │
│ BOURNE END           │ BUCKINGHAMSHIRE        │  138 │  735190 │ ██████████████▋                                                     │
│ SEVENOAKS            │ SEVENOAKS              │ 1156 │  730018 │ ██████████████▌                                                     │
│ OXTED                │ TANDRIDGE              │  336 │  729123 │ ██████████████▌                                                     │
│ INGATESTONE          │ BRENTWOOD              │  166 │  728103 │ ██████████████▌                                                     │
│ LONDON               │ BRENT                  │ 2079 │  720605 │ ██████████████▍                                                     │
│ LONDON               │ HARINGEY               │ 3216 │  717780 │ ██████████████▎                                                     │
│ PURLEY               │ CROYDON                │  575 │  716108 │ ██████████████▎                                                     │
│ WELWYN               │ WELWYN HATFIELD        │  222 │  710603 │ ██████████████▏                                                     │
│ RICKMANSWORTH        │ THREE RIVERS           │  798 │  704571 │ ██████████████                                                      │
│ BANSTEAD             │ REIGATE AND BANSTEAD   │  401 │  701293 │ ██████████████                                                      │
│ CHIGWELL             │ EPPING FOREST          │  261 │  701203 │ ██████████████                                                      │
│ PINNER               │ HARROW                 │  528 │  698885 │ █████████████▊                                                      │
│ HASLEMERE            │ WAVERLEY               │  280 │  696659 │ █████████████▊                                                      │
│ SLOUGH               │ BUCKINGHAMSHIRE        │  396 │  694917 │ █████████████▊                                                      │
│ WALTON-ON-THAMES     │ ELMBRIDGE              │  946 │  692395 │ █████████████▋                                                      │
│ READING              │ SOUTH OXFORDSHIRE      │  318 │  691988 │ █████████████▋                                                      │
│ NORTHWOOD            │ HILLINGDON             │  271 │  690643 │ █████████████▋                                                      │
│ FELTHAM              │ HOUNSLOW               │  763 │  688595 │ █████████████▋                                                      │
│ ASHTEAD              │ MOLE VALLEY            │  303 │  687923 │ █████████████▋                                                      │
│ BARNET               │ BARNET                 │  975 │  686980 │ █████████████▋                                                      │
│ WOKING               │ SURREY HEATH           │  283 │  686669 │ █████████████▋                                                      │
│ MALMESBURY           │ WILTSHIRE              │  323 │  683324 │ █████████████▋                                                      │
│ AMERSHAM             │ BUCKINGHAMSHIRE        │  496 │  680962 │ █████████████▌                                                      │
│ CHISLEHURST          │ BROMLEY                │  430 │  680209 │ █████████████▌                                                      │
│ HYTHE                │ FOLKESTONE AND HYTHE   │  490 │  676908 │ █████████████▌                                                      │
│ MAYFIELD             │ WEALDEN                │  101 │  676210 │ █████████████▌                                                      │
│ ASCOT                │ BRACKNELL FOREST       │  168 │  676004 │ █████████████▌                                                      │
└──────────────────────┴────────────────────────┴──────┴─────────┴────────────────────────────────────────────────────────────────────┘
```

### Summary {#summary}

All 3 queries work much faster and read fewer rows.

```text
Query 1

no projection: 27 rows in set. Elapsed: 0.158 sec. Processed 26.32 million rows, 157.93 MB (166.57 million rows/s., 999.39 MB/s.)
   projection: 27 rows in set. Elapsed: 0.007 sec. Processed 105.96 thousand rows, 3.33 MB (14.58 million rows/s., 458.13 MB/s.)


Query 2

no projection: 27 rows in set. Elapsed: 0.163 sec. Processed 26.32 million rows, 80.01 MB (161.75 million rows/s., 491.64 MB/s.)
   projection: 27 rows in set. Elapsed: 0.008 sec. Processed 105.96 thousand rows, 3.67 MB (13.29 million rows/s., 459.89 MB/s.)


Query 3

no projection: 100 rows in set. Elapsed: 0.069 sec. Processed 26.32 million rows, 62.47 MB (382.13 million rows/s., 906.93 MB/s.)
   projection: 100 rows in set. Elapsed: 0.029 sec. Processed 8.08 thousand rows, 511.08 KB (276.06 thousand rows/s., 17.47 MB/s.)
```

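To double-check that a query was actually served by the projection, you can look at the query log. This is a sketch that assumes a recent server version where `system.query_log` records the used projections, and that query logging is enabled:

```sql
SELECT query, projections
FROM system.query_log
WHERE type = 'QueryFinish' AND query ILIKE '%uk_price_paid%'
ORDER BY event_time DESC
LIMIT 5;
```
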
### Test It in Playground {#playground}

The dataset is also available in the [Online Playground](https://gh-api.clickhouse.tech/play?user=play#U0VMRUNUIHRvd24sIGRpc3RyaWN0LCBjb3VudCgpIEFTIGMsIHJvdW5kKGF2ZyhwcmljZSkpIEFTIHByaWNlLCBiYXIocHJpY2UsIDAsIDUwMDAwMDAsIDEwMCkgRlJPTSB1a19wcmljZV9wYWlkIFdIRVJFIGRhdGUgPj0gJzIwMjAtMDEtMDEnIEdST1VQIEJZIHRvd24sIGRpc3RyaWN0IEhBVklORyBjID49IDEwMCBPUkRFUiBCWSBwcmljZSBERVNDIExJTUlUIDEwMA==).

It is recommended to use official pre-compiled `deb` packages for Debian or Ubuntu.

If you want to use the most recent version, replace `stable` with `testing` (this is recommended for your testing environments).

You can also download and install packages manually from [here](https://repo.clickhouse.com/deb/stable/main/).

#### Packages {#packages}

First, you need to add the official repository:

``` bash
sudo yum install yum-utils
sudo rpm --import https://repo.clickhouse.com/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.com/rpm/stable/x86_64
```

If you want to use the most recent version, replace `stable` with `testing` (this is recommended for your testing environments). `prestable` is sometimes also available.

Then run these commands to install packages:

``` bash
sudo yum install clickhouse-server clickhouse-client
```

You can also download and install packages manually from [here](https://repo.clickhouse.com/rpm/stable/x86_64).

### From Tgz Archives {#from-tgz-archives}

It is recommended to use official pre-compiled `tgz` archives for all Linux distributions, where installation of `deb` or `rpm` packages is not possible.

The required version can be downloaded with `curl` or `wget` from the repository https://repo.clickhouse.com/tgz/.
After that, the downloaded archives should be unpacked and installed with the installation scripts. Example for the latest version:

``` bash
export LATEST_VERSION=`curl https://api.github.com/repos/ClickHouse/ClickHouse/tags 2>/dev/null | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | head -n 1`
curl -O https://repo.clickhouse.com/tgz/clickhouse-common-static-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/clickhouse-common-static-dbg-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/clickhouse-server-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/clickhouse-client-$LATEST_VERSION.tgz

tar -xzvf clickhouse-common-static-$LATEST_VERSION.tgz
sudo clickhouse-common-static-$LATEST_VERSION/install/doinst.sh
```

The CSV format supports the output of totals and extremes the same way as `TabSeparated`.

## CSVWithNames {#csvwithnames}

Also prints the header row, similar to [TabSeparatedWithNames](#tabseparatedwithnames).

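For example, a hypothetical two-column result would be serialized with the column names in the first row (a sketch, not taken from the page):

```sql
SELECT 1 AS id, 'one' AS name
FORMAT CSVWithNames;

-- output:
-- "id","name"
-- 1,"one"
```
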
## CustomSeparated {#format-customseparated}

| <a href="https://geniee.co.jp" class="favicon">Geniee</a> | Ad network | Main product | — | — | [Blog post in Japanese, July 2017](https://tech.geniee.co.jp/entry/2017/07/20/160100) |
| <a href="https://www.genotek.ru/" class="favicon">Genotek</a> | Bioinformatics | Main product | — | — | [Video, August 2020](https://youtu.be/v3KyZbz9lEE) |
| <a href="https://glaber.io/" class="favicon">Glaber</a> | Monitoring | Main product | — | — | [Website](https://glaber.io/) |
| <a href="https://graphcdn.io/" class="favicon">GraphCDN</a> | CDN | Traffic Analytics | — | — | [Blog Post in English, August 2021](https://altinity.com/blog/delivering-insight-on-graphql-apis-with-clickhouse-at-graphcdn/) |
| <a href="https://www.huya.com/" class="favicon">HUYA</a> | Video Streaming | Analytics | — | — | [Slides in Chinese, October 2018](https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup19/7.%20ClickHouse万亿数据分析实践%20李本旺(sundy-li)%20虎牙.pdf) |
| <a href="https://www.the-ica.com/" class="favicon">ICA</a> | FinTech | Risk Management | — | — | [Blog Post in English, Sep 2020](https://altinity.com/blog/clickhouse-vs-redshift-performance-for-fintech-risk-management?utm_campaign=ClickHouse%20vs%20RedShift&utm_content=143520807&utm_medium=social&utm_source=twitter&hss_channel=tw-3894792263) |
| <a href="https://www.idealista.com" class="favicon">Idealista</a> | Real Estate | Analytics | — | — | [Blog Post in English, April 2019](https://clickhouse.com/blog/en/clickhouse-meetup-in-madrid-on-april-2-2019) |

Configuration template:

- `min_part_size` – The minimum size of a data part.
- `min_part_size_ratio` – The ratio of the data part size to the table size.
- `method` – Compression method. Acceptable values: `lz4`, `lz4hc`, `zstd`.
- `level` – Compression level. See [Codecs](../../sql-reference/statements/create/table.md#create-query-general-purpose-codecs).

You can configure multiple `<case>` sections.

Works with tables in the MergeTree family.

If `force_primary_key=1`, ClickHouse checks to see if the query has a primary key condition that can be used for restricting data ranges. If there is no suitable condition, it throws an exception. However, it does not check whether the condition reduces the amount of data to read. For more information about data ranges in MergeTree tables, see [MergeTree](../../engines/table-engines/mergetree-family/mergetree.md).

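As an illustration, with a hypothetical `hits` table ordered by `(CounterID, EventDate)`, the setting would behave roughly like this (a sketch; the table and values are made up):

```sql
SET force_primary_key = 1;

-- OK: the condition on CounterID restricts the primary key range
SELECT count() FROM hits WHERE CounterID = 62;

-- throws an exception: no condition on the primary key columns
SELECT count() FROM hits WHERE URL LIKE '%test%';
```
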
## use_skip_indexes {#settings-use_skip_indexes}

Use data skipping indexes during query execution.

Possible values:

- 0 — Disabled.
- 1 — Enabled.

Default value: 1.

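Since this can be applied per query, a simple way to measure the effect of a skip index is to run the same query with the setting toggled (a sketch; the `events` table and its index are hypothetical):

```sql
-- with skip indexes (the default)
SELECT count() FROM events WHERE value = 42 SETTINGS use_skip_indexes = 1;

-- full scan, for comparison
SELECT count() FROM events WHERE value = 42 SETTINGS use_skip_indexes = 0;
```
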
## force_data_skipping_indices {#settings-force_data_skipping_indices}

Disables query execution if the passed data skipping indices were not used.

Allows storing an instant in time that can be expressed as a calendar date and a time of day, with defined sub-second precision.

Tick size (precision): 10<sup>-precision</sup> seconds. Valid range: [ 0 : 9 ].
Typically used values are 3 (milliseconds), 6 (microseconds), and 9 (nanoseconds).

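For example, a millisecond-precision value (precision 3) can be produced like this (a minimal sketch):

```sql
SELECT toDateTime64('2019-01-01 00:00:00.123', 3) AS dt64;
-- dt64 has type DateTime64(3) and keeps the .123 fraction
```
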
**Syntax:**

## h3IsResClassIII {#h3isresclassIII}

Returns whether [H3](#h3index) index has a resolution with Class III orientation.

**Syntax**

``` sql
h3IsResClassIII(index)
```

**Parameter**

- `index` — Hexagon index number. Type: [UInt64](../../../sql-reference/data-types/int-uint.md).

**Returned value**

Type: [UInt8](../../../sql-reference/data-types/int-uint.md).

**Example**

Query:

``` sql
SELECT h3IsResClassIII(617420388352917503) as res;
```

Result:
---
toc_title: S2 Geometry
---

# Functions for Working with S2 Index {#s2Index}

[S2](https://s2geometry.io/) is a geographical indexing system where all geographical data is represented on a three-dimensional sphere (similar to a globe).

In the S2 library, points are represented as unit-length vectors called S2 point indices (points on the surface of a three-dimensional unit sphere), as opposed to traditional (latitude, longitude) pairs.

## geoToS2 {#geoToS2}

Returns [S2](#s2index) point index corresponding to the provided coordinates `(longitude, latitude)`.

**Syntax**

``` sql
geoToS2(lon, lat)
```

**Arguments**

- `lon` — Longitude. [Float64](../../../sql-reference/data-types/float.md).
- `lat` — Latitude. [Float64](../../../sql-reference/data-types/float.md).

**Returned values**

- S2 point index.

Type: [UInt64](../../../sql-reference/data-types/int-uint.md).

**Example**

Query:

``` sql
SELECT geoToS2(37.79506683, 55.71290588) AS s2Index;
```

Result:

``` text
┌─────────────s2Index─┐
│ 4704772434919038107 │
└─────────────────────┘
```

## s2ToGeo {#s2ToGeo}

Returns geo coordinates `(longitude, latitude)` corresponding to the provided [S2](#s2index) point index.

**Syntax**

``` sql
s2ToGeo(s2index)
```

**Arguments**

- `s2index` — S2 Index. [UInt64](../../../sql-reference/data-types/int-uint.md).

**Returned values**

- A tuple consisting of two values: `tuple(lon,lat)`.

Type: `lon` — [Float64](../../../sql-reference/data-types/float.md). `lat` — [Float64](../../../sql-reference/data-types/float.md).

**Example**

Query:

``` sql
SELECT s2ToGeo(4704772434919038107) AS s2Coordinates;
```

Result:

``` text
┌─s2Coordinates────────────────────────┐
│ (37.79506681471008,55.7129059052841) │
└──────────────────────────────────────┘
```

## s2GetNeighbors {#s2GetNeighbors}

Returns S2 neighbor indices corresponding to the provided [S2](#s2index) cell index. Each cell in the S2 system is a quadrilateral bounded by four geodesics. So, each cell has 4 neighbors.

**Syntax**

``` sql
s2GetNeighbors(s2index)
```

**Arguments**

- `s2index` — S2 Index. [UInt64](../../../sql-reference/data-types/int-uint.md).

**Returned values**

- An array consisting of the 4 neighbor indices: `array[s2index1, s2index2, s2index3, s2index4]`.

Type: Each S2 index is [UInt64](../../../sql-reference/data-types/int-uint.md).

**Example**

Query:

``` sql
SELECT s2GetNeighbors(5074766849661468672) AS s2Neighbors;
```

Result:

``` text
┌─s2Neighbors───────────────────────────────────────────────────────────────────────┐
│ [5074766987100422144,5074766712222515200,5074767536856236032,5074767261978329088] │
└───────────────────────────────────────────────────────────────────────────────────┘
```

## s2CellsIntersect {#s2CellsIntersect}

Determines if the two provided [S2](#s2index) cell indices intersect or not.

**Syntax**

``` sql
s2CellsIntersect(s2index1, s2index2)
```

**Arguments**

- `s2index1`, `s2index2` — S2 Index. [UInt64](../../../sql-reference/data-types/int-uint.md).

**Returned values**

- 1 — If the S2 cell indices intersect.
- 0 — If the S2 cell indices don't intersect.

Type: [UInt8](../../../sql-reference/data-types/int-uint.md).

**Example**

Query:

``` sql
SELECT s2CellsIntersect(9926595209846587392, 9926594385212866560) AS intersect;
```

Result:

``` text
┌─intersect─┐
│         1 │
└───────────┘
```

## s2CapContains {#s2CapContains}

A cap represents a portion of the sphere that has been cut off by a plane. It is defined by a point on a sphere and a radius in degrees.

Determines if a cap contains an S2 point index.

**Syntax**

``` sql
s2CapContains(center, degrees, point)
```

**Arguments**

- `center` — S2 point index corresponding to the cap. [UInt64](../../../sql-reference/data-types/int-uint.md).
- `degrees` — Radius of the cap in degrees. [Float64](../../../sql-reference/data-types/float.md).
- `point` — S2 point index. [UInt64](../../../sql-reference/data-types/int-uint.md).

**Returned values**

- 1 — If the cap contains the S2 point index.
- 0 — If the cap doesn't contain the S2 point index.

Type: [UInt8](../../../sql-reference/data-types/int-uint.md).

**Example**

Query:

``` sql
SELECT s2CapContains(1157339245694594829, 1.0, 1157347770437378819) AS capContains;
```

Result:

``` text
┌─capContains─┐
│           1 │
└─────────────┘
```

## s2CapUnion {#s2CapUnion}

A cap represents a portion of the sphere that has been cut off by a plane. It is defined by a point on a sphere and a radius in degrees.

Determines the smallest cap that contains the given two input caps.

**Syntax**

``` sql
s2CapUnion(center1, radius1, center2, radius2)
```

**Arguments**

- `center1`, `center2` — S2 point indices corresponding to the two input caps. [UInt64](../../../sql-reference/data-types/int-uint.md).
- `radius1`, `radius2` — Radii of the two input caps in degrees. [Float64](../../../sql-reference/data-types/float.md).

**Returned values**

- `center` — S2 point index corresponding to the center of the smallest cap containing the two input caps. Type: [UInt64](../../../sql-reference/data-types/int-uint.md).
- `radius` — Radius of the smallest cap containing the two input caps. Type: [Float64](../../../sql-reference/data-types/float.md).

**Example**

Query:

``` sql
SELECT s2CapUnion(3814912406305146967, 1.0, 1157347770437378819, 1.0) AS capUnion;
```

Result:

``` text
┌─capUnion───────────────────────────────┐
│ (4534655147792050737,60.2088283994957) │
└────────────────────────────────────────┘
```
## s2RectAdd {#s2RectAdd}

In the S2 system, a rectangle is represented by a type of S2Region called an `S2LatLngRect` that represents a rectangle in latitude-longitude space.

Increases the size of the bounding rectangle to include the given S2 point index.

**Syntax**

``` sql
s2RectAdd(s2PointLow, s2PointHigh, s2Point)
```

**Arguments**

- `s2PointLow` — Low S2 point index corresponding to the rectangle. [UInt64](../../../sql-reference/data-types/int-uint.md).
- `s2PointHigh` — High S2 point index corresponding to the rectangle. [UInt64](../../../sql-reference/data-types/int-uint.md).
- `s2Point` — Target S2 point index that the bound rectangle should be grown to include. [UInt64](../../../sql-reference/data-types/int-uint.md).

**Returned values**

- `s2PointLow` — Low S2 cell id corresponding to the grown rectangle. Type: [UInt64](../../../sql-reference/data-types/int-uint.md).
- `s2PointHigh` — High S2 cell id corresponding to the grown rectangle. Type: [UInt64](../../../sql-reference/data-types/int-uint.md).

**Example**

Query:

``` sql
SELECT s2RectAdd(5178914411069187297, 5177056748191934217, 5179056748191934217) AS rectAdd;
```

Result:

``` text
┌─rectAdd───────────────────────────────────┐
│ (5179062030687166815,5177056748191934217) │
└───────────────────────────────────────────┘
```
## s2RectContains {#s2RectContains}

In the S2 system, a rectangle is represented by a type of S2Region called an `S2LatLngRect` that represents a rectangle in latitude-longitude space.

Determines if a given rectangle contains an S2 point index.

**Syntax**

``` sql
s2RectContains(s2PointLow, s2PointHigh, s2Point)
```

**Arguments**

- `s2PointLow` — Low S2 point index corresponding to the rectangle. [UInt64](../../../sql-reference/data-types/int-uint.md).
- `s2PointHigh` — High S2 point index corresponding to the rectangle. [UInt64](../../../sql-reference/data-types/int-uint.md).
- `s2Point` — Target S2 point index. [UInt64](../../../sql-reference/data-types/int-uint.md).

**Returned values**

- 1 — If the rectangle contains the given S2 point.
- 0 — If the rectangle doesn't contain the given S2 point.

**Example**

Query:

``` sql
SELECT s2RectContains(5179062030687166815, 5177056748191934217, 5177914411069187297) AS rectContains;
```

Result:

``` text
┌─rectContains─┐
│            0 │
└──────────────┘
```
## s2RectUnion {#s2RectUnion}

In the S2 system, a rectangle is represented by a type of S2Region called an `S2LatLngRect` that represents a rectangle in latitude-longitude space.

Returns the smallest rectangle containing the union of this rectangle and the given rectangle.

**Syntax**

``` sql
s2RectUnion(s2Rect1PointLow, s2Rect1PointHi, s2Rect2PointLow, s2Rect2PointHi)
```

**Arguments**

- `s2Rect1PointLow`, `s2Rect1PointHi` — Low and high S2 point indices corresponding to the first rectangle. [UInt64](../../../sql-reference/data-types/int-uint.md).
- `s2Rect2PointLow`, `s2Rect2PointHi` — Low and high S2 point indices corresponding to the second rectangle. [UInt64](../../../sql-reference/data-types/int-uint.md).

**Returned values**

- `s2UnionRect2PointLow` — Low S2 cell id corresponding to the union rectangle. Type: [UInt64](../../../sql-reference/data-types/int-uint.md).
- `s2UnionRect2PointHi` — High S2 cell id corresponding to the union rectangle. Type: [UInt64](../../../sql-reference/data-types/int-uint.md).

**Example**

Query:

``` sql
SELECT s2RectUnion(5178914411069187297, 5177056748191934217, 5179062030687166815, 5177056748191934217) AS rectUnion;
```

Result:

``` text
┌─rectUnion─────────────────────────────────┐
│ (5179062030687166815,5177056748191934217) │
└───────────────────────────────────────────┘
```
## s2RectIntersection {#s2RectIntersection}

Returns the smallest rectangle containing the intersection of this rectangle and the given rectangle.

**Syntax**

``` sql
s2RectIntersection(s2Rect1PointLow, s2Rect1PointHi, s2Rect2PointLow, s2Rect2PointHi)
```

**Arguments**

- `s2Rect1PointLow`, `s2Rect1PointHi` — Low and high S2 point indices corresponding to the first rectangle. [UInt64](../../../sql-reference/data-types/int-uint.md).
- `s2Rect2PointLow`, `s2Rect2PointHi` — Low and high S2 point indices corresponding to the second rectangle. [UInt64](../../../sql-reference/data-types/int-uint.md).

**Returned values**

- `s2UnionRect2PointLow` — Low S2 cell id corresponding to the rectangle containing the intersection of the given rectangles. Type: [UInt64](../../../sql-reference/data-types/int-uint.md).
- `s2UnionRect2PointHi` — High S2 cell id corresponding to the rectangle containing the intersection of the given rectangles. Type: [UInt64](../../../sql-reference/data-types/int-uint.md).

**Example**

Query:

``` sql
SELECT s2RectIntersection(5178914411069187297, 5177056748191934217, 5179062030687166815, 5177056748191934217) AS rectIntersection;
```

Result:

``` text
┌─rectIntersection──────────────────────────┐
│ (5178914411069187297,5177056748191934217) │
└───────────────────────────────────────────┘
```
@ -59,9 +59,68 @@ A lambda function that accepts multiple arguments can also be passed to a higher

For some functions the first argument (the lambda function) can be omitted. In this case, identical mapping is assumed.
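
For example, with `arraySum` the identity lambda can be written explicitly or omitted with the same result (a minimal illustration, not from the original text):

``` sql
SELECT
    arraySum([1, 2, 3]) AS implicit,         -- lambda omitted, identity mapping assumed
    arraySum(x -> x, [1, 2, 3]) AS explicit; -- the same result with an explicit lambda
```

Both columns return `6`.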

## User Defined Functions {#user-defined-functions}
## SQL User Defined Functions {#user-defined-functions}

Custom functions from lambda expressions can be created using the [CREATE FUNCTION](../statements/create/function.md) statement. To delete these functions use the [DROP FUNCTION](../statements/drop.md#drop-function) statement.
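
A minimal sketch (the function name `linear_equation` is purely illustrative):

``` sql
-- Define a function from a lambda expression and use it like a built-in function.
CREATE FUNCTION linear_equation AS (x, k, b) -> k*x + b;

SELECT number, linear_equation(number, 2, 1) FROM numbers(3);
```

The function body is an ordinary lambda expression, so everything said above about lambda functions applies to it as well.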

## Executable User Defined Functions {#executable-user-defined-functions}

ClickHouse can call any external executable program or script to process data. Describe such functions in a [configuration file](../../operations/configuration-files.md) and add the path of that file to the main configuration via the `user_defined_executable_functions_config` setting. If a wildcard symbol `*` is used in the path, then all files matching the pattern are loaded. Example:

``` xml
<user_defined_executable_functions_config>*_function.xml</user_defined_executable_functions_config>
```

User defined function configurations are searched relative to the path specified in the `user_files_path` setting.

A function configuration contains the following settings:

- `name` - a function name.
- `command` - a command or a script to execute.
- `argument` - argument description with the `type` of an argument. Each argument is described in a separate setting.
- `format` - a [format](../../interfaces/formats.md) in which arguments are passed to the command.
- `return_type` - the type of a returned value.
- `type` - an executable type. If `type` is set to `executable` then a single command is started. If it is set to `executable_pool` then a pool of commands is created.
- `max_command_execution_time` - maximum execution time in seconds for processing a block of data. This setting is valid for `executable_pool` commands only. Optional. Default value is `10`.
- `command_termination_timeout` - time in seconds during which a command should finish after its pipe is closed. After that time `SIGTERM` is sent to the process executing the command. This setting is valid for `executable_pool` commands only. Optional. Default value is `10`.
- `pool_size` - the size of a command pool. Optional. Default value is `16`.
- `lifetime` - the reload interval of a function in seconds. If it is set to `0` then the function is not reloaded.
- `send_chunk_header` - controls whether to send the row count before sending a chunk of data to process. Optional. Default value is `false`.

The command must read arguments from `STDIN` and must output the result to `STDOUT`. The command must process arguments iteratively, that is, after processing a chunk of arguments it must wait for the next chunk.

**Example**

Creating `test_function` using XML configuration:

``` xml
<functions>
    <function>
        <type>executable</type>
        <name>test_function</name>
        <return_type>UInt64</return_type>
        <argument>
            <type>UInt64</type>
        </argument>
        <argument>
            <type>UInt64</type>
        </argument>
        <format>TabSeparated</format>
        <command>cd /; clickhouse-local --input-format TabSeparated --output-format TabSeparated --structure 'x UInt64, y UInt64' --query "SELECT x + y FROM table"</command>
        <lifetime>0</lifetime>
    </function>
</functions>
```

Query:

``` sql
SELECT test_function(toUInt64(2), toUInt64(2));
```

Result:

``` text
┌─test_function(toUInt64(2), toUInt64(2))─┐
│                                       4 │
└─────────────────────────────────────────┘
```

Custom functions can be created using the [CREATE FUNCTION](../statements/create/function.md) statement. To delete these functions use the [DROP FUNCTION](../statements/drop.md#drop-function) statement.

## Error Handling {#error-handling}

@ -155,6 +155,8 @@ Hierarchy of privileges:
- `SYSTEM RELOAD CONFIG`
- `SYSTEM RELOAD DICTIONARY`
- `SYSTEM RELOAD EMBEDDED DICTIONARIES`
- `SYSTEM RELOAD FUNCTION`
- `SYSTEM RELOAD FUNCTIONS`
- `SYSTEM MERGES`
- `SYSTEM TTL MERGES`
- `SYSTEM FETCHES`

@ -12,6 +12,8 @@ The list of available `SYSTEM` statements:
- [RELOAD DICTIONARY](#query_language-system-reload-dictionary)
- [RELOAD MODELS](#query_language-system-reload-models)
- [RELOAD MODEL](#query_language-system-reload-model)
- [RELOAD FUNCTIONS](#query_language-system-reload-functions)
- [RELOAD FUNCTION](#query_language-system-reload-functions)
- [DROP DNS CACHE](#query_language-system-drop-dns-cache)
- [DROP MARK CACHE](#query_language-system-drop-mark-cache)
- [DROP UNCOMPRESSED CACHE](#query_language-system-drop-uncompressed-cache)

@ -83,6 +85,17 @@ Completely reloads a CatBoost model `model_name` if the configuration was update
SYSTEM RELOAD MODEL <model_name>
```

## RELOAD FUNCTIONS {#query_language-system-reload-functions}

Reloads all registered [executable user defined functions](../functions/index.md#executable-user-defined-functions) or one of them from a configuration file.

**Syntax**

```sql
RELOAD FUNCTIONS
RELOAD FUNCTION function_name
```
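
For example, to reload a single function after editing its configuration file (the function name `test_function` is taken from the executable UDF example and is only illustrative):

```sql
SYSTEM RELOAD FUNCTION test_function;
```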

## DROP DNS CACHE {#query_language-system-drop-dns-cache}

Resets ClickHouse’s internal DNS cache. Sometimes (for old ClickHouse versions) it is necessary to use this command when changing the infrastructure (changing the IP address of another ClickHouse server or the server used by dictionaries).

@ -30,7 +30,7 @@ It is recommended to use official pre-compiled `deb` packages for Debian or Ubuntu

If you want to use the most recent version, replace `stable` with `testing` (this is recommended for your testing environments).

You can also download and install packages manually from [here](https://repo.clickhouse.tech/deb/stable/main/).
You can also download and install packages manually from [here](https://repo.clickhouse.com/deb/stable/main/).

#### Packages {#packages}

@ -47,8 +47,8 @@ For CentOS, RedHat, and all other rpm-based Linux distributions

``` bash
sudo yum install yum-utils
sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64
sudo rpm --import https://repo.clickhouse.com/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.com/rpm/stable/x86_64
```

If you want to use the most recent version, replace `stable` with `testing` (this is recommended for your testing environments). `prestable` is also sometimes available.

@ -59,20 +59,20 @@ sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_6
sudo yum install clickhouse-server clickhouse-client
```

You can also download and install packages manually from [here](https://repo.clickhouse.tech/rpm/stable/x86_64).
You can also download and install packages manually from [here](https://repo.clickhouse.com/rpm/stable/x86_64).

### From Tgz Archives {#from-tgz-archives}

It is recommended to use official pre-compiled `tgz` archives for all Linux distributions where installation of `deb` or `rpm` packages is not possible.

The required version can be downloaded with `curl` or `wget` from the repository https://repo.clickhouse.tech/tgz/. After that, unpack the downloaded archive and install it with the installation scripts. Example for the latest version:
The required version can be downloaded with `curl` or `wget` from the repository https://repo.clickhouse.com/tgz/. After that, unpack the downloaded archive and install it with the installation scripts. Example for the latest version:

``` bash
export LATEST_VERSION=`curl https://api.github.com/repos/ClickHouse/ClickHouse/tags 2>/dev/null | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | head -n 1`
curl -O https://repo.clickhouse.tech/tgz/clickhouse-common-static-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.tech/tgz/clickhouse-common-static-dbg-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.tech/tgz/clickhouse-server-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.tech/tgz/clickhouse-client-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/clickhouse-common-static-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/clickhouse-common-static-dbg-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/clickhouse-server-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/clickhouse-client-$LATEST_VERSION.tgz

tar -xzvf clickhouse-common-static-$LATEST_VERSION.tgz
sudo clickhouse-common-static-$LATEST_VERSION/install/doinst.sh
@ -9,12 +9,16 @@ toc_title: "Introduction"
This section describes how to obtain example datasets and import them into ClickHouse.
For some datasets example queries are also available.

- [Anonymized Yandex.Metrica Data](metrica.md)
- [Star Schema Benchmark](star-schema.md)
- [WikiStat](wikistat.md)
- [Terabyte of Click Logs from Criteo](criteo.md)
- [AMPLab Big Data Benchmark](amplab-benchmark.md)
- [New York Taxi Data](nyc-taxi.md)
- [OnTime](ontime.md)
- [Anonymized Yandex.Metrica Data](../../getting-started/example-datasets/metrica.md)
- [Star Schema Benchmark](../../getting-started/example-datasets/star-schema.md)
- [Recipes Dataset](../../getting-started/example-datasets/recipes.md)
- [WikiStat](../../getting-started/example-datasets/wikistat.md)
- [Terabyte of Click Logs from Criteo](../../getting-started/example-datasets/criteo.md)
- [AMPLab Big Data Benchmark](../../getting-started/example-datasets/amplab-benchmark.md)
- [New York Taxi Data](../../getting-started/example-datasets/nyc-taxi.md)
- [OpenSky Network 2020 Air Traffic Dataset](../../getting-started/example-datasets/opensky.md)
- [UK Property Price Paid](../../getting-started/example-datasets/uk-price-paid.md)
- [OnTime](../../getting-started/example-datasets/ontime.md)
- [Cell Towers](../../getting-started/example-datasets/cell-towers.md)

[Original article](https://clickhouse.tech/docs/ru/getting_started/example_datasets) <!--hide-->

@ -1 +0,0 @@
../../../en/getting-started/example-datasets/menus.md

360
docs/ru/getting-started/example-datasets/menus.md
Normal file
@ -0,0 +1,360 @@
---
toc_priority: 21
toc_title: Menus
---

# New York Public Library "What's on the Menu?" Dataset {#menus-dataset}

The dataset is created by the New York Public Library. It contains historical data on the menus of hotels, restaurants and cafés, with dishes and their prices.

Source: http://menus.nypl.org/data
The data is in the public domain.

The data is taken from the library's archive, so it may be incomplete and hard to use for statistical analysis. Nevertheless, it is also very interesting.
There are just 1.3 million records about dishes in the menus — a very small data volume for ClickHouse, but it is still a good example.

## Download the Dataset {#download-dataset}

Run the command:

```bash
wget https://s3.amazonaws.com/menusdata.nypl.org/gzips/2021_08_01_07_01_17_data.tgz
```

If needed, replace the link with an up-to-date one from http://menus.nypl.org/data.
The archive size is about 35 MB.

## Unpack the Dataset {#unpack-dataset}

```bash
tar xvf 2021_08_01_07_01_17_data.tgz
```

The uncompressed size is about 150 MB.

The data is normalized and consists of four tables:
- `Menu` — information about menus: the name of the restaurant, the date when the menu was seen, etc.
- `Dish` — information about dishes: the name of the dish along with some characteristics.
- `MenuPage` — information about the pages in the menus, because every page belongs to some menu.
- `MenuItem` — an item of a menu. A dish along with its price on some menu page: links to the dish and the menu page.

## Create the Tables {#create-tables}

The [Decimal](../../sql-reference/data-types/decimal.md) data type is used to store prices.

```sql
CREATE TABLE dish
(
    id UInt32,
    name String,
    description String,
    menus_appeared UInt32,
    times_appeared Int32,
    first_appeared UInt16,
    last_appeared UInt16,
    lowest_price Decimal64(3),
    highest_price Decimal64(3)
) ENGINE = MergeTree ORDER BY id;

CREATE TABLE menu
(
    id UInt32,
    name String,
    sponsor String,
    event String,
    venue String,
    place String,
    physical_description String,
    occasion String,
    notes String,
    call_number String,
    keywords String,
    language String,
    date String,
    location String,
    location_type String,
    currency String,
    currency_symbol String,
    status String,
    page_count UInt16,
    dish_count UInt16
) ENGINE = MergeTree ORDER BY id;

CREATE TABLE menu_page
(
    id UInt32,
    menu_id UInt32,
    page_number UInt16,
    image_id String,
    full_height UInt16,
    full_width UInt16,
    uuid UUID
) ENGINE = MergeTree ORDER BY id;

CREATE TABLE menu_item
(
    id UInt32,
    menu_page_id UInt32,
    price Decimal64(3),
    high_price Decimal64(3),
    dish_id UInt32,
    created_at DateTime,
    updated_at DateTime,
    xpos Float64,
    ypos Float64
) ENGINE = MergeTree ORDER BY id;
```

## Import the Data {#import-data}

To import the data into ClickHouse, run the commands:

```bash
clickhouse-client --format_csv_allow_single_quotes 0 --input_format_null_as_default 0 --query "INSERT INTO dish FORMAT CSVWithNames" < Dish.csv
clickhouse-client --format_csv_allow_single_quotes 0 --input_format_null_as_default 0 --query "INSERT INTO menu FORMAT CSVWithNames" < Menu.csv
clickhouse-client --format_csv_allow_single_quotes 0 --input_format_null_as_default 0 --query "INSERT INTO menu_page FORMAT CSVWithNames" < MenuPage.csv
clickhouse-client --format_csv_allow_single_quotes 0 --input_format_null_as_default 0 --date_time_input_format best_effort --query "INSERT INTO menu_item FORMAT CSVWithNames" < MenuItem.csv
```

Since the data is in CSV format with a header, the [CSVWithNames](../../interfaces/formats.md#csvwithnames) format is used.

Disable `format_csv_allow_single_quotes`, because only double quotes are used in the data, while single quotes can appear inside values and should not confuse the CSV parser.

Disable [input_format_null_as_default](../../operations/settings/settings.md#settings-input-format-null-as-default), because there are no [NULL](../../sql-reference/syntax.md#null-literal) values in the data.

Otherwise ClickHouse would try to parse `\N` sequences and could confuse them with `\` in the data.

The [date_time_input_format best_effort](../../operations/settings/settings.md#settings-date_time_input_format) setting allows parsing [DateTime](../../sql-reference/data-types/datetime.md) fields in a wide variety of formats. For example, ISO-8601 without seconds, like '2000-01-01 01:02', will be recognized. Without this setting only the fixed DateTime format is allowed.
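
To get a feel for what this flexible parsing accepts, you can experiment with the related `parseDateTimeBestEffort` function (an illustrative aside, not part of the import itself):

```sql
-- Both literals are parsed, although only the first one matches the fixed format.
SELECT
    parseDateTimeBestEffort('2000-01-01 01:02:03'),
    parseDateTimeBestEffort('2000-01-01 01:02');
```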

## Denormalize the Data {#denormalize-data}

The data is presented in multiple tables in [normalized form](https://ru.wikipedia.org/wiki/%D0%9D%D0%BE%D1%80%D0%BC%D0%B0%D0%BB%D1%8C%D0%BD%D0%B0%D1%8F_%D1%84%D0%BE%D1%80%D0%BC%D0%B0).

This means you have to use a [JOIN](../../sql-reference/statements/select/join.md#select-join) if you want to get, for example, dish names from menu items.

For typical analytical tasks it is far more efficient to work with pre-JOINed data, to avoid doing a `JOIN` every time. Such data is called denormalized.

Create a table `menu_item_denorm` that will contain all the data JOINed together:

```sql
CREATE TABLE menu_item_denorm
ENGINE = MergeTree ORDER BY (dish_name, created_at)
AS SELECT
    price,
    high_price,
    created_at,
    updated_at,
    xpos,
    ypos,
    dish.id AS dish_id,
    dish.name AS dish_name,
    dish.description AS dish_description,
    dish.menus_appeared AS dish_menus_appeared,
    dish.times_appeared AS dish_times_appeared,
    dish.first_appeared AS dish_first_appeared,
    dish.last_appeared AS dish_last_appeared,
    dish.lowest_price AS dish_lowest_price,
    dish.highest_price AS dish_highest_price,
    menu.id AS menu_id,
    menu.name AS menu_name,
    menu.sponsor AS menu_sponsor,
    menu.event AS menu_event,
    menu.venue AS menu_venue,
    menu.place AS menu_place,
    menu.physical_description AS menu_physical_description,
    menu.occasion AS menu_occasion,
    menu.notes AS menu_notes,
    menu.call_number AS menu_call_number,
    menu.keywords AS menu_keywords,
    menu.language AS menu_language,
    menu.date AS menu_date,
    menu.location AS menu_location,
    menu.location_type AS menu_location_type,
    menu.currency AS menu_currency,
    menu.currency_symbol AS menu_currency_symbol,
    menu.status AS menu_status,
    menu.page_count AS menu_page_count,
    menu.dish_count AS menu_dish_count
FROM menu_item
    JOIN dish ON menu_item.dish_id = dish.id
    JOIN menu_page ON menu_item.menu_page_id = menu_page.id
    JOIN menu ON menu_page.menu_id = menu.id;
```

## Validate the Data {#validate-data}

Query:

```sql
SELECT count() FROM menu_item_denorm;
```

Result:

```text
┌─count()─┐
│ 1329175 │
└─────────┘
```

## Example Queries {#run-queries}

### Averaged Historical Prices of Dishes {#query-averaged-historical-prices}

Query:

```sql
SELECT
    round(toUInt32OrZero(extract(menu_date, '^\\d{4}')), -1) AS d,
    count(),
    round(avg(price), 2),
    bar(avg(price), 0, 100, 100)
FROM menu_item_denorm
WHERE (menu_currency = 'Dollars') AND (d > 0) AND (d < 2022)
GROUP BY d
ORDER BY d ASC;
```

Result:

```text
┌────d─┬─count()─┬─round(avg(price), 2)─┬─bar(avg(price), 0, 100, 100)─┐
│ 1850 │ 618 │ 1.5 │ █▍ │
│ 1860 │ 1634 │ 1.29 │ █▎ │
│ 1870 │ 2215 │ 1.36 │ █▎ │
│ 1880 │ 3909 │ 1.01 │ █ │
│ 1890 │ 8837 │ 1.4 │ █▍ │
│ 1900 │ 176292 │ 0.68 │ ▋ │
│ 1910 │ 212196 │ 0.88 │ ▊ │
│ 1920 │ 179590 │ 0.74 │ ▋ │
│ 1930 │ 73707 │ 0.6 │ ▌ │
│ 1940 │ 58795 │ 0.57 │ ▌ │
│ 1950 │ 41407 │ 0.95 │ ▊ │
│ 1960 │ 51179 │ 1.32 │ █▎ │
│ 1970 │ 12914 │ 1.86 │ █▋ │
│ 1980 │ 7268 │ 4.35 │ ████▎ │
│ 1990 │ 11055 │ 6.03 │ ██████ │
│ 2000 │ 2467 │ 11.85 │ ███████████▋ │
│ 2010 │ 597 │ 25.66 │ █████████████████████████▋ │
└──────┴─────────┴──────────────────────┴──────────────────────────────┘
```

Just don't take it too seriously.

### Burger Prices {#query-burger-prices}

Query:

```sql
SELECT
    round(toUInt32OrZero(extract(menu_date, '^\\d{4}')), -1) AS d,
    count(),
    round(avg(price), 2),
    bar(avg(price), 0, 50, 100)
FROM menu_item_denorm
WHERE (menu_currency = 'Dollars') AND (d > 0) AND (d < 2022) AND (dish_name ILIKE '%burger%')
GROUP BY d
ORDER BY d ASC;
```

Result:

```text
┌────d─┬─count()─┬─round(avg(price), 2)─┬─bar(avg(price), 0, 50, 100)───────────┐
│ 1880 │ 2 │ 0.42 │ ▋ │
│ 1890 │ 7 │ 0.85 │ █▋ │
│ 1900 │ 399 │ 0.49 │ ▊ │
│ 1910 │ 589 │ 0.68 │ █▎ │
│ 1920 │ 280 │ 0.56 │ █ │
│ 1930 │ 74 │ 0.42 │ ▋ │
│ 1940 │ 119 │ 0.59 │ █▏ │
│ 1950 │ 134 │ 1.09 │ ██▏ │
│ 1960 │ 272 │ 0.92 │ █▋ │
│ 1970 │ 108 │ 1.18 │ ██▎ │
│ 1980 │ 88 │ 2.82 │ █████▋ │
│ 1990 │ 184 │ 3.68 │ ███████▎ │
│ 2000 │ 21 │ 7.14 │ ██████████████▎ │
│ 2010 │ 6 │ 18.42 │ ████████████████████████████████████▋ │
└──────┴─────────┴──────────────────────┴───────────────────────────────────────┘
```

### Vodka {#query-vodka}

Query:

```sql
SELECT
    round(toUInt32OrZero(extract(menu_date, '^\\d{4}')), -1) AS d,
    count(),
    round(avg(price), 2),
    bar(avg(price), 0, 50, 100)
FROM menu_item_denorm
WHERE (menu_currency IN ('Dollars', '')) AND (d > 0) AND (d < 2022) AND (dish_name ILIKE '%vodka%')
GROUP BY d
ORDER BY d ASC;
```

Result:

```text
┌────d─┬─count()─┬─round(avg(price), 2)─┬─bar(avg(price), 0, 50, 100)─┐
│ 1910 │ 2 │ 0 │ │
│ 1920 │ 1 │ 0.3 │ ▌ │
│ 1940 │ 21 │ 0.42 │ ▋ │
│ 1950 │ 14 │ 0.59 │ █▏ │
│ 1960 │ 113 │ 2.17 │ ████▎ │
│ 1970 │ 37 │ 0.68 │ █▎ │
│ 1980 │ 19 │ 2.55 │ █████ │
│ 1990 │ 86 │ 3.6 │ ███████▏ │
│ 2000 │ 2 │ 3.98 │ ███████▊ │
└──────┴─────────┴──────────────────────┴─────────────────────────────┘
```

To get vodka we have to write `ILIKE '%vodka%'`, and this is definitely a good idea.

### Caviar {#query-caviar}

Look at the prices of caviar. Also get the name of any dish with caviar.

Query:

```sql
SELECT
    round(toUInt32OrZero(extract(menu_date, '^\\d{4}')), -1) AS d,
    count(),
    round(avg(price), 2),
    bar(avg(price), 0, 50, 100),
    any(dish_name)
FROM menu_item_denorm
WHERE (menu_currency IN ('Dollars', '')) AND (d > 0) AND (d < 2022) AND (dish_name ILIKE '%caviar%')
GROUP BY d
ORDER BY d ASC;
```

Result:

```text
┌────d─┬─count()─┬─round(avg(price), 2)─┬─bar(avg(price), 0, 50, 100)──────┬─any(dish_name)──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ 1090 │ 1 │ 0 │ │ Caviar │
│ 1880 │ 3 │ 0 │ │ Caviar │
│ 1890 │ 39 │ 0.59 │ █▏ │ Butter and caviar │
│ 1900 │ 1014 │ 0.34 │ ▋ │ Anchovy Caviar on Toast │
│ 1910 │ 1588 │ 1.35 │ ██▋ │ 1/1 Brötchen Caviar │
│ 1920 │ 927 │ 1.37 │ ██▋ │ ASTRAKAN CAVIAR │
│ 1930 │ 289 │ 1.91 │ ███▋ │ Astrachan caviar │
│ 1940 │ 201 │ 0.83 │ █▋ │ (SPECIAL) Domestic Caviar Sandwich │
│ 1950 │ 81 │ 2.27 │ ████▌ │ Beluga Caviar │
│ 1960 │ 126 │ 2.21 │ ████▍ │ Beluga Caviar │
│ 1970 │ 105 │ 0.95 │ █▊ │ BELUGA MALOSSOL CAVIAR AMERICAN DRESSING │
│ 1980 │ 12 │ 7.22 │ ██████████████▍ │ Authentic Iranian Beluga Caviar the world's finest black caviar presented in ice garni and a sampling of chilled 100° Russian vodka │
│ 1990 │ 74 │ 14.42 │ ████████████████████████████▋ │ Avocado Salad, Fresh cut avocado with caviare │
│ 2000 │ 3 │ 7.82 │ ███████████████▋ │ Aufgeschlagenes Kartoffelsueppchen mit Forellencaviar │
│ 2010 │ 6 │ 15.58 │ ███████████████████████████████▏ │ "OYSTERS AND PEARLS" "Sabayon" of Pearl Tapioca with Island Creek Oysters and Russian Sevruga Caviar │
└──────┴─────────┴──────────────────────┴──────────────────────────────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```

At least they have caviar with vodka. Very nice.

## Online Playground {#playground}

The dataset is also available in the interactive [Online Playground](https://gh-api.clickhouse.tech/play?user=play#U0VMRUNUCiAgICByb3VuZCh0b1VJbnQzMk9yWmVybyhleHRyYWN0KG1lbnVfZGF0ZSwgJ15cXGR7NH0nKSksIC0xKSBBUyBkLAogICAgY291bnQoKSwKICAgIHJvdW5kKGF2ZyhwcmljZSksIDIpLAogICAgYmFyKGF2ZyhwcmljZSksIDAsIDUwLCAxMDApLAogICAgYW55KGRpc2hfbmFtZSkKRlJPTSBtZW51X2l0ZW1fZGVub3JtCldIRVJFIChtZW51X2N1cnJlbmN5IElOICgnRG9sbGFycycsICcnKSkgQU5EIChkID4gMCkgQU5EIChkIDwgMjAyMikgQU5EIChkaXNoX25hbWUgSUxJS0UgJyVjYXZpYXIlJykKR1JPVVAgQlkgZApPUkRFUiBCWSBkIEFTQw==).

@ -1 +0,0 @@
../../../en/getting-started/example-datasets/opensky.md

422
docs/ru/getting-started/example-datasets/opensky.md
Normal file
@ -0,0 +1,422 @@
---
toc_priority: 20
toc_title: OpenSky Network 2020 Air Traffic Data
---

# OpenSky Network 2020 Air Traffic Dataset {#opensky}

"The data in this dataset is derived and cleaned from the full OpenSky dataset to illustrate the development of air traffic during the COVID-19 pandemic. It includes all flights seen by more than 2500 members of the network since 1 January 2019. More data will be periodically added to the dataset until the end of the COVID-19 pandemic."

Source: https://zenodo.org/record/5092942#.YRBCyTpRXYd

Martin Strohmeier, Xavier Olive, Jannis Lübbe, Matthias Schäfer, and Vincent Lenders
"Crowdsourced air traffic data from the OpenSky Network 2019–2020"
Earth System Science Data 13(2), 2021
https://doi.org/10.5194/essd-13-357-2021

## Download the Dataset {#download-dataset}

Run the command:

```bash
wget -O- https://zenodo.org/record/5092942 | grep -oP 'https://zenodo.org/record/5092942/files/flightlist_\d+_\d+\.csv\.gz' | xargs wget
```

The download will take about 2 minutes with a good internet connection. There are 30 files with a total size of 4.3 GB.

## Create the Table {#create-table}

```sql
CREATE TABLE opensky
(
    callsign String,
    number String,
    icao24 String,
    registration String,
    typecode String,
    origin String,
    destination String,
    firstseen DateTime,
    lastseen DateTime,
    day DateTime,
    latitude_1 Float64,
    longitude_1 Float64,
    altitude_1 Float64,
    latitude_2 Float64,
    longitude_2 Float64,
    altitude_2 Float64
) ENGINE = MergeTree ORDER BY (origin, destination, callsign);
```

## Import Data into ClickHouse {#import-data}

Upload the data to ClickHouse in parallel:

```bash
ls -1 flightlist_*.csv.gz | xargs -P100 -I{} bash -c 'gzip -c -d "{}" | clickhouse-client --date_time_input_format best_effort --query "INSERT INTO opensky FORMAT CSVWithNames"'
```

- The list of files (`ls -1 flightlist_*.csv.gz`) is passed to `xargs` for parallel processing.
- `xargs -P100` allows up to 100 parallel workers, but since we only have 30 files, the number of workers will be just 30.
- For every file, `xargs` runs a script with `bash -c`. The script has a substitution in the form of `{}`, and the `xargs` command substitutes the filename for it (we asked `xargs` for this with `-I{}`).
- The script decompresses the file (`gzip -c -d "{}"`) to standard output (the `-c` parameter) and the output is redirected to `clickhouse-client`.
- To recognize the ISO-8601 format with timezone offsets in [DateTime](../../sql-reference/data-types/datetime.md) fields, the parser option [--date_time_input_format best_effort](../../operations/settings/settings.md#settings-date_time_input_format) is specified.

As a result, `clickhouse-client` inserts the data into the `opensky` table. The input data is imported in the [CSVWithNames](../../interfaces/formats.md#csvwithnames) format.

The parallel upload takes about 24 seconds.

You can also use the sequential variant:

```bash
for file in flightlist_*.csv.gz; do gzip -c -d "$file" | clickhouse-client --date_time_input_format best_effort --query "INSERT INTO opensky FORMAT CSVWithNames"; done
```

## Validate the Imported Data {#validate-data}

Query:

```sql
SELECT count() FROM opensky;
```

Result:

```text
┌──count()─┐
│ 66010819 │
└──────────┘
```

Check that the size of the dataset in ClickHouse is only 2.66 GiB.

Query:

```sql
SELECT formatReadableSize(total_bytes) FROM system.tables WHERE name = 'opensky';
```

Result:

```text
┌─formatReadableSize(total_bytes)─┐
│ 2.66 GiB                        │
└─────────────────────────────────┘
```

## Examples {#run-queries}

The total distance travelled is 68 billion kilometers.

Query:

```sql
SELECT formatReadableQuantity(sum(geoDistance(longitude_1, latitude_1, longitude_2, latitude_2)) / 1000) FROM opensky;
```

Result:

```text
┌─formatReadableQuantity(divide(sum(geoDistance(longitude_1, latitude_1, longitude_2, latitude_2)), 1000))─┐
│ 68.72 billion                                                                                            │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```

The average flight distance is around 1000 km.

Query:

```sql
SELECT avg(geoDistance(longitude_1, latitude_1, longitude_2, latitude_2)) FROM opensky;
```

Result:

```text
┌─avg(geoDistance(longitude_1, latitude_1, longitude_2, latitude_2))─┐
│                                                  1041090.6465708319 │
└────────────────────────────────────────────────────────────────────┘
```

### The busiest airports at the given coordinates and the average distance travelled {#busy-airports-average-distance}

Query:

```sql
SELECT
    origin,
    count(),
    round(avg(geoDistance(longitude_1, latitude_1, longitude_2, latitude_2))) AS distance,
    bar(distance, 0, 10000000, 100) AS bar
FROM opensky
WHERE origin != ''
GROUP BY origin
ORDER BY count() DESC
LIMIT 100;
```

Result:

```text
┌─origin─┬─count()─┬─distance─┬─bar────────────────────────────────────┐
1. │ KORD │ 745007 │ 1546108 │ ███████████████▍ │
2. │ KDFW │ 696702 │ 1358721 │ █████████████▌ │
3. │ KATL │ 667286 │ 1169661 │ ███████████▋ │
4. │ KDEN │ 582709 │ 1287742 │ ████████████▊ │
5. │ KLAX │ 581952 │ 2628393 │ ██████████████████████████▎ │
6. │ KLAS │ 447789 │ 1336967 │ █████████████▎ │
7. │ KPHX │ 428558 │ 1345635 │ █████████████▍ │
8. │ KSEA │ 412592 │ 1757317 │ █████████████████▌ │
9. │ KCLT │ 404612 │ 880355 │ ████████▋ │
10. │ VIDP │ 363074 │ 1445052 │ ██████████████▍ │
11. │ EDDF │ 362643 │ 2263960 │ ██████████████████████▋ │
12. │ KSFO │ 361869 │ 2445732 │ ████████████████████████▍ │
13. │ KJFK │ 349232 │ 2996550 │ █████████████████████████████▊ │
14. │ KMSP │ 346010 │ 1287328 │ ████████████▋ │
15. │ LFPG │ 344748 │ 2206203 │ ██████████████████████ │
16. │ EGLL │ 341370 │ 3216593 │ ████████████████████████████████▏ │
17. │ EHAM │ 340272 │ 2116425 │ █████████████████████▏ │
18. │ KEWR │ 337696 │ 1826545 │ ██████████████████▎ │
19. │ KPHL │ 320762 │ 1291761 │ ████████████▊ │
20. │ OMDB │ 308855 │ 2855706 │ ████████████████████████████▌ │
21. │ UUEE │ 307098 │ 1555122 │ ███████████████▌ │
22. │ KBOS │ 304416 │ 1621675 │ ████████████████▏ │
23. │ LEMD │ 291787 │ 1695097 │ ████████████████▊ │
24. │ YSSY │ 272979 │ 1875298 │ ██████████████████▋ │
25. │ KMIA │ 265121 │ 1923542 │ ███████████████████▏ │
26. │ ZGSZ │ 263497 │ 745086 │ ███████▍ │
27. │ EDDM │ 256691 │ 1361453 │ █████████████▌ │
28. │ WMKK │ 254264 │ 1626688 │ ████████████████▎ │
29. │ CYYZ │ 251192 │ 2175026 │ █████████████████████▋ │
30. │ KLGA │ 248699 │ 1106935 │ ███████████ │
31. │ VHHH │ 248473 │ 3457658 │ ██████████████████████████████████▌ │
32. │ RJTT │ 243477 │ 1272744 │ ████████████▋ │
33. │ KBWI │ 241440 │ 1187060 │ ███████████▋ │
34. │ KIAD │ 239558 │ 1683485 │ ████████████████▋ │
35. │ KIAH │ 234202 │ 1538335 │ ███████████████▍ │
36. │ KFLL │ 223447 │ 1464410 │ ██████████████▋ │
37. │ KDAL │ 212055 │ 1082339 │ ██████████▋ │
38. │ KDCA │ 207883 │ 1013359 │ ██████████▏ │
39. │ LIRF │ 207047 │ 1427965 │ ██████████████▎ │
40. │ PANC │ 206007 │ 2525359 │ █████████████████████████▎ │
41. │ LTFJ │ 205415 │ 860470 │ ████████▌ │
42. │ KDTW │ 204020 │ 1106716 │ ███████████ │
43. │ VABB │ 201679 │ 1300865 │ █████████████ │
44. │ OTHH │ 200797 │ 3759544 │ █████████████████████████████████████▌ │
45. │ KMDW │ 200796 │ 1232551 │ ████████████▎ │
46. │ KSAN │ 198003 │ 1495195 │ ██████████████▊ │
47. │ KPDX │ 197760 │ 1269230 │ ████████████▋ │
48. │ SBGR │ 197624 │ 2041697 │ ████████████████████▍ │
49. │ VOBL │ 189011 │ 1040180 │ ██████████▍ │
50. │ LEBL │ 188956 │ 1283190 │ ████████████▋ │
51. │ YBBN │ 188011 │ 1253405 │ ████████████▌ │
52. │ LSZH │ 187934 │ 1572029 │ ███████████████▋ │
53. │ YMML │ 187643 │ 1870076 │ ██████████████████▋ │
54. │ RCTP │ 184466 │ 2773976 │ ███████████████████████████▋ │
55. │ KSNA │ 180045 │ 778484 │ ███████▋ │
56. │ EGKK │ 176420 │ 1694770 │ ████████████████▊ │
57. │ LOWW │ 176191 │ 1274833 │ ████████████▋ │
58. │ UUDD │ 176099 │ 1368226 │ █████████████▋ │
59. │ RKSI │ 173466 │ 3079026 │ ██████████████████████████████▋ │
60. │ EKCH │ 172128 │ 1229895 │ ████████████▎ │
61. │ KOAK │ 171119 │ 1114447 │ ███████████▏ │
62. │ RPLL │ 170122 │ 1440735 │ ██████████████▍ │
63. │ KRDU │ 167001 │ 830521 │ ████████▎ │
64. │ KAUS │ 164524 │ 1256198 │ ████████████▌ │
65. │ KBNA │ 163242 │ 1022726 │ ██████████▏ │
66. │ KSDF │ 162655 │ 1380867 │ █████████████▋ │
67. │ ENGM │ 160732 │ 910108 │ █████████ │
68. │ LIMC │ 160696 │ 1564620 │ ███████████████▋ │
69. │ KSJC │ 159278 │ 1081125 │ ██████████▋ │
70. │ KSTL │ 157984 │ 1026699 │ ██████████▎ │
71. │ UUWW │ 156811 │ 1261155 │ ████████████▌ │
72. │ KIND │ 153929 │ 987944 │ █████████▊ │
73. │ ESSA │ 153390 │ 1203439 │ ████████████ │
74. │ KMCO │ 153351 │ 1508657 │ ███████████████ │
75. │ KDVT │ 152895 │ 74048 │ ▋ │
76. │ VTBS │ 152645 │ 2255591 │ ██████████████████████▌ │
77. │ CYVR │ 149574 │ 2027413 │ ████████████████████▎ │
78. │ EIDW │ 148723 │ 1503985 │ ███████████████ │
79. │ LFPO │ 143277 │ 1152964 │ ███████████▌ │
80. │ EGSS │ 140830 │ 1348183 │ █████████████▍ │
81. │ KAPA │ 140776 │ 420441 │ ████▏ │
82. │ KHOU │ 138985 │ 1068806 │ ██████████▋ │
83. │ KTPA │ 138033 │ 1338223 │ █████████████▍ │
84. │ KFFZ │ 137333 │ 55397 │ ▌ │
85. │ NZAA │ 136092 │ 1581264 │ ███████████████▋ │
86. │ YPPH │ 133916 │ 1271550 │ ████████████▋ │
87. │ RJBB │ 133522 │ 1805623 │ ██████████████████ │
88. │ EDDL │ 133018 │ 1265919 │ ████████████▋ │
89. │ ULLI │ 130501 │ 1197108 │ ███████████▊ │
90. │ KIWA │ 127195 │ 250876 │ ██▌ │
91. │ KTEB │ 126969 │ 1189414 │ ███████████▊ │
92. │ VOMM │ 125616 │ 1127757 │ ███████████▎ │
93. │ LSGG │ 123998 │ 1049101 │ ██████████▍ │
94. │ LPPT │ 122733 │ 1779187 │ █████████████████▋ │
95. │ WSSS │ 120493 │ 3264122 │ ████████████████████████████████▋ │
96. │ EBBR │ 118539 │ 1579939 │ ███████████████▋ │
97. │ VTBD │ 118107 │ 661627 │ ██████▌ │
98. │ KVNY │ 116326 │ 692960 │ ██████▊ │
99. │ EDDT │ 115122 │ 941740 │ █████████▍ │
100. │ EFHK │ 114860 │ 1629143 │ ████████████████▎ │
└────────┴─────────┴──────────┴────────────────────────────────────────┘
```

### Number of flights from three major Moscow airports, weekly {#flights-from-moscow}

Query:

```sql
SELECT
    toMonday(day) AS k,
    count() AS c,
    bar(c, 0, 10000, 100) AS bar
FROM opensky
WHERE origin IN ('UUEE', 'UUDD', 'UUWW')
GROUP BY k
ORDER BY k ASC;
```

Result:

```text
┌──────────k─┬────c─┬─bar──────────────────────────────────────────────────────────────────────────┐
1. │ 2018-12-31 │ 5248 │ ████████████████████████████████████████████████████▍ │
2. │ 2019-01-07 │ 6302 │ ███████████████████████████████████████████████████████████████ │
3. │ 2019-01-14 │ 5701 │ █████████████████████████████████████████████████████████ │
4. │ 2019-01-21 │ 5638 │ ████████████████████████████████████████████████████████▍ │
5. │ 2019-01-28 │ 5731 │ █████████████████████████████████████████████████████████▎ │
6. │ 2019-02-04 │ 5683 │ ████████████████████████████████████████████████████████▋ │
7. │ 2019-02-11 │ 5759 │ █████████████████████████████████████████████████████████▌ │
8. │ 2019-02-18 │ 5736 │ █████████████████████████████████████████████████████████▎ │
9. │ 2019-02-25 │ 5873 │ ██████████████████████████████████████████████████████████▋ │
10. │ 2019-03-04 │ 5965 │ ███████████████████████████████████████████████████████████▋ │
11. │ 2019-03-11 │ 5900 │ ███████████████████████████████████████████████████████████ │
12. │ 2019-03-18 │ 5823 │ ██████████████████████████████████████████████████████████▏ │
13. │ 2019-03-25 │ 5899 │ ██████████████████████████████████████████████████████████▊ │
14. │ 2019-04-01 │ 6043 │ ████████████████████████████████████████████████████████████▍ │
15. │ 2019-04-08 │ 6098 │ ████████████████████████████████████████████████████████████▊ │
16. │ 2019-04-15 │ 6196 │ █████████████████████████████████████████████████████████████▊ │
17. │ 2019-04-22 │ 6486 │ ████████████████████████████████████████████████████████████████▋ │
18. │ 2019-04-29 │ 6682 │ ██████████████████████████████████████████████████████████████████▋ │
19. │ 2019-05-06 │ 6739 │ ███████████████████████████████████████████████████████████████████▍ │
20. │ 2019-05-13 │ 6600 │ ██████████████████████████████████████████████████████████████████ │
21. │ 2019-05-20 │ 6575 │ █████████████████████████████████████████████████████████████████▋ │
22. │ 2019-05-27 │ 6786 │ ███████████████████████████████████████████████████████████████████▋ │
23. │ 2019-06-03 │ 6872 │ ████████████████████████████████████████████████████████████████████▋ │
24. │ 2019-06-10 │ 7045 │ ██████████████████████████████████████████████████████████████████████▍ │
25. │ 2019-06-17 │ 7045 │ ██████████████████████████████████████████████████████████████████████▍ │
26. │ 2019-06-24 │ 6852 │ ████████████████████████████████████████████████████████████████████▌ │
27. │ 2019-07-01 │ 7248 │ ████████████████████████████████████████████████████████████████████████▍ │
28. │ 2019-07-08 │ 7284 │ ████████████████████████████████████████████████████████████████████████▋ │
29. │ 2019-07-15 │ 7142 │ ███████████████████████████████████████████████████████████████████████▍ │
30. │ 2019-07-22 │ 7108 │ ███████████████████████████████████████████████████████████████████████ │
31. │ 2019-07-29 │ 7251 │ ████████████████████████████████████████████████████████████████████████▌ │
32. │ 2019-08-05 │ 7403 │ ██████████████████████████████████████████████████████████████████████████ │
33. │ 2019-08-12 │ 7457 │ ██████████████████████████████████████████████████████████████████████████▌ │
34. │ 2019-08-19 │ 7502 │ ███████████████████████████████████████████████████████████████████████████ │
35. │ 2019-08-26 │ 7540 │ ███████████████████████████████████████████████████████████████████████████▍ │
36. │ 2019-09-02 │ 7237 │ ████████████████████████████████████████████████████████████████████████▎ │
37. │ 2019-09-09 │ 7328 │ █████████████████████████████████████████████████████████████████████████▎ │
38. │ 2019-09-16 │ 5566 │ ███████████████████████████████████████████████████████▋ │
39. │ 2019-09-23 │ 7049 │ ██████████████████████████████████████████████████████████████████████▍ │
40. │ 2019-09-30 │ 6880 │ ████████████████████████████████████████████████████████████████████▋ │
41. │ 2019-10-07 │ 6518 │ █████████████████████████████████████████████████████████████████▏ │
42. │ 2019-10-14 │ 6688 │ ██████████████████████████████████████████████████████████████████▊ │
43. │ 2019-10-21 │ 6667 │ ██████████████████████████████████████████████████████████████████▋ │
44. │ 2019-10-28 │ 6303 │ ███████████████████████████████████████████████████████████████ │
45. │ 2019-11-04 │ 6298 │ ██████████████████████████████████████████████████████████████▊ │
46. │ 2019-11-11 │ 6137 │ █████████████████████████████████████████████████████████████▎ │
47. │ 2019-11-18 │ 6051 │ ████████████████████████████████████████████████████████████▌ │
48. │ 2019-11-25 │ 5820 │ ██████████████████████████████████████████████████████████▏ │
49. │ 2019-12-02 │ 5942 │ ███████████████████████████████████████████████████████████▍ │
50. │ 2019-12-09 │ 4891 │ ████████████████████████████████████████████████▊ │
51. │ 2019-12-16 │ 5682 │ ████████████████████████████████████████████████████████▋ │
52. │ 2019-12-23 │ 6111 │ █████████████████████████████████████████████████████████████ │
53. │ 2019-12-30 │ 5870 │ ██████████████████████████████████████████████████████████▋ │
54. │ 2020-01-06 │ 5953 │ ███████████████████████████████████████████████████████████▌ │
55. │ 2020-01-13 │ 5698 │ ████████████████████████████████████████████████████████▊ │
56. │ 2020-01-20 │ 5339 │ █████████████████████████████████████████████████████▍ │
57. │ 2020-01-27 │ 5566 │ ███████████████████████████████████████████████████████▋ │
58. │ 2020-02-03 │ 5801 │ ██████████████████████████████████████████████████████████ │
59. │ 2020-02-10 │ 5692 │ ████████████████████████████████████████████████████████▊ │
60. │ 2020-02-17 │ 5912 │ ███████████████████████████████████████████████████████████ │
61. │ 2020-02-24 │ 6031 │ ████████████████████████████████████████████████████████████▎ │
62. │ 2020-03-02 │ 6105 │ █████████████████████████████████████████████████████████████ │
63. │ 2020-03-09 │ 5823 │ ██████████████████████████████████████████████████████████▏ │
64. │ 2020-03-16 │ 4659 │ ██████████████████████████████████████████████▌ │
65. │ 2020-03-23 │ 3720 │ █████████████████████████████████████▏ │
66. │ 2020-03-30 │ 1720 │ █████████████████▏ │
67. │ 2020-04-06 │ 849 │ ████████▍ │
68. │ 2020-04-13 │ 710 │ ███████ │
69. │ 2020-04-20 │ 725 │ ███████▏ │
70. │ 2020-04-27 │ 920 │ █████████▏ │
71. │ 2020-05-04 │ 859 │ ████████▌ │
72. │ 2020-05-11 │ 1047 │ ██████████▍ │
73. │ 2020-05-18 │ 1135 │ ███████████▎ │
74. │ 2020-05-25 │ 1266 │ ████████████▋ │
75. │ 2020-06-01 │ 1793 │ █████████████████▊ │
76. │ 2020-06-08 │ 1979 │ ███████████████████▋ │
77. │ 2020-06-15 │ 2297 │ ██████████████████████▊ │
78. │ 2020-06-22 │ 2788 │ ███████████████████████████▊ │
79. │ 2020-06-29 │ 3389 │ █████████████████████████████████▊ │
80. │ 2020-07-06 │ 3545 │ ███████████████████████████████████▍ │
81. │ 2020-07-13 │ 3569 │ ███████████████████████████████████▋ │
82. │ 2020-07-20 │ 3784 │ █████████████████████████████████████▋ │
83. │ 2020-07-27 │ 3960 │ ███████████████████████████████████████▌ │
84. │ 2020-08-03 │ 4323 │ ███████████████████████████████████████████▏ │
85. │ 2020-08-10 │ 4581 │ █████████████████████████████████████████████▋ │
86. │ 2020-08-17 │ 4791 │ ███████████████████████████████████████████████▊ │
87. │ 2020-08-24 │ 4928 │ █████████████████████████████████████████████████▎ │
88. │ 2020-08-31 │ 4687 │ ██████████████████████████████████████████████▋ │
89. │ 2020-09-07 │ 4643 │ ██████████████████████████████████████████████▍ │
90. │ 2020-09-14 │ 4594 │ █████████████████████████████████████████████▊ │
91. │ 2020-09-21 │ 4478 │ ████████████████████████████████████████████▋ │
92. │ 2020-09-28 │ 4382 │ ███████████████████████████████████████████▋ │
93. │ 2020-10-05 │ 4261 │ ██████████████████████████████████████████▌ │
94. │ 2020-10-12 │ 4243 │ ██████████████████████████████████████████▍ │
95. │ 2020-10-19 │ 3941 │ ███████████████████████████████████████▍ │
96. │ 2020-10-26 │ 3616 │ ████████████████████████████████████▏ │
97. │ 2020-11-02 │ 3586 │ ███████████████████████████████████▋ │
98. │ 2020-11-09 │ 3403 │ ██████████████████████████████████ │
99. │ 2020-11-16 │ 3336 │ █████████████████████████████████▎ │
100. │ 2020-11-23 │ 3230 │ ████████████████████████████████▎ │
101. │ 2020-11-30 │ 3183 │ ███████████████████████████████▋ │
102. │ 2020-12-07 │ 3285 │ ████████████████████████████████▋ │
103. │ 2020-12-14 │ 3367 │ █████████████████████████████████▋ │
104. │ 2020-12-21 │ 3748 │ █████████████████████████████████████▍ │
105. │ 2020-12-28 │ 3986 │ ███████████████████████████████████████▋ │
106. │ 2021-01-04 │ 3906 │ ███████████████████████████████████████ │
107. │ 2021-01-11 │ 3425 │ ██████████████████████████████████▎ │
108. │ 2021-01-18 │ 3144 │ ███████████████████████████████▍ │
109. │ 2021-01-25 │ 3115 │ ███████████████████████████████▏ │
110. │ 2021-02-01 │ 3285 │ ████████████████████████████████▋ │
111. │ 2021-02-08 │ 3321 │ █████████████████████████████████▏ │
112. │ 2021-02-15 │ 3475 │ ██████████████████████████████████▋ │
113. │ 2021-02-22 │ 3549 │ ███████████████████████████████████▍ │
114. │ 2021-03-01 │ 3755 │ █████████████████████████████████████▌ │
115. │ 2021-03-08 │ 3080 │ ██████████████████████████████▋ │
116. │ 2021-03-15 │ 3789 │ █████████████████████████████████████▊ │
117. │ 2021-03-22 │ 3804 │ ██████████████████████████████████████ │
118. │ 2021-03-29 │ 4238 │ ██████████████████████████████████████████▍ │
119. │ 2021-04-05 │ 4307 │ ███████████████████████████████████████████ │
120. │ 2021-04-12 │ 4225 │ ██████████████████████████████████████████▎ │
121. │ 2021-04-19 │ 4391 │ ███████████████████████████████████████████▊ │
122. │ 2021-04-26 │ 4868 │ ████████████████████████████████████████████████▋ │
123. │ 2021-05-03 │ 4977 │ █████████████████████████████████████████████████▋ │
124. │ 2021-05-10 │ 5164 │ ███████████████████████████████████████████████████▋ │
125. │ 2021-05-17 │ 4986 │ █████████████████████████████████████████████████▋ │
126. │ 2021-05-24 │ 5024 │ ██████████████████████████████████████████████████▏ │
127. │ 2021-05-31 │ 4824 │ ████████████████████████████████████████████████▏ │
128. │ 2021-06-07 │ 5652 │ ████████████████████████████████████████████████████████▌ │
129. │ 2021-06-14 │ 5613 │ ████████████████████████████████████████████████████████▏ │
130. │ 2021-06-21 │ 6061 │ ████████████████████████████████████████████████████████████▌ │
131. │ 2021-06-28 │ 2554 │ █████████████████████████▌ │
└────────────┴──────┴──────────────────────────────────────────────────────────────────────────────┘
```

### Online Playground {#playground}

You can test other queries against this dataset using the interactive [Online Playground](https://gh-api.clickhouse.tech/play?user=play). For example, [like this](https://gh-api.clickhouse.tech/play?user=play#U0VMRUNUCiAgICBvcmlnaW4sCiAgICBjb3VudCgpLAogICAgcm91bmQoYXZnKGdlb0Rpc3RhbmNlKGxvbmdpdHVkZV8xLCBsYXRpdHVkZV8xLCBsb25naXR1ZGVfMiwgbGF0aXR1ZGVfMikpKSBBUyBkaXN0YW5jZSwKICAgIGJhcihkaXN0YW5jZSwgMCwgMTAwMDAwMDAsIDEwMCkgQVMgYmFyCkZST00gb3BlbnNreQpXSEVSRSBvcmlnaW4gIT0gJycKR1JPVVAgQlkgb3JpZ2luCk9SREVSIEJZIGNvdW50KCkgREVTQwpMSU1JVCAxMDA=). Note, however, that you cannot create temporary tables there.

@ -1 +0,0 @@
../../../en/getting-started/example-datasets/uk-price-paid.md
650
docs/ru/getting-started/example-datasets/uk-price-paid.md
Normal file
@ -0,0 +1,650 @@

---
toc_priority: 20
toc_title: UK Property Price Paid
---

# UK Property Price Paid {#uk-property-price-paid}

The dataset contains data about prices paid for real-estate property in England and Wales. The data is available since 1995.
The size of the dataset in uncompressed form is about 4 GiB, and it will take about 278 MiB in ClickHouse.

Source: https://www.gov.uk/government/statistical-data-sets/price-paid-data-downloads
Description of the table fields: https://www.gov.uk/guidance/about-the-price-paid-data

The dataset contains HM Land Registry data © Crown copyright and database right 2021. This data is licensed under the Open Government Licence v3.0.

## Download the Dataset {#download-dataset}

Run the command:

```bash
wget http://prod.publicdata.landregistry.gov.uk.s3-website-eu-west-1.amazonaws.com/pp-complete.csv
```

The download takes about 2 minutes with a good internet connection.

## Create the Table {#create-table}

```sql
CREATE TABLE uk_price_paid
(
    price UInt32,
    date Date,
    postcode1 LowCardinality(String),
    postcode2 LowCardinality(String),
    type Enum8('terraced' = 1, 'semi-detached' = 2, 'detached' = 3, 'flat' = 4, 'other' = 0),
    is_new UInt8,
    duration Enum8('freehold' = 1, 'leasehold' = 2, 'unknown' = 0),
    addr1 String,
    addr2 String,
    street LowCardinality(String),
    locality LowCardinality(String),
    town LowCardinality(String),
    district LowCardinality(String),
    county LowCardinality(String),
    category UInt8
) ENGINE = MergeTree ORDER BY (postcode1, postcode2, addr1, addr2);
```

## Preprocess and Import Data {#preprocess-import-data}

In this example, `clickhouse-local` is used to preprocess the data and `clickhouse-client` to import it.

The structure of the source CSV file and a query for preprocessing the data are specified for `clickhouse-local`.

The preprocessing includes:
- splitting the postcode into two different columns `postcode1` and `postcode2`, which is better for storage and queries;
- converting the `time` field to a date, because it only contains 00:00 time;
- ignoring the [UUID](../../sql-reference/data-types/uuid.md) field, because it is not needed for analysis;
- converting the `type` and `duration` fields into more readable `Enum` fields with the [transform](../../sql-reference/functions/other-functions.md#transform) function (see the short sketch after this list);
- converting the `is_new` and `category` fields from single-character strings (`Y`/`N` and `A`/`B`) into [UInt8](../../sql-reference/data-types/int-uint.md#uint8-uint16-uint32-uint64-uint256-int8-int16-int32-int64-int128-int256) fields with values 0 and 1, respectively.
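
As a minimal standalone sketch of the `transform` and single-character conversions described above (toy input values, not the dataset itself):

```sql
SELECT
    transform('S', ['T', 'S', 'D', 'F', 'O'], ['terraced', 'semi-detached', 'detached', 'flat', 'other']) AS type, -- 'semi-detached'
    'Y' = 'Y' AS is_new; -- UInt8 value 1
```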

The preprocessed data is piped to `clickhouse-client` and inserted into the ClickHouse table in a streaming fashion.

```bash
clickhouse-local --input-format CSV --structure '
    uuid String,
    price UInt32,
    time DateTime,
    postcode String,
    a String,
    b String,
    c String,
    addr1 String,
    addr2 String,
    street String,
    locality String,
    town String,
    district String,
    county String,
    d String,
    e String
' --query "
    WITH splitByChar(' ', postcode) AS p
    SELECT
        price,
        toDate(time) AS date,
        p[1] AS postcode1,
        p[2] AS postcode2,
        transform(a, ['T', 'S', 'D', 'F', 'O'], ['terraced', 'semi-detached', 'detached', 'flat', 'other']) AS type,
        b = 'Y' AS is_new,
        transform(c, ['F', 'L', 'U'], ['freehold', 'leasehold', 'unknown']) AS duration,
        addr1,
        addr2,
        street,
        locality,
        town,
        district,
        county,
        d = 'B' AS category
    FROM table" --date_time_input_format best_effort < pp-complete.csv | clickhouse-client --query "INSERT INTO uk_price_paid FORMAT TSV"
```

The query takes about 40 seconds to run.

## Validate the Data {#validate-data}

Query:

```sql
SELECT count() FROM uk_price_paid;
```

Result:

```text
┌──count()─┐
│ 26321785 │
└──────────┘
```

The size of the dataset in ClickHouse is just 278 MiB — check it.

Query:

```sql
SELECT formatReadableSize(total_bytes) FROM system.tables WHERE name = 'uk_price_paid';
```

Result:

```text
┌─formatReadableSize(total_bytes)─┐
│ 278.80 MiB                      │
└─────────────────────────────────┘
```
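
As an optional follow-up, you can see where the compression comes from by looking at per-column storage in the standard `system.columns` table (a sketch, not part of the original walkthrough):

```sql
SELECT
    name,
    formatReadableSize(data_compressed_bytes) AS compressed,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed
FROM system.columns
WHERE database = currentDatabase() AND table = 'uk_price_paid'
ORDER BY data_compressed_bytes DESC;
```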

## Run Some Queries {#run-queries}

### Query 1. Average Price Per Year {#average-price}

Query:

```sql
SELECT toYear(date) AS year, round(avg(price)) AS price, bar(price, 0, 1000000, 80) FROM uk_price_paid GROUP BY year ORDER BY year;
```

Result:

```text
┌─year─┬──price─┬─bar(round(avg(price)), 0, 1000000, 80)─┐
│ 1995 │  67932 │ █████▍ │
│ 1996 │  71505 │ █████▋ │
│ 1997 │  78532 │ ██████▎ │
│ 1998 │  85436 │ ██████▋ │
│ 1999 │  96037 │ ███████▋ │
│ 2000 │ 107479 │ ████████▌ │
│ 2001 │ 118885 │ █████████▌ │
│ 2002 │ 137941 │ ███████████ │
│ 2003 │ 155889 │ ████████████▍ │
│ 2004 │ 178885 │ ██████████████▎ │
│ 2005 │ 189351 │ ███████████████▏ │
│ 2006 │ 203528 │ ████████████████▎ │
│ 2007 │ 219378 │ █████████████████▌ │
│ 2008 │ 217056 │ █████████████████▎ │
│ 2009 │ 213419 │ █████████████████ │
│ 2010 │ 236109 │ ██████████████████▊ │
│ 2011 │ 232805 │ ██████████████████▌ │
│ 2012 │ 238367 │ ███████████████████ │
│ 2013 │ 256931 │ ████████████████████▌ │
│ 2014 │ 279915 │ ██████████████████████▍ │
│ 2015 │ 297266 │ ███████████████████████▋ │
│ 2016 │ 313201 │ █████████████████████████ │
│ 2017 │ 346097 │ ███████████████████████████▋ │
│ 2018 │ 350116 │ ████████████████████████████ │
│ 2019 │ 351013 │ ████████████████████████████ │
│ 2020 │ 369420 │ █████████████████████████████▌ │
│ 2021 │ 386903 │ ██████████████████████████████▊ │
└──────┴────────┴────────────────────────────────────────┘
```

### Query 2. Average Price Per Year in London {#average-price-london}

Query:

```sql
SELECT toYear(date) AS year, round(avg(price)) AS price, bar(price, 0, 2000000, 100) FROM uk_price_paid WHERE town = 'LONDON' GROUP BY year ORDER BY year;
```

Result:

```text
┌─year─┬───price─┬─bar(round(avg(price)), 0, 2000000, 100)───────────────┐
│ 1995 │  109116 │ █████▍ │
│ 1996 │  118667 │ █████▊ │
│ 1997 │  136518 │ ██████▋ │
│ 1998 │  152983 │ ███████▋ │
│ 1999 │  180637 │ █████████ │
│ 2000 │  215838 │ ██████████▋ │
│ 2001 │  232994 │ ███████████▋ │
│ 2002 │  263670 │ █████████████▏ │
│ 2003 │  278394 │ █████████████▊ │
│ 2004 │  304666 │ ███████████████▏ │
│ 2005 │  322875 │ ████████████████▏ │
│ 2006 │  356191 │ █████████████████▋ │
│ 2007 │  404054 │ ████████████████████▏ │
│ 2008 │  420741 │ █████████████████████ │
│ 2009 │  427753 │ █████████████████████▍ │
│ 2010 │  480306 │ ████████████████████████ │
│ 2011 │  496274 │ ████████████████████████▋ │
│ 2012 │  519442 │ █████████████████████████▊ │
│ 2013 │  616212 │ ██████████████████████████████▋ │
│ 2014 │  724154 │ ████████████████████████████████████▏ │
│ 2015 │  792129 │ ███████████████████████████████████████▌ │
│ 2016 │  843655 │ ██████████████████████████████████████████▏ │
│ 2017 │  982642 │ █████████████████████████████████████████████████▏ │
│ 2018 │ 1016835 │ ██████████████████████████████████████████████████▋ │
│ 2019 │ 1042849 │ ████████████████████████████████████████████████████▏ │
│ 2020 │ 1011889 │ ██████████████████████████████████████████████████▌ │
│ 2021 │  960343 │ ████████████████████████████████████████████████ │
└──────┴─────────┴───────────────────────────────────────────────────────┘
```

Something happened in 2013. I don't have a clue. Maybe you have a clue what happened in 2020?

### Query 3. The Most Expensive Neighborhoods {#most-expensive-neighborhoods}

Query:

```sql
SELECT
    town,
    district,
    count() AS c,
    round(avg(price)) AS price,
    bar(price, 0, 5000000, 100)
FROM uk_price_paid
WHERE date >= '2020-01-01'
GROUP BY
    town,
    district
HAVING c >= 100
ORDER BY price DESC
LIMIT 100;
```

Result:

```text

┌─town─────────────────┬─district───────────────┬────c─┬───price─┬─bar(round(avg(price)), 0, 5000000, 100)────────────────────────────┐
│ LONDON │ CITY OF WESTMINSTER │ 3606 │ 3280239 │ █████████████████████████████████████████████████████████████████▌ │
│ LONDON │ CITY OF LONDON │ 274 │ 3160502 │ ███████████████████████████████████████████████████████████████▏ │
│ LONDON │ KENSINGTON AND CHELSEA │ 2550 │ 2308478 │ ██████████████████████████████████████████████▏ │
│ LEATHERHEAD │ ELMBRIDGE │ 114 │ 1897407 │ █████████████████████████████████████▊ │
│ LONDON │ CAMDEN │ 3033 │ 1805404 │ ████████████████████████████████████ │
│ VIRGINIA WATER │ RUNNYMEDE │ 156 │ 1753247 │ ███████████████████████████████████ │
│ WINDLESHAM │ SURREY HEATH │ 108 │ 1677613 │ █████████████████████████████████▌ │
│ THORNTON HEATH │ CROYDON │ 546 │ 1671721 │ █████████████████████████████████▍ │
│ BARNET │ ENFIELD │ 124 │ 1505840 │ ██████████████████████████████ │
│ COBHAM │ ELMBRIDGE │ 387 │ 1237250 │ ████████████████████████▋ │
│ LONDON │ ISLINGTON │ 2668 │ 1236980 │ ████████████████████████▋ │
│ OXFORD │ SOUTH OXFORDSHIRE │ 321 │ 1220907 │ ████████████████████████▍ │
│ LONDON │ RICHMOND UPON THAMES │ 704 │ 1215551 │ ████████████████████████▎ │
│ LONDON │ HOUNSLOW │ 671 │ 1207493 │ ████████████████████████▏ │
│ ASCOT │ WINDSOR AND MAIDENHEAD │ 407 │ 1183299 │ ███████████████████████▋ │
│ BEACONSFIELD │ BUCKINGHAMSHIRE │ 330 │ 1175615 │ ███████████████████████▌ │
│ RICHMOND │ RICHMOND UPON THAMES │ 874 │ 1110444 │ ██████████████████████▏ │
│ LONDON │ HAMMERSMITH AND FULHAM │ 3086 │ 1053983 │ █████████████████████ │
│ SURBITON │ ELMBRIDGE │ 100 │ 1011800 │ ████████████████████▏ │
│ RADLETT │ HERTSMERE │ 283 │ 1011712 │ ████████████████████▏ │
│ SALCOMBE │ SOUTH HAMS │ 127 │ 1011624 │ ████████████████████▏ │
│ WEYBRIDGE │ ELMBRIDGE │ 655 │ 1007265 │ ████████████████████▏ │
│ ESHER │ ELMBRIDGE │ 485 │ 986581 │ ███████████████████▋ │
│ LEATHERHEAD │ GUILDFORD │ 202 │ 977320 │ ███████████████████▌ │
│ BURFORD │ WEST OXFORDSHIRE │ 111 │ 966893 │ ███████████████████▎ │
│ BROCKENHURST │ NEW FOREST │ 129 │ 956675 │ ███████████████████▏ │
│ HINDHEAD │ WAVERLEY │ 137 │ 953753 │ ███████████████████ │
│ GERRARDS CROSS │ BUCKINGHAMSHIRE │ 419 │ 951121 │ ███████████████████ │
│ EAST MOLESEY │ ELMBRIDGE │ 192 │ 936769 │ ██████████████████▋ │
│ CHALFONT ST GILES │ BUCKINGHAMSHIRE │ 146 │ 925515 │ ██████████████████▌ │
│ LONDON │ TOWER HAMLETS │ 4388 │ 918304 │ ██████████████████▎ │
│ OLNEY │ MILTON KEYNES │ 235 │ 910646 │ ██████████████████▏ │
│ HENLEY-ON-THAMES │ SOUTH OXFORDSHIRE │ 540 │ 902418 │ ██████████████████ │
│ LONDON │ SOUTHWARK │ 3885 │ 892997 │ █████████████████▋ │
│ KINGSTON UPON THAMES │ KINGSTON UPON THAMES │ 960 │ 885969 │ █████████████████▋ │
│ LONDON │ EALING │ 2658 │ 871755 │ █████████████████▍ │
│ CRANBROOK │ TUNBRIDGE WELLS │ 431 │ 862348 │ █████████████████▏ │
│ LONDON │ MERTON │ 2099 │ 859118 │ █████████████████▏ │
│ BELVEDERE │ BEXLEY │ 346 │ 842423 │ ████████████████▋ │
│ GUILDFORD │ WAVERLEY │ 143 │ 841277 │ ████████████████▋ │
│ HARPENDEN │ ST ALBANS │ 657 │ 841216 │ ████████████████▋ │
│ LONDON │ HACKNEY │ 3307 │ 837090 │ ████████████████▋ │
│ LONDON │ WANDSWORTH │ 6566 │ 832663 │ ████████████████▋ │
│ MAIDENHEAD │ BUCKINGHAMSHIRE │ 123 │ 824299 │ ████████████████▍ │
│ KINGS LANGLEY │ DACORUM │ 145 │ 821331 │ ████████████████▍ │
│ BERKHAMSTED │ DACORUM │ 543 │ 818415 │ ████████████████▎ │
│ GREAT MISSENDEN │ BUCKINGHAMSHIRE │ 226 │ 802807 │ ████████████████ │
│ BILLINGSHURST │ CHICHESTER │ 144 │ 797829 │ ███████████████▊ │
│ WOKING │ GUILDFORD │ 176 │ 793494 │ ███████████████▋ │
│ STOCKBRIDGE │ TEST VALLEY │ 178 │ 793269 │ ███████████████▋ │
│ EPSOM │ REIGATE AND BANSTEAD │ 172 │ 791862 │ ███████████████▋ │
│ TONBRIDGE │ TUNBRIDGE WELLS │ 360 │ 787876 │ ███████████████▋ │
│ TEDDINGTON │ RICHMOND UPON THAMES │ 595 │ 786492 │ ███████████████▋ │
│ TWICKENHAM │ RICHMOND UPON THAMES │ 1155 │ 786193 │ ███████████████▋ │
│ LYNDHURST │ NEW FOREST │ 102 │ 785593 │ ███████████████▋ │
│ LONDON │ LAMBETH │ 5228 │ 774574 │ ███████████████▍ │
│ LONDON │ BARNET │ 3955 │ 773259 │ ███████████████▍ │
│ OXFORD │ VALE OF WHITE HORSE │ 353 │ 772088 │ ███████████████▍ │
│ TONBRIDGE │ MAIDSTONE │ 305 │ 770740 │ ███████████████▍ │
│ LUTTERWORTH │ HARBOROUGH │ 538 │ 768634 │ ███████████████▎ │
│ WOODSTOCK │ WEST OXFORDSHIRE │ 140 │ 766037 │ ███████████████▎ │
│ MIDHURST │ CHICHESTER │ 257 │ 764815 │ ███████████████▎ │
│ MARLOW │ BUCKINGHAMSHIRE │ 327 │ 761876 │ ███████████████▏ │
│ LONDON │ NEWHAM │ 3237 │ 761784 │ ███████████████▏ │
│ ALDERLEY EDGE │ CHESHIRE EAST │ 178 │ 757318 │ ███████████████▏ │
│ LUTON │ CENTRAL BEDFORDSHIRE │ 212 │ 754283 │ ███████████████ │
│ PETWORTH │ CHICHESTER │ 154 │ 754220 │ ███████████████ │
│ ALRESFORD │ WINCHESTER │ 219 │ 752718 │ ███████████████ │
│ POTTERS BAR │ WELWYN HATFIELD │ 174 │ 748465 │ ██████████████▊ │
│ HASLEMERE │ CHICHESTER │ 128 │ 746907 │ ██████████████▊ │
│ TADWORTH │ REIGATE AND BANSTEAD │ 502 │ 743252 │ ██████████████▋ │
│ THAMES DITTON │ ELMBRIDGE │ 244 │ 741913 │ ██████████████▋ │
│ REIGATE │ REIGATE AND BANSTEAD │ 581 │ 738198 │ ██████████████▋ │
│ BOURNE END │ BUCKINGHAMSHIRE │ 138 │ 735190 │ ██████████████▋ │
│ SEVENOAKS │ SEVENOAKS │ 1156 │ 730018 │ ██████████████▌ │
│ OXTED │ TANDRIDGE │ 336 │ 729123 │ ██████████████▌ │
│ INGATESTONE │ BRENTWOOD │ 166 │ 728103 │ ██████████████▌ │
│ LONDON │ BRENT │ 2079 │ 720605 │ ██████████████▍ │
│ LONDON │ HARINGEY │ 3216 │ 717780 │ ██████████████▎ │
│ PURLEY │ CROYDON │ 575 │ 716108 │ ██████████████▎ │
│ WELWYN │ WELWYN HATFIELD │ 222 │ 710603 │ ██████████████▏ │
│ RICKMANSWORTH │ THREE RIVERS │ 798 │ 704571 │ ██████████████ │
│ BANSTEAD │ REIGATE AND BANSTEAD │ 401 │ 701293 │ ██████████████ │
│ CHIGWELL │ EPPING FOREST │ 261 │ 701203 │ ██████████████ │
│ PINNER │ HARROW │ 528 │ 698885 │ █████████████▊ │
│ HASLEMERE │ WAVERLEY │ 280 │ 696659 │ █████████████▊ │
│ SLOUGH │ BUCKINGHAMSHIRE │ 396 │ 694917 │ █████████████▊ │
│ WALTON-ON-THAMES │ ELMBRIDGE │ 946 │ 692395 │ █████████████▋ │
│ READING │ SOUTH OXFORDSHIRE │ 318 │ 691988 │ █████████████▋ │
│ NORTHWOOD │ HILLINGDON │ 271 │ 690643 │ █████████████▋ │
│ FELTHAM │ HOUNSLOW │ 763 │ 688595 │ █████████████▋ │
│ ASHTEAD │ MOLE VALLEY │ 303 │ 687923 │ █████████████▋ │
│ BARNET │ BARNET │ 975 │ 686980 │ █████████████▋ │
│ WOKING │ SURREY HEATH │ 283 │ 686669 │ █████████████▋ │
│ MALMESBURY │ WILTSHIRE │ 323 │ 683324 │ █████████████▋ │
│ AMERSHAM │ BUCKINGHAMSHIRE │ 496 │ 680962 │ █████████████▌ │
│ CHISLEHURST │ BROMLEY │ 430 │ 680209 │ █████████████▌ │
│ HYTHE │ FOLKESTONE AND HYTHE │ 490 │ 676908 │ █████████████▌ │
│ MAYFIELD │ WEALDEN │ 101 │ 676210 │ █████████████▌ │
│ ASCOT │ BRACKNELL FOREST │ 168 │ 676004 │ █████████████▌ │
└──────────────────────┴────────────────────────┴──────┴─────────┴────────────────────────────────────────────────────────────────────┘
```

## Let's Speed Up Queries Using Projections {#speedup-with-projections}

[Projections](../../sql-reference/statements/alter/projection.md) allow you to improve query speed by storing pre-aggregated data.

### Build a Projection {#build-projection}

Create an aggregate projection by the dimensions `toYear(date)`, `district`, `town`:

```sql
ALTER TABLE uk_price_paid
    ADD PROJECTION projection_by_year_district_town
    (
        SELECT
            toYear(date),
            district,
            town,
            avg(price),
            sum(price),
            count()
        GROUP BY
            toYear(date),
            district,
            town
    );
```

Fill the projection for existing data (otherwise, the projection will be created only for newly inserted data):

```sql
ALTER TABLE uk_price_paid
    MATERIALIZE PROJECTION projection_by_year_district_town
SETTINGS mutations_sync = 1;
```
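
`MATERIALIZE PROJECTION` runs as a mutation. With `mutations_sync = 1` the statement waits for it to finish, but its status can also be checked via the standard `system.mutations` table (a small optional sketch):

```sql
SELECT command, is_done
FROM system.mutations
WHERE database = currentDatabase() AND table = 'uk_price_paid';
```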

## Test Performance {#test-performance}

Let's run the same 3 queries.

[Enable](../../operations/settings/settings.md#allow-experimental-projection-optimization) projections support:

```sql
SET allow_experimental_projection_optimization = 1;
```

### Query 1. Average Price Per Year {#average-price-projections}

Query:

```sql
SELECT
    toYear(date) AS year,
    round(avg(price)) AS price,
    bar(price, 0, 1000000, 80)
FROM uk_price_paid
GROUP BY year
ORDER BY year ASC;
```

Result:

```text
┌─year─┬──price─┬─bar(round(avg(price)), 0, 1000000, 80)─┐
│ 1995 │  67932 │ █████▍ │
│ 1996 │  71505 │ █████▋ │
│ 1997 │  78532 │ ██████▎ │
│ 1998 │  85436 │ ██████▋ │
│ 1999 │  96037 │ ███████▋ │
│ 2000 │ 107479 │ ████████▌ │
│ 2001 │ 118885 │ █████████▌ │
│ 2002 │ 137941 │ ███████████ │
│ 2003 │ 155889 │ ████████████▍ │
│ 2004 │ 178885 │ ██████████████▎ │
│ 2005 │ 189351 │ ███████████████▏ │
│ 2006 │ 203528 │ ████████████████▎ │
│ 2007 │ 219378 │ █████████████████▌ │
│ 2008 │ 217056 │ █████████████████▎ │
│ 2009 │ 213419 │ █████████████████ │
│ 2010 │ 236109 │ ██████████████████▊ │
│ 2011 │ 232805 │ ██████████████████▌ │
│ 2012 │ 238367 │ ███████████████████ │
│ 2013 │ 256931 │ ████████████████████▌ │
│ 2014 │ 279915 │ ██████████████████████▍ │
│ 2015 │ 297266 │ ███████████████████████▋ │
│ 2016 │ 313201 │ █████████████████████████ │
│ 2017 │ 346097 │ ███████████████████████████▋ │
│ 2018 │ 350116 │ ████████████████████████████ │
│ 2019 │ 351013 │ ████████████████████████████ │
│ 2020 │ 369420 │ █████████████████████████████▌ │
│ 2021 │ 386903 │ ██████████████████████████████▊ │
└──────┴────────┴────────────────────────────────────────┘
```

### Query 2. Average Price Per Year in London {#average-price-london-projections}

Query:

```sql
SELECT
    toYear(date) AS year,
    round(avg(price)) AS price,
    bar(price, 0, 2000000, 100)
FROM uk_price_paid
WHERE town = 'LONDON'
GROUP BY year
ORDER BY year ASC;
```

Result:

```text
┌─year─┬───price─┬─bar(round(avg(price)), 0, 2000000, 100)───────────────┐
│ 1995 │  109116 │ █████▍ │
│ 1996 │  118667 │ █████▊ │
│ 1997 │  136518 │ ██████▋ │
│ 1998 │  152983 │ ███████▋ │
│ 1999 │  180637 │ █████████ │
│ 2000 │  215838 │ ██████████▋ │
│ 2001 │  232994 │ ███████████▋ │
│ 2002 │  263670 │ █████████████▏ │
│ 2003 │  278394 │ █████████████▊ │
│ 2004 │  304666 │ ███████████████▏ │
│ 2005 │  322875 │ ████████████████▏ │
│ 2006 │  356191 │ █████████████████▋ │
│ 2007 │  404054 │ ████████████████████▏ │
│ 2008 │  420741 │ █████████████████████ │
│ 2009 │  427753 │ █████████████████████▍ │
│ 2010 │  480306 │ ████████████████████████ │
│ 2011 │  496274 │ ████████████████████████▋ │
│ 2012 │  519442 │ █████████████████████████▊ │
│ 2013 │  616212 │ ██████████████████████████████▋ │
│ 2014 │  724154 │ ████████████████████████████████████▏ │
│ 2015 │  792129 │ ███████████████████████████████████████▌ │
│ 2016 │  843655 │ ██████████████████████████████████████████▏ │
│ 2017 │  982642 │ █████████████████████████████████████████████████▏ │
│ 2018 │ 1016835 │ ██████████████████████████████████████████████████▋ │
│ 2019 │ 1042849 │ ████████████████████████████████████████████████████▏ │
│ 2020 │ 1011889 │ ██████████████████████████████████████████████████▌ │
│ 2021 │  960343 │ ████████████████████████████████████████████████ │
└──────┴─────────┴───────────────────────────────────────────────────────┘
```

### Query 3. The Most Expensive Neighborhoods {#most-expensive-neighborhoods-projections}

The condition (`date >= '2020-01-01'`) needs to be modified so that it matches the projection dimension (`toYear(date) >= 2020`).

Query:

```sql
SELECT
    town,
    district,
    count() AS c,
    round(avg(price)) AS price,
    bar(price, 0, 5000000, 100)
FROM uk_price_paid
WHERE toYear(date) >= 2020
GROUP BY
    town,
    district
HAVING c >= 100
ORDER BY price DESC
LIMIT 100;
```

Result:

```text
┌─town─────────────────┬─district───────────────┬────c─┬───price─┬─bar(round(avg(price)), 0, 5000000, 100)────────────────────────────┐
│ LONDON │ CITY OF WESTMINSTER │ 3606 │ 3280239 │ █████████████████████████████████████████████████████████████████▌ │
│ LONDON │ CITY OF LONDON │ 274 │ 3160502 │ ███████████████████████████████████████████████████████████████▏ │
│ LONDON │ KENSINGTON AND CHELSEA │ 2550 │ 2308478 │ ██████████████████████████████████████████████▏ │
│ LEATHERHEAD │ ELMBRIDGE │ 114 │ 1897407 │ █████████████████████████████████████▊ │
│ LONDON │ CAMDEN │ 3033 │ 1805404 │ ████████████████████████████████████ │
│ VIRGINIA WATER │ RUNNYMEDE │ 156 │ 1753247 │ ███████████████████████████████████ │
│ WINDLESHAM │ SURREY HEATH │ 108 │ 1677613 │ █████████████████████████████████▌ │
│ THORNTON HEATH │ CROYDON │ 546 │ 1671721 │ █████████████████████████████████▍ │
│ BARNET │ ENFIELD │ 124 │ 1505840 │ ██████████████████████████████ │
│ COBHAM │ ELMBRIDGE │ 387 │ 1237250 │ ████████████████████████▋ │
│ LONDON │ ISLINGTON │ 2668 │ 1236980 │ ████████████████████████▋ │
│ OXFORD │ SOUTH OXFORDSHIRE │ 321 │ 1220907 │ ████████████████████████▍ │
│ LONDON │ RICHMOND UPON THAMES │ 704 │ 1215551 │ ████████████████████████▎ │
│ LONDON │ HOUNSLOW │ 671 │ 1207493 │ ████████████████████████▏ │
│ ASCOT │ WINDSOR AND MAIDENHEAD │ 407 │ 1183299 │ ███████████████████████▋ │
│ BEACONSFIELD │ BUCKINGHAMSHIRE │ 330 │ 1175615 │ ███████████████████████▌ │
│ RICHMOND │ RICHMOND UPON THAMES │ 874 │ 1110444 │ ██████████████████████▏ │
│ LONDON │ HAMMERSMITH AND FULHAM │ 3086 │ 1053983 │ █████████████████████ │
│ SURBITON │ ELMBRIDGE │ 100 │ 1011800 │ ████████████████████▏ │
│ RADLETT │ HERTSMERE │ 283 │ 1011712 │ ████████████████████▏ │
│ SALCOMBE │ SOUTH HAMS │ 127 │ 1011624 │ ████████████████████▏ │
│ WEYBRIDGE │ ELMBRIDGE │ 655 │ 1007265 │ ████████████████████▏ │
│ ESHER │ ELMBRIDGE │ 485 │ 986581 │ ███████████████████▋ │
│ LEATHERHEAD │ GUILDFORD │ 202 │ 977320 │ ███████████████████▌ │
│ BURFORD │ WEST OXFORDSHIRE │ 111 │ 966893 │ ███████████████████▎ │
│ BROCKENHURST │ NEW FOREST │ 129 │ 956675 │ ███████████████████▏ │
│ HINDHEAD │ WAVERLEY │ 137 │ 953753 │ ███████████████████ │
│ GERRARDS CROSS │ BUCKINGHAMSHIRE │ 419 │ 951121 │ ███████████████████ │
│ EAST MOLESEY │ ELMBRIDGE │ 192 │ 936769 │ ██████████████████▋ │
│ CHALFONT ST GILES │ BUCKINGHAMSHIRE │ 146 │ 925515 │ ██████████████████▌ │
│ LONDON │ TOWER HAMLETS │ 4388 │ 918304 │ ██████████████████▎ │
│ OLNEY │ MILTON KEYNES │ 235 │ 910646 │ ██████████████████▏ │
│ HENLEY-ON-THAMES │ SOUTH OXFORDSHIRE │ 540 │ 902418 │ ██████████████████ │
│ LONDON │ SOUTHWARK │ 3885 │ 892997 │ █████████████████▋ │
│ KINGSTON UPON THAMES │ KINGSTON UPON THAMES │ 960 │ 885969 │ █████████████████▋ │
│ LONDON │ EALING │ 2658 │ 871755 │ █████████████████▍ │
│ CRANBROOK │ TUNBRIDGE WELLS │ 431 │ 862348 │ █████████████████▏ │
│ LONDON │ MERTON │ 2099 │ 859118 │ █████████████████▏ │
│ BELVEDERE │ BEXLEY │ 346 │ 842423 │ ████████████████▋ │
│ GUILDFORD │ WAVERLEY │ 143 │ 841277 │ ████████████████▋ │
│ HARPENDEN │ ST ALBANS │ 657 │ 841216 │ ████████████████▋ │
│ LONDON │ HACKNEY │ 3307 │ 837090 │ ████████████████▋ │
│ LONDON │ WANDSWORTH │ 6566 │ 832663 │ ████████████████▋ │
│ MAIDENHEAD │ BUCKINGHAMSHIRE │ 123 │ 824299 │ ████████████████▍ │
│ KINGS LANGLEY │ DACORUM │ 145 │ 821331 │ ████████████████▍ │
│ BERKHAMSTED │ DACORUM │ 543 │ 818415 │ ████████████████▎ │
│ GREAT MISSENDEN │ BUCKINGHAMSHIRE │ 226 │ 802807 │ ████████████████ │
│ BILLINGSHURST │ CHICHESTER │ 144 │ 797829 │ ███████████████▊ │
│ WOKING │ GUILDFORD │ 176 │ 793494 │ ███████████████▋ │
│ STOCKBRIDGE │ TEST VALLEY │ 178 │ 793269 │ ███████████████▋ │
│ EPSOM │ REIGATE AND BANSTEAD │ 172 │ 791862 │ ███████████████▋ │
│ TONBRIDGE │ TUNBRIDGE WELLS │ 360 │ 787876 │ ███████████████▋ │
│ TEDDINGTON │ RICHMOND UPON THAMES │ 595 │ 786492 │ ███████████████▋ │
│ TWICKENHAM │ RICHMOND UPON THAMES │ 1155 │ 786193 │ ███████████████▋ │
│ LYNDHURST │ NEW FOREST │ 102 │ 785593 │ ███████████████▋ │
│ LONDON │ LAMBETH │ 5228 │ 774574 │ ███████████████▍ │
│ LONDON │ BARNET │ 3955 │ 773259 │ ███████████████▍ │
│ OXFORD │ VALE OF WHITE HORSE │ 353 │ 772088 │ ███████████████▍ │
│ TONBRIDGE │ MAIDSTONE │ 305 │ 770740 │ ███████████████▍ │
│ LUTTERWORTH │ HARBOROUGH │ 538 │ 768634 │ ███████████████▎ │
│ WOODSTOCK │ WEST OXFORDSHIRE │ 140 │ 766037 │ ███████████████▎ │
│ MIDHURST │ CHICHESTER │ 257 │ 764815 │ ███████████████▎ │
│ MARLOW │ BUCKINGHAMSHIRE │ 327 │ 761876 │ ███████████████▏ │
│ LONDON │ NEWHAM │ 3237 │ 761784 │ ███████████████▏ │
│ ALDERLEY EDGE │ CHESHIRE EAST │ 178 │ 757318 │ ███████████████▏ │
│ LUTON │ CENTRAL BEDFORDSHIRE │ 212 │ 754283 │ ███████████████ │
│ PETWORTH │ CHICHESTER │ 154 │ 754220 │ ███████████████ │
│ ALRESFORD │ WINCHESTER │ 219 │ 752718 │ ███████████████ │
│ POTTERS BAR │ WELWYN HATFIELD │ 174 │ 748465 │ ██████████████▊ │
│ HASLEMERE │ CHICHESTER │ 128 │ 746907 │ ██████████████▊ │
│ TADWORTH │ REIGATE AND BANSTEAD │ 502 │ 743252 │ ██████████████▋ │
│ THAMES DITTON │ ELMBRIDGE │ 244 │ 741913 │ ██████████████▋ │
│ REIGATE │ REIGATE AND BANSTEAD │ 581 │ 738198 │ ██████████████▋ │
│ BOURNE END │ BUCKINGHAMSHIRE │ 138 │ 735190 │ ██████████████▋ │
│ SEVENOAKS │ SEVENOAKS │ 1156 │ 730018 │ ██████████████▌ │
│ OXTED │ TANDRIDGE │ 336 │ 729123 │ ██████████████▌ │
│ INGATESTONE │ BRENTWOOD │ 166 │ 728103 │ ██████████████▌ │
│ LONDON │ BRENT │ 2079 │ 720605 │ ██████████████▍ │
│ LONDON │ HARINGEY │ 3216 │ 717780 │ ██████████████▎ │
│ PURLEY │ CROYDON │ 575 │ 716108 │ ██████████████▎ │
│ WELWYN │ WELWYN HATFIELD │ 222 │ 710603 │ ██████████████▏ │
│ RICKMANSWORTH │ THREE RIVERS │ 798 │ 704571 │ ██████████████ │
│ BANSTEAD │ REIGATE AND BANSTEAD │ 401 │ 701293 │ ██████████████ │
│ CHIGWELL │ EPPING FOREST │ 261 │ 701203 │ ██████████████ │
│ PINNER │ HARROW │ 528 │ 698885 │ █████████████▊ │
│ HASLEMERE │ WAVERLEY │ 280 │ 696659 │ █████████████▊ │
│ SLOUGH │ BUCKINGHAMSHIRE │ 396 │ 694917 │ █████████████▊ │
│ WALTON-ON-THAMES │ ELMBRIDGE │ 946 │ 692395 │ █████████████▋ │
│ READING │ SOUTH OXFORDSHIRE │ 318 │ 691988 │ █████████████▋ │
│ NORTHWOOD │ HILLINGDON │ 271 │ 690643 │ █████████████▋ │
│ FELTHAM │ HOUNSLOW │ 763 │ 688595 │ █████████████▋ │
│ ASHTEAD │ MOLE VALLEY │ 303 │ 687923 │ █████████████▋ │
│ BARNET │ BARNET │ 975 │ 686980 │ █████████████▋ │
│ WOKING │ SURREY HEATH │ 283 │ 686669 │ █████████████▋ │
│ MALMESBURY │ WILTSHIRE │ 323 │ 683324 │ █████████████▋ │
│ AMERSHAM │ BUCKINGHAMSHIRE │ 496 │ 680962 │ █████████████▌ │
│ CHISLEHURST │ BROMLEY │ 430 │ 680209 │ █████████████▌ │
│ HYTHE │ FOLKESTONE AND HYTHE │ 490 │ 676908 │ █████████████▌ │
│ MAYFIELD │ WEALDEN │ 101 │ 676210 │ █████████████▌ │
│ ASCOT │ BRACKNELL FOREST │ 168 │ 676004 │ █████████████▌ │
└──────────────────────┴────────────────────────┴──────┴─────────┴────────────────────────────────────────────────────────────────────┘
```

### Summary {#summary}

All three queries work much faster and read fewer rows.

```text
Query 1

no projection: 27 rows in set. Elapsed: 0.158 sec. Processed 26.32 million rows, 157.93 MB (166.57 million rows/s., 999.39 MB/s.)
   projection: 27 rows in set. Elapsed: 0.007 sec. Processed 105.96 thousand rows, 3.33 MB (14.58 million rows/s., 458.13 MB/s.)


Query 2

no projection: 27 rows in set. Elapsed: 0.163 sec. Processed 26.32 million rows, 80.01 MB (161.75 million rows/s., 491.64 MB/s.)
   projection: 27 rows in set. Elapsed: 0.008 sec. Processed 105.96 thousand rows, 3.67 MB (13.29 million rows/s., 459.89 MB/s.)

Query 3

no projection: 100 rows in set. Elapsed: 0.069 sec. Processed 26.32 million rows, 62.47 MB (382.13 million rows/s., 906.93 MB/s.)
   projection: 100 rows in set. Elapsed: 0.029 sec. Processed 8.08 thousand rows, 511.08 KB (276.06 thousand rows/s., 17.47 MB/s.)
```
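
These numbers are from the interactive client output. As a quick way to make a similar comparison yourself, you can query the `system.query_log` table (a sketch; it assumes query logging is enabled, which is the default):

```sql
SELECT
    read_rows,
    formatReadableSize(read_bytes) AS read_size,
    query_duration_ms,
    query
FROM system.query_log
WHERE type = 'QueryFinish' AND query ILIKE '%uk_price_paid%'
ORDER BY event_time DESC
LIMIT 10;
```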

### Online Playground {#playground}

The dataset is also available in the [Online Playground](https://gh-api.clickhouse.tech/play?user=play#U0VMRUNUIHRvd24sIGRpc3RyaWN0LCBjb3VudCgpIEFTIGMsIHJvdW5kKGF2ZyhwcmljZSkpIEFTIHByaWNlLCBiYXIocHJpY2UsIDAsIDUwMDAwMDAsIDEwMCkgRlJPTSB1a19wcmljZV9wYWlkIFdIRVJFIGRhdGUgPj0gJzIwMjAtMDEtMDEnIEdST1VQIEJZIHRvd24sIGRpc3RyaWN0IEhBVklORyBjID49IDEwMCBPUkRFUiBCWSBwcmljZSBERVNDIExJTUlUIDEwMA==).

@ -27,11 +27,11 @@ grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not su
{% include 'install/deb.sh' %}
```

You can also download and install the packages manually from here: https://repo.clickhouse.tech/deb/stable/main/.
You can also download and install the packages manually from here: https://repo.clickhouse.com/deb/stable/main/.

If you want to use the most recent version, replace `stable` with `testing` (recommended for your testing environments).

You can also download and install the packages manually from the [repository](https://repo.clickhouse.tech/deb/stable/main/).
You can also download and install the packages manually from the [repository](https://repo.clickhouse.com/deb/stable/main/).

#### Packages {#packages}

@ -52,8 +52,8 @@ grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not su

``` bash
sudo yum install yum-utils
sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64
sudo rpm --import https://repo.clickhouse.com/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.com/rpm/stable/x86_64
```

To use the most recent versions, replace `stable` with `testing` (recommended for your testing environments). `prestable` is also sometimes available.

@ -64,21 +64,21 @@ sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_6
sudo yum install clickhouse-server clickhouse-client
```

You can also install the packages manually by downloading them from here: https://repo.clickhouse.tech/rpm/stable/x86_64.
You can also install the packages manually by downloading them from here: https://repo.clickhouse.com/rpm/stable/x86_64.

### From Tgz Archives {#from-tgz-archives}

The ClickHouse team at Yandex recommends using the pre-compiled binaries from `tgz` archives for all distributions where installation of `deb` and `rpm` packages is not possible.

The required version of the archives can be downloaded manually with `curl` or `wget` from the repository https://repo.clickhouse.tech/tgz/.
The required version of the archives can be downloaded manually with `curl` or `wget` from the repository https://repo.clickhouse.com/tgz/.
After that, the archives should be unpacked and the installation scripts run. An example of installing the latest version:

``` bash
export LATEST_VERSION=`curl https://api.github.com/repos/ClickHouse/ClickHouse/tags 2>/dev/null | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | head -n 1`
curl -O https://repo.clickhouse.tech/tgz/clickhouse-common-static-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.tech/tgz/clickhouse-common-static-dbg-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.tech/tgz/clickhouse-server-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.tech/tgz/clickhouse-client-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/clickhouse-common-static-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/clickhouse-common-static-dbg-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/clickhouse-server-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/clickhouse-client-$LATEST_VERSION.tgz

tar -xzvf clickhouse-common-static-$LATEST_VERSION.tgz
sudo clickhouse-common-static-$LATEST_VERSION/install/doinst.sh

@ -364,7 +364,7 @@ $ clickhouse-client --format_csv_delimiter="|" --query="INSERT INTO test.csv FOR

## CSVWithNames {#csvwithnames}

Also prints the header row, similar to `TabSeparatedWithNames`.
Also prints the header row, similar to [TabSeparatedWithNames](#tabseparatedwithnames).
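
For example (a tiny illustration, not from the original page):

```sql
SELECT 1 AS x, 'hello' AS s FORMAT CSVWithNames;
```

The first output row is the header `"x","s"`, followed by the data row `1,"hello"`.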

## CustomSeparated {#format-customseparated}

@ -7,7 +7,8 @@ toc_title: DateTime64

Allows storing an instant in time that can be represented as a calendar date and a time of day, with defined sub-second precision.

Tick size (precision): 10<sup>-precision</sup> seconds, where precision is an integer parameter.
Tick size (precision): 10<sup>-precision</sup> seconds, where precision is an integer parameter. Possible values: [ 0 : 9 ].
Typically used values are 3 (milliseconds), 6 (microseconds), and 9 (nanoseconds).
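
For example, a value with millisecond precision (precision 3):

```sql
SELECT toDateTime64('2019-01-01 00:00:00.123', 3) AS dt64, toTypeName(dt64);
```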

**Syntax:**

@ -58,9 +58,68 @@ str -> str != Referer

For some functions the first argument (the lambda function) can be omitted. In this case the identity mapping is assumed.
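
For example, `arraySum` accepts both forms; when the lambda is omitted, the elements are aggregated as-is:

```sql
SELECT
    arraySum([1, 2, 3]) AS identity_sum,            -- 6
    arraySum(x -> x * 2, [1, 2, 3]) AS doubled_sum; -- 12
```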

## User Defined Functions {#user-defined-functions}
## SQL User Defined Functions {#user-defined-functions}

Functions can be created with the [CREATE FUNCTION](../statements/create/function.md) statement. To delete these functions, use the [DROP FUNCTION](../statements/drop.md#drop-function) statement.
Functions can be created from lambda expressions with [CREATE FUNCTION](../statements/create/function.md). To delete these functions, use the [DROP FUNCTION](../statements/drop.md#drop-function) statement.
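
For example (the function name is purely illustrative):

```sql
CREATE FUNCTION linear_equation AS (x, k, b) -> k * x + b;

SELECT linear_equation(number, 2, 1) FROM numbers(3);

DROP FUNCTION linear_equation;
```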

## Executable User Defined Functions {#executable-user-defined-functions}
ClickHouse can call an external program or script to process data. Such functions are described in a [configuration file](../../operations/configuration-files.md). The path to it must be specified in the `user_defined_executable_functions_config` setting of the main configuration. The `*` wildcard can be used in the path, in which case all files matching the pattern are loaded. Example:
``` xml
<user_defined_executable_functions_config>*_function.xml</user_defined_executable_functions_config>
```
Files with function descriptions are searched for relative to the directory specified in the `user_files_path` setting.

A function configuration contains the following settings:

- `name` - the function name.
- `command` - the executable command or script.
- `argument` - the argument description, containing its type in the nested `type` setting. Each argument is described separately.
- `format` - the [format](../../interfaces/formats.md) in which arguments are passed.
- `return_type` - the type of the returned value.
- `type` - how the command is run. If `executable` is specified, a single command is started. With `executable_pool`, a pool of commands is created.
- `max_command_execution_time` - the maximum time in seconds allotted for processing a block of data. This setting applies only to commands with the `executable_pool` run mode. Optional. Default value is `10`.
- `command_termination_timeout` - the maximum time in seconds the command has to terminate after its pipe is closed. If the command does not terminate, a `SIGTERM` signal is sent to the process. This setting applies only to commands with the `executable_pool` run mode. Optional. Default value is `10`.
- `pool_size` - the size of the command pool. Optional. Default value is `16`.
- `lifetime` - the function reload interval in seconds. If set to `0`, the function is not reloaded.
- `send_chunk_header` - controls whether the row count is sent before a block of data to be processed. Optional. Default value is `false`.

The command must read arguments from `STDIN` and write the result to `STDOUT`. It must process arguments in a loop, i.e. after processing one group of arguments it must wait for the next group.

**Example**

An XML configuration describing the function `test_function`:
``` xml
<functions>
    <function>
        <type>executable</type>
        <name>test_function</name>
        <return_type>UInt64</return_type>
        <argument>
            <type>UInt64</type>
        </argument>
        <argument>
            <type>UInt64</type>
        </argument>
        <format>TabSeparated</format>
        <command>cd /; clickhouse-local --input-format TabSeparated --output-format TabSeparated --structure 'x UInt64, y UInt64' --query "SELECT x + y FROM table"</command>
        <lifetime>0</lifetime>
    </function>
</functions>
```

Query:

``` sql
SELECT test_function(toUInt64(2), toUInt64(2));
```

Result:

``` text
┌─test_function(toUInt64(2), toUInt64(2))─┐
│                                       4 │
└─────────────────────────────────────────┘
```

## Error Handling {#obrabotka-oshibok}

@ -157,6 +157,8 @@ GRANT SELECT(x,y) ON db.table TO john WITH GRANT OPTION
- `SYSTEM RELOAD CONFIG`
- `SYSTEM RELOAD DICTIONARY`
- `SYSTEM RELOAD EMBEDDED DICTIONARIES`
- `SYSTEM RELOAD FUNCTION`
- `SYSTEM RELOAD FUNCTIONS`
- `SYSTEM MERGES`
- `SYSTEM TTL MERGES`
- `SYSTEM FETCHES`

@ -10,6 +10,8 @@ toc_title: SYSTEM
- [RELOAD DICTIONARY](#query_language-system-reload-dictionary)
- [RELOAD MODELS](#query_language-system-reload-models)
- [RELOAD MODEL](#query_language-system-reload-model)
- [RELOAD FUNCTIONS](#query_language-system-reload-functions)
- [RELOAD FUNCTION](#query_language-system-reload-functions)
- [DROP DNS CACHE](#query_language-system-drop-dns-cache)
- [DROP MARK CACHE](#query_language-system-drop-mark-cache)
- [DROP UNCOMPRESSED CACHE](#query_language-system-drop-uncompressed-cache)

@ -80,6 +82,17 @@ SYSTEM RELOAD MODELS
SYSTEM RELOAD MODEL <model_name>
```

## RELOAD FUNCTIONS {#query_language-system-reload-functions}

Reloads all registered [executable user defined functions](../functions/index.md#executable-user-defined-functions), or one of them, from the configuration file.

**Syntax**

```sql
RELOAD FUNCTIONS
RELOAD FUNCTION function_name
```
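
In a query these statements are issued with the `SYSTEM` prefix, e.g. (reusing the `test_function` name from the executable functions example):

```sql
SYSTEM RELOAD FUNCTIONS;
SYSTEM RELOAD FUNCTION test_function;
```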

## DROP DNS CACHE {#query_language-system-drop-dns-cache}

Resets ClickHouse's internal DNS cache. Sometimes (for old ClickHouse versions) it is necessary to use this command when changing the infrastructure (changing the IP address of another ClickHouse server, or of a server used by dictionaries).
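
For example:

```sql
SYSTEM DROP DNS CACHE;
```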

@ -4,8 +4,8 @@ set -ex
BASE_DIR=$(dirname $(readlink -f $0))
BUILD_DIR="${BASE_DIR}/../build"
PUBLISH_DIR="${BASE_DIR}/../publish"
BASE_DOMAIN="${BASE_DOMAIN:-content.clickhouse.tech}"
GIT_TEST_URI="${GIT_TEST_URI:-git@github.com:ClickHouse/clickhouse-website-content.git}"
BASE_DOMAIN="${BASE_DOMAIN:-content.clickhouse.com}"
GIT_TEST_URI="${GIT_TEST_URI:-git@github.com:ClickHouse/clickhouse-com-content.git}"
GIT_PROD_URI="git@github.com:ClickHouse/clickhouse-website-content.git"
EXTRA_BUILD_ARGS="${EXTRA_BUILD_ARGS:---minify --verbose}"

@ -29,7 +29,7 @@ $ grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not

If you want to use the most recent version, replace `stable` with `testing` (recommended only for your testing environments).

You can also download and install the packages manually from here: [download](https://repo.clickhouse.tech/deb/stable/main/).
You can also download and install the packages manually from here: [download](https://repo.clickhouse.com/deb/stable/main/).

Package list:

@ -46,8 +46,8 @@ $ grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not

``` bash
sudo yum install yum-utils
sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64
sudo rpm --import https://repo.clickhouse.com/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.com/rpm/stable/x86_64
```

If you want to use the most recent version, replace `stable` with `testing` (recommended only for your testing environments). `prestable` is also sometimes available.

@ -58,22 +58,22 @@ sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_6
sudo yum install clickhouse-server clickhouse-client
```

You can also download and install the packages manually from here: [download](https://repo.clickhouse.tech/rpm/stable/x86_64).
You can also download and install the packages manually from here: [download](https://repo.clickhouse.com/rpm/stable/x86_64).

### From `tgz` Archives {#from-tgz-archives}

If your operating system does not support installing `deb` or `rpm` packages, we recommend using the official pre-compiled `tgz` packages.

The required version can be downloaded via `curl` or `wget` from the repository `https://repo.clickhouse.tech/tgz/`.
The required version can be downloaded via `curl` or `wget` from the repository `https://repo.clickhouse.com/tgz/`.

After downloading, unpack the archives and install using the installation scripts. An example of installing the latest version:

``` bash
export LATEST_VERSION=`curl https://api.github.com/repos/ClickHouse/ClickHouse/tags 2>/dev/null | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | head -n 1`
curl -O https://repo.clickhouse.tech/tgz/clickhouse-common-static-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.tech/tgz/clickhouse-common-static-dbg-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.tech/tgz/clickhouse-server-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.tech/tgz/clickhouse-client-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/clickhouse-common-static-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/clickhouse-common-static-dbg-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/clickhouse-server-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.com/tgz/clickhouse-client-$LATEST_VERSION.tgz

tar -xzvf clickhouse-common-static-$LATEST_VERSION.tgz
sudo clickhouse-common-static-$LATEST_VERSION/install/doinst.sh
@ -34,7 +34,7 @@
#include <Poco/Util/Application.h>
#include <Processors/Formats/IInputFormat.h>
#include <Processors/Executors/PullingAsyncPipelineExecutor.h>
#include <Processors/QueryPipeline.h>
#include <Processors/QueryPipelineBuilder.h>
#include <Columns/ColumnString.h>
#include <common/find_symbols.h>
#include <common/LineReader.h>

@ -2050,8 +2050,7 @@ private:
    });
}

QueryPipeline pipeline;
pipeline.init(std::move(pipe));
QueryPipeline pipeline(std::move(pipe));
PullingAsyncPipelineExecutor executor(pipeline);

Block block;
@ -9,6 +9,11 @@
#include <IO/ConnectionTimeoutsContext.h>
#include <Interpreters/InterpreterInsertQuery.h>
#include <Processors/Transforms/ExpressionTransform.h>
#include <Processors/QueryPipelineBuilder.h>
#include <Processors/Chain.h>
#include <Processors/Executors/PullingPipelineExecutor.h>
#include <Processors/Executors/PushingPipelineExecutor.h>
#include <Processors/Sources/RemoteSource.h>
#include <DataStreams/ExpressionBlockInputStream.h>

namespace DB
@ -1446,7 +1451,7 @@ TaskStatus ClusterCopier::processPartitionPieceTaskImpl(
local_context->setSettings(task_cluster->settings_pull);
local_context->setSetting("skip_unavailable_shards", true);

Block block = getBlockWithAllStreamData(InterpreterFactory::get(query_select_ast, local_context)->execute().getInputStream());
Block block = getBlockWithAllStreamData(InterpreterFactory::get(query_select_ast, local_context)->execute().pipeline);
count = (block) ? block.safeGetByPosition(0).column->getUInt(0) : 0;
}
@ -1524,25 +1529,30 @@ TaskStatus ClusterCopier::processPartitionPieceTaskImpl(
context_insert->setSettings(task_cluster->settings_push);

/// Custom INSERT SELECT implementation
BlockInputStreamPtr input;
BlockOutputStreamPtr output;
QueryPipeline input;
QueryPipeline output;
{
    BlockIO io_select = InterpreterFactory::get(query_select_ast, context_select)->execute();
    BlockIO io_insert = InterpreterFactory::get(query_insert_ast, context_insert)->execute();

    auto pure_input = io_select.getInputStream();
    output = io_insert.out;
    output = std::move(io_insert.pipeline);

    /// Add converting actions to make it possible to copy blocks with slightly different schema
    const auto & select_block = pure_input->getHeader();
    const auto & insert_block = output->getHeader();
    const auto & select_block = io_select.pipeline.getHeader();
    const auto & insert_block = output.getHeader();
    auto actions_dag = ActionsDAG::makeConvertingActions(
        select_block.getColumnsWithTypeAndName(),
        insert_block.getColumnsWithTypeAndName(),
        ActionsDAG::MatchColumnsMode::Position);
    auto actions = std::make_shared<ExpressionActions>(actions_dag, ExpressionActionsSettings::fromContext(getContext()));

    input = std::make_shared<ExpressionBlockInputStream>(pure_input, actions);
    QueryPipelineBuilder builder;
    builder.init(std::move(io_select.pipeline));
    builder.addSimpleTransform([&](const Block & header)
    {
        return std::make_shared<ExpressionTransform>(header, actions);
    });
    input = QueryPipelineBuilder::getPipeline(std::move(builder));
}

/// Fail-fast optimization to abort copying when the current clean state expires
@ -1588,7 +1598,26 @@ TaskStatus ClusterCopier::processPartitionPieceTaskImpl(
};

/// Main work is here
copyData(*input, *output, cancel_check, update_stats);
PullingPipelineExecutor pulling_executor(input);
PushingPipelineExecutor pushing_executor(output);

Block data;
bool is_cancelled = false;
while (pulling_executor.pull(data))
{
    if (cancel_check())
    {
        is_cancelled = true;
        pushing_executor.cancel();
        break;
    }
    pushing_executor.push(data);
    update_stats(data);
}

if (!is_cancelled)
    pushing_executor.finish();

// Just in case
if (future_is_dirty_checker.valid())
@ -1711,7 +1740,8 @@ String ClusterCopier::getRemoteCreateTable(

String query = "SHOW CREATE TABLE " + getQuotedTable(table);
Block block = getBlockWithAllStreamData(
    std::make_shared<RemoteBlockInputStream>(connection, query, InterpreterShowCreateQuery::getSampleBlock(), remote_context));
    QueryPipeline(std::make_shared<RemoteSource>(
        std::make_shared<RemoteQueryExecutor>(connection, query, InterpreterShowCreateQuery::getSampleBlock(), remote_context), false, false)));

return typeid_cast<const ColumnString &>(*block.safeGetByPosition(0).column).getDataAt(0).toString();
}
@ -1824,7 +1854,7 @@ std::set<String> ClusterCopier::getShardPartitions(const ConnectionTimeouts & ti

auto local_context = Context::createCopy(context);
local_context->setSettings(task_cluster->settings_pull);
Block block = getBlockWithAllStreamData(InterpreterFactory::get(query_ast, local_context)->execute().getInputStream());
Block block = getBlockWithAllStreamData(InterpreterFactory::get(query_ast, local_context)->execute().pipeline);

if (block)
{
@ -1869,7 +1899,11 @@ bool ClusterCopier::checkShardHasPartition(const ConnectionTimeouts & timeouts,

auto local_context = Context::createCopy(context);
local_context->setSettings(task_cluster->settings_pull);
return InterpreterFactory::get(query_ast, local_context)->execute().getInputStream()->read().rows() != 0;
auto pipeline = InterpreterFactory::get(query_ast, local_context)->execute().pipeline;
PullingPipelineExecutor executor(pipeline);
Block block;
executor.pull(block);
return block.rows() != 0;
}

bool ClusterCopier::checkPresentPartitionPiecesOnCurrentShard(const ConnectionTimeouts & timeouts,
@ -1910,12 +1944,15 @@ bool ClusterCopier::checkPresentPartitionPiecesOnCurrentShard(const ConnectionTi

auto local_context = Context::createCopy(context);
local_context->setSettings(task_cluster->settings_pull);
auto result = InterpreterFactory::get(query_ast, local_context)->execute().getInputStream()->read().rows();
if (result != 0)
auto pipeline = InterpreterFactory::get(query_ast, local_context)->execute().pipeline;
PullingPipelineExecutor executor(pipeline);
Block result;
executor.pull(result);
if (result.rows() != 0)
    LOG_INFO(log, "Partition {} piece number {} is PRESENT on shard {}", partition_quoted_name, std::to_string(current_piece_number), task_shard.getDescription());
else
    LOG_INFO(log, "Partition {} piece number {} is ABSENT on shard {}", partition_quoted_name, std::to_string(current_piece_number), task_shard.getDescription());
return result != 0;
return result.rows() != 0;
}

@ -1,6 +1,8 @@
#include "Internals.h"
#include <Storages/MergeTree/MergeTreeData.h>
#include <Storages/extractKeyExpressionList.h>
#include <Processors/Executors/PullingPipelineExecutor.h>
#include <Processors/Transforms/SquashingChunksTransform.h>

namespace DB
{
@ -63,9 +65,21 @@ BlockInputStreamPtr squashStreamIntoOneBlock(const BlockInputStreamPtr & stream)
        std::numeric_limits<size_t>::max());
}

Block getBlockWithAllStreamData(const BlockInputStreamPtr & stream)
Block getBlockWithAllStreamData(QueryPipeline pipeline)
{
    return squashStreamIntoOneBlock(stream)->read();
    QueryPipelineBuilder builder;
    builder.init(std::move(pipeline));
    builder.addTransform(std::make_shared<SquashingChunksTransform>(
        builder.getHeader(),
        std::numeric_limits<size_t>::max(),
        std::numeric_limits<size_t>::max()));

    auto cur_pipeline = QueryPipelineBuilder::getPipeline(std::move(builder));
    Block block;
    PullingPipelineExecutor executor(cur_pipeline);
    executor.pull(block);

    return block;
}

@ -165,10 +165,7 @@ std::shared_ptr<ASTStorage> createASTStorageDistributed(
    const String & cluster_name, const String & database, const String & table,
    const ASTPtr & sharding_key_ast = nullptr);

BlockInputStreamPtr squashStreamIntoOneBlock(const BlockInputStreamPtr & stream);

Block getBlockWithAllStreamData(const BlockInputStreamPtr & stream);
Block getBlockWithAllStreamData(QueryPipeline pipeline);

bool isExtendedDefinitionStorage(const ASTPtr & storage_ast);

@ -962,7 +962,7 @@ namespace
if (isRunning(pid_file))
{
    throw Exception(ErrorCodes::CANNOT_KILL,
        "The server process still exists after %zu ms",
        "The server process still exists after {} tries (delay: {} ms)",
        num_kill_check_tries, kill_check_delay_ms);
}
}
@ -26,7 +26,7 @@
#include <Formats/registerFormats.h>
#include <Formats/FormatFactory.h>
#include <Processors/Formats/IInputFormat.h>
#include <Processors/QueryPipeline.h>
#include <Processors/QueryPipelineBuilder.h>
#include <Processors/Executors/PullingPipelineExecutor.h>
#include <Core/Block.h>
#include <common/StringRef.h>
@ -1162,8 +1162,7 @@ try

Pipe pipe(FormatFactory::instance().getInput(input_format, file_in, header, context, max_block_size));

QueryPipeline pipeline;
pipeline.init(std::move(pipe));
QueryPipeline pipeline(std::move(pipe));
PullingPipelineExecutor executor(pipeline);

Block block;
@ -1200,8 +1199,7 @@ try
});
}

QueryPipeline pipeline;
pipeline.init(std::move(pipe));
QueryPipeline pipeline(std::move(pipe));

BlockOutputStreamPtr output = context->getOutputStreamParallelIfPossible(output_format, file_out, header);

@ -859,6 +859,9 @@ if (ThreadFuzzer::instance().isEffective())
if (config->has("max_partition_size_to_drop"))
global_context->setMaxPartitionSizeToDrop(config->getUInt64("max_partition_size_to_drop"));

if (config->has("max_concurrent_queries"))
global_context->getProcessList().setMaxSize(config->getInt("max_concurrent_queries", 0));

if (!initial_loading)
{
/// We do not load ZooKeeper configuration on the first config loading
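Like max_partition_size_to_drop just above it, max_concurrent_queries now takes effect on configuration reload: each reload pushes the configured limit into the global ProcessList via setMaxSize, so the cap can be changed without a server restart.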
@ -1,22 +0,0 @@
OWNER(g:clickhouse)

PROGRAM(clickhouse-server)

PEERDIR(
clickhouse/base/common
clickhouse/base/daemon
clickhouse/base/loggers
clickhouse/src
contrib/libs/poco/NetSSL_OpenSSL
)

CFLAGS(-g0)

SRCS(
clickhouse-server.cpp

MetricsTransmitter.cpp
Server.cpp
)

END()
@ -1,32 +0,0 @@
OWNER(g:clickhouse)

PROGRAM(clickhouse)

CFLAGS(
-DENABLE_CLICKHOUSE_CLIENT
-DENABLE_CLICKHOUSE_EXTRACT_FROM_CONFIG
-DENABLE_CLICKHOUSE_SERVER
)

PEERDIR(
clickhouse/base/daemon
clickhouse/base/loggers
clickhouse/src
)

CFLAGS(-g0)

SRCS(
main.cpp

client/Client.cpp
client/QueryFuzzer.cpp
client/ConnectionParameters.cpp
client/Suggest.cpp
client/TestHint.cpp
extract-from-config/ExtractFromConfig.cpp
server/Server.cpp
server/MetricsTransmitter.cpp
)

END()
@ -45,6 +45,7 @@ enum class AccessType
M(ALTER_RENAME_COLUMN, "RENAME COLUMN", COLUMN, ALTER_COLUMN) \
M(ALTER_MATERIALIZE_COLUMN, "MATERIALIZE COLUMN", COLUMN, ALTER_COLUMN) \
M(ALTER_COLUMN, "", GROUP, ALTER_TABLE) /* allow to execute ALTER {ADD|DROP|MODIFY...} COLUMN */\
M(ALTER_MODIFY_COMMENT, "MODIFY COMMENT", TABLE, ALTER_TABLE) /* modify table comment */\
\
M(ALTER_ORDER_BY, "ALTER MODIFY ORDER BY, MODIFY ORDER BY", TABLE, ALTER_INDEX) \
M(ALTER_SAMPLE_BY, "ALTER MODIFY SAMPLE BY, MODIFY SAMPLE BY", TABLE, ALTER_INDEX) \

@ -14,6 +14,7 @@
#include <Interpreters/InterpreterCreateUserQuery.h>
#include <Interpreters/InterpreterShowGrantsQuery.h>
#include <Common/quoteString.h>
#include <common/logger_useful.h>
#include <Poco/JSON/JSON.h>
#include <Poco/JSON/Object.h>
#include <Poco/JSON/Stringifier.h>

@ -62,12 +62,15 @@ void ReplicatedAccessStorage::shutdown()
{
bool prev_stop_flag = stop_flag.exchange(true);
if (!prev_stop_flag)
{
if (worker_thread.joinable())
{
/// Notify the worker thread to stop waiting for new queue items
refresh_queue.push(UUIDHelpers::Nil);
worker_thread.join();
}
}
}

template <typename Func>
static void retryOnZooKeeperUserError(size_t attempts, Func && function)
46
src/Access/tests/gtest_replicated_access_storage.cpp
Normal file
@ -0,0 +1,46 @@
#include <gtest/gtest.h>
#include <Access/ReplicatedAccessStorage.h>

using namespace DB;

namespace DB
{
namespace ErrorCodes
{
extern const int NO_ZOOKEEPER;
}
}

TEST(ReplicatedAccessStorage, ShutdownWithoutStartup)
{
auto get_zk = []()
{
return std::shared_ptr<zkutil::ZooKeeper>();
};

auto storage = ReplicatedAccessStorage("replicated", "/clickhouse/access", get_zk);
storage.shutdown();
}

TEST(ReplicatedAccessStorage, ShutdownWithFailedStartup)
{
auto get_zk = []()
{
return std::shared_ptr<zkutil::ZooKeeper>();
};

auto storage = ReplicatedAccessStorage("replicated", "/clickhouse/access", get_zk);
try
{
storage.startup();
}
catch (Exception & e)
{
if (e.code() != ErrorCodes::NO_ZOOKEEPER)
throw;
}
storage.shutdown();
}
@ -1,54 +0,0 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)

LIBRARY()

PEERDIR(
clickhouse/src/Common
)

SRCS(
AccessControlManager.cpp
AccessEntityIO.cpp
AccessRights.cpp
AccessRightsElement.cpp
AllowedClientHosts.cpp
Authentication.cpp
ContextAccess.cpp
Credentials.cpp
DiskAccessStorage.cpp
EnabledQuota.cpp
EnabledRoles.cpp
EnabledRolesInfo.cpp
EnabledRowPolicies.cpp
EnabledSettings.cpp
ExternalAuthenticators.cpp
GSSAcceptor.cpp
GrantedRoles.cpp
IAccessEntity.cpp
IAccessStorage.cpp
LDAPAccessStorage.cpp
LDAPClient.cpp
MemoryAccessStorage.cpp
MultipleAccessStorage.cpp
Quota.cpp
QuotaCache.cpp
QuotaUsage.cpp
ReplicatedAccessStorage.cpp
Role.cpp
RoleCache.cpp
RolesOrUsersSet.cpp
RowPolicy.cpp
RowPolicyCache.cpp
SettingsConstraints.cpp
SettingsProfile.cpp
SettingsProfileElement.cpp
SettingsProfilesCache.cpp
SettingsProfilesInfo.cpp
User.cpp
UsersConfigAccessStorage.cpp

)

END()
@ -1,14 +0,0 @@
OWNER(g:clickhouse)

LIBRARY()

PEERDIR(
clickhouse/src/Common
)

SRCS(
<? find . -name '*.cpp' | grep -v -F tests | grep -v -F examples | grep -v -F fuzzers | sed 's/^\.\// /' | sort ?>
)

END()
@ -155,7 +155,7 @@ AggregateFunctionPtr AggregateFunctionFactory::getImpl(
}

/// Combinators of aggregate functions.
/// For every aggregate function 'agg' and combiner '-Comb' there is combined aggregate function with name 'aggComb',
/// For every aggregate function 'agg' and combiner '-Comb' there is a combined aggregate function with the name 'aggComb',
/// that can have different number and/or types of arguments, different result type and different behaviour.

if (AggregateFunctionCombinatorPtr combinator = AggregateFunctionCombinatorFactory::instance().tryFindSuffix(name))
@ -172,13 +172,12 @@ AggregateFunctionPtr AggregateFunctionFactory::getImpl(

String nested_name = name.substr(0, name.size() - combinator_name.size());
/// Nested identical combinators (i.e. uniqCombinedIfIf) is not
/// supported (since they even don't work -- silently).
/// supported (since they don't work -- silently).
///
/// But non-identical does supported and works, for example
/// uniqCombinedIfMergeIf, it is useful in case when the underlying
/// But non-identical is supported and works. For example,
/// uniqCombinedIfMergeIf is useful in cases when the underlying
/// storage stores AggregateFunction(uniqCombinedIf) and in SELECT you
/// need to filter aggregation result based on another column for
/// example.
/// need to filter aggregation result based on another column.
if (!combinator->supportsNesting() && nested_name.ends_with(combinator_name))
{
throw Exception(ErrorCodes::ILLEGAL_AGGREGATION,
@ -234,7 +233,7 @@ std::optional<AggregateFunctionProperties> AggregateFunctionFactory::tryGetPrope
return found.properties;

/// Combinators of aggregate functions.
/// For every aggregate function 'agg' and combiner '-Comb' there is combined aggregate function with name 'aggComb',
/// For every aggregate function 'agg' and combiner '-Comb' there is a combined aggregate function with the name 'aggComb',
/// that can have different number and/or types of arguments, different result type and different behaviour.

if (AggregateFunctionCombinatorPtr combinator = AggregateFunctionCombinatorFactory::instance().tryFindSuffix(name))
@ -1,74 +0,0 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)

LIBRARY()

PEERDIR(
clickhouse/src/Common
)

SRCS(
AggregateFunctionAggThrow.cpp
AggregateFunctionAny.cpp
AggregateFunctionArray.cpp
AggregateFunctionAvg.cpp
AggregateFunctionAvgWeighted.cpp
AggregateFunctionBitwise.cpp
AggregateFunctionBoundingRatio.cpp
AggregateFunctionCategoricalInformationValue.cpp
AggregateFunctionCombinatorFactory.cpp
AggregateFunctionCount.cpp
AggregateFunctionDeltaSum.cpp
AggregateFunctionDeltaSumTimestamp.cpp
AggregateFunctionDistinct.cpp
AggregateFunctionEntropy.cpp
AggregateFunctionFactory.cpp
AggregateFunctionForEach.cpp
AggregateFunctionGroupArray.cpp
AggregateFunctionGroupArrayInsertAt.cpp
AggregateFunctionGroupArrayMoving.cpp
AggregateFunctionGroupUniqArray.cpp
AggregateFunctionHistogram.cpp
AggregateFunctionIf.cpp
AggregateFunctionIntervalLengthSum.cpp
AggregateFunctionMLMethod.cpp
AggregateFunctionMannWhitney.cpp
AggregateFunctionMax.cpp
AggregateFunctionMaxIntersections.cpp
AggregateFunctionMerge.cpp
AggregateFunctionMin.cpp
AggregateFunctionNull.cpp
AggregateFunctionOrFill.cpp
AggregateFunctionQuantile.cpp
AggregateFunctionRankCorrelation.cpp
AggregateFunctionResample.cpp
AggregateFunctionRetention.cpp
AggregateFunctionSequenceMatch.cpp
AggregateFunctionSequenceNextNode.cpp
AggregateFunctionSimpleLinearRegression.cpp
AggregateFunctionSimpleState.cpp
AggregateFunctionSingleValueOrNull.cpp
AggregateFunctionSparkbar.cpp
AggregateFunctionState.cpp
AggregateFunctionStatistics.cpp
AggregateFunctionStatisticsSimple.cpp
AggregateFunctionStudentTTest.cpp
AggregateFunctionSum.cpp
AggregateFunctionSumCount.cpp
AggregateFunctionSumMap.cpp
AggregateFunctionTopK.cpp
AggregateFunctionUniq.cpp
AggregateFunctionUniqCombined.cpp
AggregateFunctionUniqUpTo.cpp
AggregateFunctionWelchTTest.cpp
AggregateFunctionWindowFunnel.cpp
IAggregateFunction.cpp
UniqCombinedBiasData.cpp
UniqVariadicHash.cpp
parseAggregateFunctionParameters.cpp
registerAggregateFunctions.cpp

)

END()
@ -1,14 +0,0 @@
OWNER(g:clickhouse)

LIBRARY()

PEERDIR(
clickhouse/src/Common
)

SRCS(
<? find . -name '*.cpp' | grep -v -F tests | grep -v -F examples | grep -v -F fuzzers | grep -v -F GroupBitmap | sed 's/^\.\// /' | sort ?>
)

END()
@ -1,27 +0,0 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)

LIBRARY()

PEERDIR(
clickhouse/src/Common
)

SRCS(
BackupEntryConcat.cpp
BackupEntryFromAppendOnlyFile.cpp
BackupEntryFromImmutableFile.cpp
BackupEntryFromMemory.cpp
BackupEntryFromSmallFile.cpp
BackupFactory.cpp
BackupInDirectory.cpp
BackupRenamingConfig.cpp
BackupSettings.cpp
BackupUtils.cpp
hasCompatibleDataToRestoreTable.cpp
renameInCreateQuery.cpp

)

END()
@ -1,14 +0,0 @@
OWNER(g:clickhouse)

LIBRARY()

PEERDIR(
clickhouse/src/Common
)

SRCS(
<? find . -name '*.cpp' | grep -v -F tests | grep -v -F examples | grep -v -F fuzzers | sed 's/^\.\// /' | sort ?>
)

END()
@ -1,17 +0,0 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)

LIBRARY()

PEERDIR(
clickhouse/src/Common
)

SRCS(
IBridgeHelper.cpp
LibraryBridgeHelper.cpp

)

END()
@ -1,14 +0,0 @@
OWNER(g:clickhouse)

LIBRARY()

PEERDIR(
clickhouse/src/Common
)

SRCS(
<? find . -name '*.cpp' | grep -v -F tests | grep -v -F examples | grep -v -F fuzzers | sed 's/^\.\// /' | sort ?>
)

END()
@ -212,6 +212,7 @@ add_object_library(clickhouse_processors_formats Processors/Formats)
add_object_library(clickhouse_processors_formats_impl Processors/Formats/Impl)
add_object_library(clickhouse_processors_transforms Processors/Transforms)
add_object_library(clickhouse_processors_sources Processors/Sources)
add_object_library(clickhouse_processors_sinks Processors/Sinks)
add_object_library(clickhouse_processors_merges Processors/Merges)
add_object_library(clickhouse_processors_merges_algorithms Processors/Merges/Algorithms)
add_object_library(clickhouse_processors_queryplan Processors/QueryPlan)
@ -23,7 +23,7 @@
#include <Interpreters/ClientInfo.h>
#include <Compression/CompressionFactory.h>
#include <Processors/Pipe.h>
#include <Processors/QueryPipeline.h>
#include <Processors/QueryPipelineBuilder.h>
#include <Processors/ISink.h>
#include <Processors/Executors/PipelineExecutor.h>
#include <pcg_random.hpp>
@ -700,14 +700,14 @@ void Connection::sendExternalTablesData(ExternalTablesData & data)
if (!elem->pipe)
elem->pipe = elem->creating_pipe_callback();

QueryPipeline pipeline;
QueryPipelineBuilder pipeline;
pipeline.init(std::move(*elem->pipe));
elem->pipe.reset();
pipeline.resize(1);
auto sink = std::make_shared<ExternalTableDataSink>(pipeline.getHeader(), *this, *elem, std::move(on_cancel));
pipeline.setSinks([&](const Block &, QueryPipeline::StreamType type) -> ProcessorPtr
pipeline.setSinks([&](const Block &, QueryPipelineBuilder::StreamType type) -> ProcessorPtr
{
if (type != QueryPipeline::StreamType::Main)
if (type != QueryPipelineBuilder::StreamType::Main)
return nullptr;
return sink;
});
@ -1,24 +0,0 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)

LIBRARY()

PEERDIR(
clickhouse/src/Common
contrib/libs/poco/NetSSL_OpenSSL
)

SRCS(
Connection.cpp
ConnectionEstablisher.cpp
ConnectionPool.cpp
ConnectionPoolWithFailover.cpp
HedgedConnections.cpp
HedgedConnectionsFactory.cpp
IConnections.cpp
MultiplexedConnections.cpp

)

END()
@ -1,15 +0,0 @@
OWNER(g:clickhouse)

LIBRARY()

PEERDIR(
clickhouse/src/Common
contrib/libs/poco/NetSSL_OpenSSL
)

SRCS(
<? find . -name '*.cpp' | grep -v -F tests | grep -v -F examples | grep -v -F fuzzers | sed 's/^\.\// /' | sort ?>
)

END()
@ -1,43 +0,0 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)

LIBRARY()

ADDINCL(
contrib/libs/icu/common
contrib/libs/icu/i18n
contrib/libs/pdqsort
contrib/libs/lz4
)

PEERDIR(
clickhouse/src/Common
contrib/libs/icu
contrib/libs/pdqsort
contrib/libs/lz4
)

SRCS(
Collator.cpp
ColumnAggregateFunction.cpp
ColumnArray.cpp
ColumnCompressed.cpp
ColumnConst.cpp
ColumnDecimal.cpp
ColumnFixedString.cpp
ColumnFunction.cpp
ColumnLowCardinality.cpp
ColumnMap.cpp
ColumnNullable.cpp
ColumnString.cpp
ColumnTuple.cpp
ColumnVector.cpp
ColumnsCommon.cpp
FilterDescription.cpp
IColumn.cpp
MaskOperations.cpp
getLeastSuperColumn.cpp

)

END()
@ -1,23 +0,0 @@
OWNER(g:clickhouse)

LIBRARY()

ADDINCL(
contrib/libs/icu/common
contrib/libs/icu/i18n
contrib/libs/pdqsort
contrib/libs/lz4
)

PEERDIR(
clickhouse/src/Common
contrib/libs/icu
contrib/libs/pdqsort
contrib/libs/lz4
)

SRCS(
<? find . -name '*.cpp' | grep -v -F tests | grep -v -F examples | grep -v -F fuzzers | sed 's/^\.\// /' | sort ?>
)

END()
@ -34,6 +34,10 @@ using RWLock = std::shared_ptr<RWLockImpl>;
/// - SELECT thread 1 locks in the Read mode
/// - ALTER tries to lock in the Write mode (waits for SELECT thread 1)
/// - SELECT thread 2 tries to lock in the Read mode (waits for ALTER)
///
/// NOTE: it is dangerous to acquire a lock with NO_QUERY, because the FastPath doesn't
/// exist for this case, and the deadlock described in the note above
/// may occur in case of recursive locking.
class RWLockImpl : public std::enable_shared_from_this<RWLockImpl>
{
public:
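A hedged illustration of the blocking sequence the comment describes (the query ids and timeout value are invented; getLock's exact signature follows this header):

    auto lock = RWLockImpl::create();
    /// SELECT thread 1: granted immediately.
    auto read_1 = lock->getLock(RWLockImpl::Read, "query_1", std::chrono::milliseconds(1000));
    /// ALTER: queued behind read_1.
    auto write = lock->getLock(RWLockImpl::Write, "query_2", std::chrono::milliseconds(1000));
    /// A read under NO_QUERY has no FastPath, so a recursive read taken here by
    /// thread 1 queues behind the writer, which in turn waits for thread 1: deadlock.
    auto read_2 = lock->getLock(RWLockImpl::Read, RWLockImpl::NO_QUERY, std::chrono::milliseconds(1000));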
@ -145,7 +145,6 @@ protected:
Poco::Logger * log = nullptr;

friend class CurrentThread;
friend class PushingToViewsBlockOutputStream;

/// Use ptr not to add extra dependencies in the header
std::unique_ptr<RUsageCounters> last_rusage;
@ -188,6 +187,11 @@ public:
return query_context.lock();
}

void disableProfiling()
{
query_profiled_enabled = false;
}

/// Starts a new query and creates a new thread group for it; the current thread becomes the master thread of the query
void initializeQuery();

@ -222,6 +226,7 @@ public:
/// Detaches thread from the thread group and the query, dumps performance counters if they have not been dumped
void detachQuery(bool exit_if_already_detached = false, bool thread_exits = false);

void logToQueryViewsLog(const ViewRuntimeData & vinfo);

protected:
void applyQuerySettings();
@ -234,7 +239,6 @@ protected:

void logToQueryThreadLog(QueryThreadLog & thread_log, const String & current_database, std::chrono::time_point<std::chrono::system_clock> now);

void logToQueryViewsLog(const ViewRuntimeData & vinfo);

void assertState(const std::initializer_list<int> & permitted_states, const char * description = nullptr) const;
@ -1,133 +0,0 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)

LIBRARY()

ADDINCL (
GLOBAL clickhouse/src
contrib/libs/libcpuid
contrib/libs/libunwind/include
GLOBAL contrib/restricted/dragonbox
)

PEERDIR(
clickhouse/base/common
clickhouse/base/pcg-random
clickhouse/base/widechar_width
contrib/libs/libcpuid
contrib/libs/openssl
contrib/libs/poco/NetSSL_OpenSSL
contrib/libs/re2
contrib/restricted/dragonbox
)

INCLUDE(${ARCADIA_ROOT}/clickhouse/cmake/yandex/ya.make.versions.inc)

SRCS(
ActionLock.cpp
AlignedBuffer.cpp
Allocator.cpp
ClickHouseRevision.cpp
Config/AbstractConfigurationComparison.cpp
Config/ConfigProcessor.cpp
Config/ConfigReloader.cpp
Config/YAMLParser.cpp
Config/configReadClient.cpp
CurrentMemoryTracker.cpp
CurrentMetrics.cpp
CurrentThread.cpp
DNSResolver.cpp
Dwarf.cpp
Elf.cpp
Epoll.cpp
ErrorCodes.cpp
Exception.cpp
FieldVisitorDump.cpp
FieldVisitorHash.cpp
FieldVisitorSum.cpp
FieldVisitorToString.cpp
FieldVisitorWriteBinary.cpp
FileChecker.cpp
IO.cpp
IPv6ToBinary.cpp
IntervalKind.cpp
JSONBuilder.cpp
Macros.cpp
MemoryStatisticsOS.cpp
MemoryTracker.cpp
OpenSSLHelpers.cpp
OptimizedRegularExpression.cpp
PODArray.cpp
PipeFDs.cpp
ProcfsMetricsProvider.cpp
ProfileEvents.cpp
ProgressIndication.cpp
QueryProfiler.cpp
RWLock.cpp
RemoteHostFilter.cpp
SensitiveDataMasker.cpp
SettingsChanges.cpp
SharedLibrary.cpp
ShellCommand.cpp
StackTrace.cpp
StatusFile.cpp
StatusInfo.cpp
Stopwatch.cpp
StringUtils/StringUtils.cpp
StudentTTest.cpp
SymbolIndex.cpp
TLDListsHolder.cpp
TaskStatsInfoGetter.cpp
TerminalSize.cpp
ThreadFuzzer.cpp
ThreadPool.cpp
ThreadProfileEvents.cpp
ThreadStatus.cpp
Throttler.cpp
TimerDescriptor.cpp
TraceCollector.cpp
UTF8Helpers.cpp
UnicodeBar.cpp
VersionNumber.cpp
WeakHash.cpp
ZooKeeper/IKeeper.cpp
ZooKeeper/TestKeeper.cpp
ZooKeeper/ZooKeeper.cpp
ZooKeeper/ZooKeeperCommon.cpp
ZooKeeper/ZooKeeperConstants.cpp
ZooKeeper/ZooKeeperIO.cpp
ZooKeeper/ZooKeeperImpl.cpp
ZooKeeper/ZooKeeperNodeCache.cpp
checkStackSize.cpp
clearPasswordFromCommandLine.cpp
clickhouse_malloc.cpp
createHardLink.cpp
escapeForFileName.cpp
filesystemHelpers.cpp
formatIPv6.cpp
formatReadable.cpp
getExecutablePath.cpp
getHashOfLoadedBinary.cpp
getMappedArea.cpp
getMultipleKeysFromConfig.cpp
getNumberOfPhysicalCPUCores.cpp
hasLinuxCapability.cpp
hex.cpp
isLocalAddress.cpp
isValidUTF8.cpp
malloc.cpp
new_delete.cpp
parseAddress.cpp
parseGlobs.cpp
parseRemoteDescription.cpp
quoteString.cpp
randomSeed.cpp
remapExecutable.cpp
renameat2.cpp
setThreadName.cpp
thread_local_rng.cpp

)

END()
@ -1,30 +0,0 @@
OWNER(g:clickhouse)

LIBRARY()

ADDINCL (
GLOBAL clickhouse/src
contrib/libs/libcpuid
contrib/libs/libunwind/include
GLOBAL contrib/restricted/dragonbox
)

PEERDIR(
clickhouse/base/common
clickhouse/base/pcg-random
clickhouse/base/widechar_width
contrib/libs/libcpuid
contrib/libs/openssl
contrib/libs/poco/NetSSL_OpenSSL
contrib/libs/re2
contrib/restricted/dragonbox
)

INCLUDE(${ARCADIA_ROOT}/clickhouse/cmake/yandex/ya.make.versions.inc)

SRCS(
<? find . -name '*.cpp' | grep -v -F tests | grep -v -F examples | grep -v -F fuzzers | sed 's/^\.\// /' | sort ?>
)

END()
@ -1,42 +0,0 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)

LIBRARY()

ADDINCL(
contrib/libs/lz4
contrib/libs/zstd/include
)

PEERDIR(
clickhouse/src/Common
contrib/libs/lz4
contrib/libs/zstd
)

SRCS(
CachedCompressedReadBuffer.cpp
CheckingCompressedReadBuffer.cpp
CompressedReadBuffer.cpp
CompressedReadBufferBase.cpp
CompressedReadBufferFromFile.cpp
CompressedWriteBuffer.cpp
CompressionCodecDelta.cpp
CompressionCodecDoubleDelta.cpp
CompressionCodecEncrypted.cpp
CompressionCodecGorilla.cpp
CompressionCodecLZ4.cpp
CompressionCodecMultiple.cpp
CompressionCodecNone.cpp
CompressionCodecT64.cpp
CompressionCodecZSTD.cpp
CompressionFactory.cpp
CompressionFactoryAdditions.cpp
ICompressionCodec.cpp
LZ4_decompress_faster.cpp
getCompressionCodecForFile.cpp

)

END()
@ -1,21 +0,0 @@
OWNER(g:clickhouse)

LIBRARY()

ADDINCL(
contrib/libs/lz4
contrib/libs/zstd/include
)

PEERDIR(
clickhouse/src/Common
contrib/libs/lz4
contrib/libs/zstd
)

SRCS(
<? find . -name '*.cpp' | grep -v -F tests | grep -v -F examples | grep -v -F fuzzers | sed 's/^\.\// /' | sort ?>
)

END()
@ -16,6 +16,7 @@ struct Settings;
* and should not be changed by the user without a reason.
*/

#define LIST_OF_COORDINATION_SETTINGS(M) \
M(Milliseconds, session_timeout_ms, Coordination::DEFAULT_SESSION_TIMEOUT_MS, "Default client session timeout", 0) \
M(Milliseconds, operation_timeout_ms, Coordination::DEFAULT_OPERATION_TIMEOUT_MS, "Default client operation timeout", 0) \
@ -36,7 +37,8 @@ struct Settings;
M(UInt64, max_requests_batch_size, 100, "Max size of batch in requests count before it will be sent to RAFT", 0) \
M(Bool, quorum_reads, false, "Execute read requests as writes through the whole RAFT consensus with similar speed", 0) \
M(Bool, force_sync, true, "Call fsync on each change in RAFT changelog", 0) \
M(Bool, compress_logs, true, "Write compressed coordination logs in ZSTD format", 0)
M(Bool, compress_logs, true, "Write compressed coordination logs in ZSTD format", 0) \
M(Bool, compress_snapshots_with_zstd_format, true, "Write compressed snapshots in ZSTD format (instead of custom LZ4)", 0)

DECLARE_SETTINGS_TRAITS(CoordinationSettingsTraits, LIST_OF_COORDINATION_SETTINGS)
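With compress_snapshots_with_zstd_format left at its default of true, new Keeper snapshots are written as standard ZSTD frames; setting it to false keeps the previous custom LZ4 block format. Reading is format-agnostic either way, since the snapshot manager detects the compression per file (see isZstdCompressed below).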
@ -10,6 +10,7 @@
#include <IO/ReadBufferFromFile.h>
#include <IO/copyData.h>
#include <filesystem>
#include <memory>

namespace DB
{
@ -32,9 +33,12 @@ namespace
return parse<uint64_t>(name_parts[1]);
}

std::string getSnapshotFileName(uint64_t up_to_log_idx)
std::string getSnapshotFileName(uint64_t up_to_log_idx, bool compress_zstd)
{
return std::string{"snapshot_"} + std::to_string(up_to_log_idx) + ".bin";
auto base = std::string{"snapshot_"} + std::to_string(up_to_log_idx) + ".bin";
if (compress_zstd)
base += ".zstd";
return base;
}
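For illustration, the naming scheme the new parameter produces:

    getSnapshotFileName(100, /*compress_zstd=*/ true);   /// "snapshot_100.bin.zstd"
    getSnapshotFileName(100, /*compress_zstd=*/ false);  /// "snapshot_100.bin"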
std::string getBaseName(const String & path)
@ -218,6 +222,7 @@ SnapshotMetadataPtr KeeperStorageSnapshot::deserialize(KeeperStorage & storage,
storage.zxid = result->get_last_log_idx();
storage.session_id_counter = session_id;

/// Before V1 we serialized ACL without acl_map
if (current_version >= SnapshotVersion::V1)
{
size_t acls_map_size;
@ -338,9 +343,13 @@ KeeperStorageSnapshot::~KeeperStorageSnapshot()
storage->disableSnapshotMode();
}

KeeperSnapshotManager::KeeperSnapshotManager(const std::string & snapshots_path_, size_t snapshots_to_keep_, const std::string & superdigest_, size_t storage_tick_time_)
KeeperSnapshotManager::KeeperSnapshotManager(
const std::string & snapshots_path_, size_t snapshots_to_keep_,
bool compress_snapshots_zstd_,
const std::string & superdigest_, size_t storage_tick_time_)
: snapshots_path(snapshots_path_)
, snapshots_to_keep(snapshots_to_keep_)
, compress_snapshots_zstd(compress_snapshots_zstd_)
, superdigest(superdigest_)
, storage_tick_time(storage_tick_time_)
{
@ -380,7 +389,7 @@ std::string KeeperSnapshotManager::serializeSnapshotBufferToDisk(nuraft::buffer
{
ReadBufferFromNuraftBuffer reader(buffer);

auto snapshot_file_name = getSnapshotFileName(up_to_log_idx);
auto snapshot_file_name = getSnapshotFileName(up_to_log_idx, compress_snapshots_zstd);
auto tmp_snapshot_file_name = "tmp_" + snapshot_file_name;
std::string tmp_snapshot_path = std::filesystem::path{snapshots_path} / tmp_snapshot_file_name;
std::string new_snapshot_path = std::filesystem::path{snapshots_path} / snapshot_file_name;
@ -426,22 +435,46 @@ nuraft::ptr<nuraft::buffer> KeeperSnapshotManager::deserializeSnapshotBufferFrom
return writer.getBuffer();
}

nuraft::ptr<nuraft::buffer> KeeperSnapshotManager::serializeSnapshotToBuffer(const KeeperStorageSnapshot & snapshot)
nuraft::ptr<nuraft::buffer> KeeperSnapshotManager::serializeSnapshotToBuffer(const KeeperStorageSnapshot & snapshot) const
{
WriteBufferFromNuraftBuffer writer;
CompressedWriteBuffer compressed_writer(writer);
std::unique_ptr<WriteBufferFromNuraftBuffer> writer = std::make_unique<WriteBufferFromNuraftBuffer>();
auto * buffer_raw_ptr = writer.get();
std::unique_ptr<WriteBuffer> compressed_writer;
if (compress_snapshots_zstd)
compressed_writer = wrapWriteBufferWithCompressionMethod(std::move(writer), CompressionMethod::Zstd, 3);
else
compressed_writer = std::make_unique<CompressedWriteBuffer>(*writer);

KeeperStorageSnapshot::serialize(snapshot, compressed_writer);
compressed_writer.finalize();
return writer.getBuffer();
KeeperStorageSnapshot::serialize(snapshot, *compressed_writer);
compressed_writer->finalize();
return buffer_raw_ptr->getBuffer();
}
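Note the buffer_raw_ptr captured above: in the ZSTD branch the writer unique_ptr is moved into the compression wrapper and becomes empty, so the raw pointer is saved first in order to call getBuffer() after finalize(). In the LZ4 branch the writer is not moved, and the reference held by CompressedWriteBuffer stays valid for the rest of the function.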
bool KeeperSnapshotManager::isZstdCompressed(nuraft::ptr<nuraft::buffer> buffer)
{
static constexpr uint32_t ZSTD_COMPRESSED_MAGIC = 0xFD2FB528;
ReadBufferFromNuraftBuffer reader(buffer);
uint32_t magic_from_buffer;
reader.readStrict(reinterpret_cast<char *>(&magic_from_buffer), sizeof(magic_from_buffer));
buffer->pos(0);
return magic_from_buffer == ZSTD_COMPRESSED_MAGIC;
}
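0xFD2FB528 is the standard ZSTD frame magic number, read here as a little-endian uint32 -- exactly what the ZSTD write path above emits at offset 0; buffer->pos(0) then rewinds the NuRaft buffer so the full deserialization can start from the beginning.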
SnapshotMetaAndStorage KeeperSnapshotManager::deserializeSnapshotFromBuffer(nuraft::ptr<nuraft::buffer> buffer) const
{
ReadBufferFromNuraftBuffer reader(buffer);
CompressedReadBuffer compressed_reader(reader);
bool is_zstd_compressed = isZstdCompressed(buffer);

std::unique_ptr<ReadBufferFromNuraftBuffer> reader = std::make_unique<ReadBufferFromNuraftBuffer>(buffer);
std::unique_ptr<ReadBuffer> compressed_reader;

if (is_zstd_compressed)
compressed_reader = wrapReadBufferWithCompressionMethod(std::move(reader), CompressionMethod::Zstd);
else
compressed_reader = std::make_unique<CompressedReadBuffer>(*reader);

auto storage = std::make_unique<KeeperStorage>(storage_tick_time, superdigest);
auto snapshot_metadata = KeeperStorageSnapshot::deserialize(*storage, compressed_reader);
auto snapshot_metadata = KeeperStorageSnapshot::deserialize(*storage, *compressed_reader);
return std::make_pair(snapshot_metadata, std::move(storage));
}
@ -15,10 +15,19 @@ enum SnapshotVersion : uint8_t
V0 = 0,
V1 = 1, /// with ACL map
V2 = 2, /// with 64 bit buffer header
V3 = 3, /// compress snapshots with ZSTD codec
};

static constexpr auto CURRENT_SNAPSHOT_VERSION = SnapshotVersion::V2;
static constexpr auto CURRENT_SNAPSHOT_VERSION = SnapshotVersion::V3;

/// In-memory Keeper snapshot. KeeperStorage is based on a hash map which can be
/// turned into snapshot mode. This operation is fast, and the KeeperStorageSnapshot
/// class does it in its constructor. It also copies iterators from the storage hash table
/// up to some log index under a lock. In its destructor this class turns off snapshot
/// mode for KeeperStorage.
///
/// This representation of a snapshot has to be serialized into a NuRaft
/// buffer and sent over the network or saved to a file.
struct KeeperStorageSnapshot
{
public:
@ -34,12 +43,20 @@ public:
KeeperStorage * storage;

SnapshotVersion version = CURRENT_SNAPSHOT_VERSION;
/// Snapshot metadata
SnapshotMetadataPtr snapshot_meta;
/// Max session id
int64_t session_id;
/// Size of the snapshot container, in number of nodes after the begin iterator,
/// so we have for loop for (i = 0; i < snapshot_container_size; ++i) { doSmth(begin + i); }
size_t snapshot_container_size;
/// Iterator to the start of the storage
KeeperStorage::Container::const_iterator begin;
/// Active sessions and their timeouts
SessionAndTimeout session_and_timeout;
/// Sessions credentials
KeeperStorage::SessionAndAuth session_and_auth;
/// ACLs cache for better performance. Without it we cannot deserialize storage.
std::unordered_map<uint64_t, Coordination::ACLs> acl_map;
};

@ -49,28 +66,42 @@ using CreateSnapshotCallback = std::function<void(KeeperStorageSnapshotPtr &&)>;

using SnapshotMetaAndStorage = std::pair<SnapshotMetadataPtr, KeeperStoragePtr>;

/// Class responsible for snapshot serialization and deserialization. Each snapshot
/// has its path on disk and a log index.
class KeeperSnapshotManager
{
public:
KeeperSnapshotManager(const std::string & snapshots_path_, size_t snapshots_to_keep_, const std::string & superdigest_ = "", size_t storage_tick_time_ = 500);
KeeperSnapshotManager(
const std::string & snapshots_path_, size_t snapshots_to_keep_,
bool compress_snapshots_zstd_ = true, const std::string & superdigest_ = "", size_t storage_tick_time_ = 500);

/// Restore storage from latest available snapshot
SnapshotMetaAndStorage restoreFromLatestSnapshot();

static nuraft::ptr<nuraft::buffer> serializeSnapshotToBuffer(const KeeperStorageSnapshot & snapshot);
/// Compress snapshot and serialize it to buffer
nuraft::ptr<nuraft::buffer> serializeSnapshotToBuffer(const KeeperStorageSnapshot & snapshot) const;

/// Serialize already compressed snapshot to disk (return path)
std::string serializeSnapshotBufferToDisk(nuraft::buffer & buffer, uint64_t up_to_log_idx);

SnapshotMetaAndStorage deserializeSnapshotFromBuffer(nuraft::ptr<nuraft::buffer> buffer) const;

/// Deserialize snapshot with log index up_to_log_idx from disk into compressed nuraft buffer.
nuraft::ptr<nuraft::buffer> deserializeSnapshotBufferFromDisk(uint64_t up_to_log_idx) const;

/// Deserialize latest snapshot from disk into compressed nuraft buffer.
nuraft::ptr<nuraft::buffer> deserializeLatestSnapshotBufferFromDisk();

/// Remove snapshot with this log_index
void removeSnapshot(uint64_t log_idx);

/// Total number of snapshots
size_t totalSnapshots() const
{
return existing_snapshots.size();
}

/// The freshest snapshot log index we have
size_t getLatestSnapshotIndex() const
{
if (!existing_snapshots.empty())
@ -80,13 +111,28 @@ public:

private:
void removeOutdatedSnapshotsIfNeeded();

/// Checks the first 4 buffer bytes to make sure the snapshot is compressed with
/// the ZSTD codec.
static bool isZstdCompressed(nuraft::ptr<nuraft::buffer> buffer);

const std::string snapshots_path;
/// How many snapshots to keep before removal
const size_t snapshots_to_keep;
/// All existing snapshots in our path (log_index -> path)
std::map<uint64_t, std::string> existing_snapshots;
/// Compress snapshots in common ZSTD format instead of custom ClickHouse block LZ4 format
const bool compress_snapshots_zstd;
/// Superdigest for deserialization of storage
const std::string superdigest;
/// Storage sessions timeout check interval (also used for deserialization)
size_t storage_tick_time;
};

/// Keeper creates snapshots in a background thread. KeeperStateMachine just creates
/// an in-memory snapshot from storage and pushes a task for its serialization into
/// a special task queue. The background thread checks this queue and, after a snapshot
/// is successfully serialized, notifies the state machine.
struct CreateSnapshotTask
{
KeeperStorageSnapshotPtr snapshot;
@ -46,7 +46,10 @@ KeeperStateMachine::KeeperStateMachine(
const CoordinationSettingsPtr & coordination_settings_,
const std::string & superdigest_)
: coordination_settings(coordination_settings_)
, snapshot_manager(snapshots_path_, coordination_settings->snapshots_to_keep, superdigest_, coordination_settings->dead_session_check_period_ms.totalMicroseconds())
, snapshot_manager(
snapshots_path_, coordination_settings->snapshots_to_keep,
coordination_settings->compress_snapshots_with_zstd_format, superdigest_,
coordination_settings->dead_session_check_period_ms.totalMicroseconds())
, responses_queue(responses_queue_)
, snapshots_queue(snapshots_queue_)
, last_committed_idx(0)
@ -934,8 +934,9 @@ void addNode(DB::KeeperStorage & storage, const std::string & path, const std::s

TEST_P(CoordinationTest, TestStorageSnapshotSimple)
{
auto params = GetParam();
ChangelogDirTest test("./snapshots");
DB::KeeperSnapshotManager manager("./snapshots", 3);
DB::KeeperSnapshotManager manager("./snapshots", 3, params.enable_compression);

DB::KeeperStorage storage(500, "");
addNode(storage, "/hello", "world", 1);
@ -956,7 +957,7 @@ TEST_P(CoordinationTest, TestStorageSnapshotSimple)

auto buf = manager.serializeSnapshotToBuffer(snapshot);
manager.serializeSnapshotBufferToDisk(*buf, 2);
EXPECT_TRUE(fs::exists("./snapshots/snapshot_2.bin"));
EXPECT_TRUE(fs::exists("./snapshots/snapshot_2.bin" + params.extension));

auto debuf = manager.deserializeSnapshotBufferFromDisk(2);
@ -981,8 +982,9 @@ TEST_P(CoordinationTest, TestStorageSnapshotSimple)

TEST_P(CoordinationTest, TestStorageSnapshotMoreWrites)
{
auto params = GetParam();
ChangelogDirTest test("./snapshots");
DB::KeeperSnapshotManager manager("./snapshots", 3);
DB::KeeperSnapshotManager manager("./snapshots", 3, params.enable_compression);

DB::KeeperStorage storage(500, "");
storage.getSessionID(130);
@ -1005,7 +1007,7 @@ TEST_P(CoordinationTest, TestStorageSnapshotMoreWrites)

auto buf = manager.serializeSnapshotToBuffer(snapshot);
manager.serializeSnapshotBufferToDisk(*buf, 50);
EXPECT_TRUE(fs::exists("./snapshots/snapshot_50.bin"));
EXPECT_TRUE(fs::exists("./snapshots/snapshot_50.bin" + params.extension));

auto debuf = manager.deserializeSnapshotBufferFromDisk(50);
@ -1021,8 +1023,9 @@ TEST_P(CoordinationTest, TestStorageSnapshotMoreWrites)

TEST_P(CoordinationTest, TestStorageSnapshotManySnapshots)
{
auto params = GetParam();
ChangelogDirTest test("./snapshots");
DB::KeeperSnapshotManager manager("./snapshots", 3);
DB::KeeperSnapshotManager manager("./snapshots", 3, params.enable_compression);

DB::KeeperStorage storage(500, "");
storage.getSessionID(130);
@ -1037,14 +1040,14 @@ TEST_P(CoordinationTest, TestStorageSnapshotManySnapshots)
DB::KeeperStorageSnapshot snapshot(&storage, j * 50);
auto buf = manager.serializeSnapshotToBuffer(snapshot);
manager.serializeSnapshotBufferToDisk(*buf, j * 50);
EXPECT_TRUE(fs::exists(std::string{"./snapshots/snapshot_"} + std::to_string(j * 50) + ".bin"));
EXPECT_TRUE(fs::exists(std::string{"./snapshots/snapshot_"} + std::to_string(j * 50) + ".bin" + params.extension));
}

EXPECT_FALSE(fs::exists("./snapshots/snapshot_50.bin"));
EXPECT_FALSE(fs::exists("./snapshots/snapshot_100.bin"));
EXPECT_TRUE(fs::exists("./snapshots/snapshot_150.bin"));
EXPECT_TRUE(fs::exists("./snapshots/snapshot_200.bin"));
EXPECT_TRUE(fs::exists("./snapshots/snapshot_250.bin"));
EXPECT_FALSE(fs::exists("./snapshots/snapshot_50.bin" + params.extension));
EXPECT_FALSE(fs::exists("./snapshots/snapshot_100.bin" + params.extension));
EXPECT_TRUE(fs::exists("./snapshots/snapshot_150.bin" + params.extension));
EXPECT_TRUE(fs::exists("./snapshots/snapshot_200.bin" + params.extension));
EXPECT_TRUE(fs::exists("./snapshots/snapshot_250.bin" + params.extension));

auto [meta, restored_storage] = manager.restoreFromLatestSnapshot();
@ -1059,8 +1062,9 @@ TEST_P(CoordinationTest, TestStorageSnapshotManySnapshots)

TEST_P(CoordinationTest, TestStorageSnapshotMode)
{
auto params = GetParam();
ChangelogDirTest test("./snapshots");
DB::KeeperSnapshotManager manager("./snapshots", 3);
DB::KeeperSnapshotManager manager("./snapshots", 3, params.enable_compression);
DB::KeeperStorage storage(500, "");
for (size_t i = 0; i < 50; ++i)
{
@ -1087,7 +1091,7 @@ TEST_P(CoordinationTest, TestStorageSnapshotMode)
auto buf = manager.serializeSnapshotToBuffer(snapshot);
manager.serializeSnapshotBufferToDisk(*buf, 50);
}
EXPECT_TRUE(fs::exists("./snapshots/snapshot_50.bin"));
EXPECT_TRUE(fs::exists("./snapshots/snapshot_50.bin" + params.extension));
EXPECT_EQ(storage.container.size(), 26);
storage.clearGarbageAfterSnapshot();
EXPECT_EQ(storage.container.snapshotSize(), 26);
@ -1110,8 +1114,9 @@ TEST_P(CoordinationTest, TestStorageSnapshotMode)

TEST_P(CoordinationTest, TestStorageSnapshotBroken)
{
auto params = GetParam();
ChangelogDirTest test("./snapshots");
DB::KeeperSnapshotManager manager("./snapshots", 3);
DB::KeeperSnapshotManager manager("./snapshots", 3, params.enable_compression);
DB::KeeperStorage storage(500, "");
for (size_t i = 0; i < 50; ++i)
{
@ -1122,10 +1127,10 @@ TEST_P(CoordinationTest, TestStorageSnapshotBroken)
auto buf = manager.serializeSnapshotToBuffer(snapshot);
manager.serializeSnapshotBufferToDisk(*buf, 50);
}
EXPECT_TRUE(fs::exists("./snapshots/snapshot_50.bin"));
EXPECT_TRUE(fs::exists("./snapshots/snapshot_50.bin" + params.extension));

/// Let's corrupt file
DB::WriteBufferFromFile plain_buf("./snapshots/snapshot_50.bin", DBMS_DEFAULT_BUFFER_SIZE, O_APPEND | O_CREAT | O_WRONLY);
DB::WriteBufferFromFile plain_buf("./snapshots/snapshot_50.bin" + params.extension, DBMS_DEFAULT_BUFFER_SIZE, O_APPEND | O_CREAT | O_WRONLY);
plain_buf.truncate(34);
plain_buf.sync();

@ -1464,6 +1469,52 @@ TEST_P(CoordinationTest, TestCompressedLogsMultipleRewrite)

}

TEST_P(CoordinationTest, TestStorageSnapshotDifferentCompressions)
{
auto params = GetParam();

ChangelogDirTest test("./snapshots");
DB::KeeperSnapshotManager manager("./snapshots", 3, params.enable_compression);

DB::KeeperStorage storage(500, "");
addNode(storage, "/hello", "world", 1);
addNode(storage, "/hello/somepath", "somedata", 3);
storage.session_id_counter = 5;
storage.zxid = 2;
storage.ephemerals[3] = {"/hello"};
storage.ephemerals[1] = {"/hello/somepath"};
storage.getSessionID(130);
storage.getSessionID(130);

DB::KeeperStorageSnapshot snapshot(&storage, 2);

auto buf = manager.serializeSnapshotToBuffer(snapshot);
manager.serializeSnapshotBufferToDisk(*buf, 2);
EXPECT_TRUE(fs::exists("./snapshots/snapshot_2.bin" + params.extension));

DB::KeeperSnapshotManager new_manager("./snapshots", 3, !params.enable_compression);

auto debuf = new_manager.deserializeSnapshotBufferFromDisk(2);

auto [snapshot_meta, restored_storage] = new_manager.deserializeSnapshotFromBuffer(debuf);

EXPECT_EQ(restored_storage->container.size(), 3);
EXPECT_EQ(restored_storage->container.getValue("/").children.size(), 1);
EXPECT_EQ(restored_storage->container.getValue("/hello").children.size(), 1);
EXPECT_EQ(restored_storage->container.getValue("/hello/somepath").children.size(), 0);

EXPECT_EQ(restored_storage->container.getValue("/").data, "");
EXPECT_EQ(restored_storage->container.getValue("/hello").data, "world");
EXPECT_EQ(restored_storage->container.getValue("/hello/somepath").data, "somedata");
EXPECT_EQ(restored_storage->session_id_counter, 7);
EXPECT_EQ(restored_storage->zxid, 2);
EXPECT_EQ(restored_storage->ephemerals.size(), 2);
EXPECT_EQ(restored_storage->ephemerals[3].size(), 1);
EXPECT_EQ(restored_storage->ephemerals[1].size(), 1);
EXPECT_EQ(restored_storage->session_and_timeout.size(), 2);
}

INSTANTIATE_TEST_SUITE_P(CoordinationTestSuite,
CoordinationTest,
::testing::ValuesIn(std::initializer_list<CompressionParam>{
@ -1,13 +0,0 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)

LIBRARY()

PEERDIR(
clickhouse/src/Common
)

SRCS(
)

END()
@ -1,12 +0,0 @@
OWNER(g:clickhouse)

LIBRARY()

PEERDIR(
clickhouse/src/Common
)

SRCS(
)

END()
@ -15,6 +15,7 @@
#include <Processors/Executors/PipelineExecutor.h>
#include <Processors/Sources/SourceFromInputStream.h>
#include <Processors/Sinks/SinkToStorage.h>
#include <Processors/Sinks/EmptySink.h>

#include <Core/ExternalTable.h>
#include <Poco/Net/MessageHeader.h>
@ -160,14 +161,17 @@ void ExternalTablesHandler::handlePart(const Poco::Net::MessageHeader & header,
auto storage = temporary_table.getTable();
getContext()->addExternalTable(data->table_name, std::move(temporary_table));
auto sink = storage->write(ASTPtr(), storage->getInMemoryMetadataPtr(), getContext());
auto exception_handling = std::make_shared<EmptySink>(sink->getOutputPort().getHeader());

/// Write data
data->pipe->resize(1);

connect(*data->pipe->getOutputPort(0), sink->getPort());
connect(*data->pipe->getOutputPort(0), sink->getInputPort());
connect(sink->getOutputPort(), exception_handling->getPort());

auto processors = Pipe::detachProcessors(std::move(*data->pipe));
processors.push_back(std::move(sink));
processors.push_back(std::move(exception_handling));

auto executor = std::make_shared<PipelineExecutor>(processors);
executor->execute(/*num_threads = */ 1);
@ -161,6 +161,7 @@ class IColumn;
\
M(Bool, force_index_by_date, false, "Throw an exception if there is a partition key in a table, and it is not used.", 0) \
M(Bool, force_primary_key, false, "Throw an exception if there is primary key in a table, and it is not used.", 0) \
M(Bool, use_skip_indexes, true, "Use data skipping indexes during query execution.", 0) \
M(String, force_data_skipping_indices, "", "Comma separated list of strings or literals with the name of the data skipping indices that should be used during query execution, otherwise an exception will be thrown.", 0) \
\
M(Float, max_streams_to_max_threads_ratio, 1, "Allows you to use more sources than the number of threads - to more evenly distribute work across threads. It is assumed that this is a temporary solution, since it will be possible in the future to make the number of sources equal to the number of threads, but for each source to dynamically select available work for itself.", 0) \
@ -349,7 +350,7 @@ class IColumn;
M(UInt64, max_memory_usage, 0, "Maximum memory usage for processing of single query. Zero means unlimited.", 0) \
M(UInt64, max_memory_usage_for_user, 0, "Maximum memory usage for processing all concurrently running queries for the user. Zero means unlimited.", 0) \
M(UInt64, max_untracked_memory, (4 * 1024 * 1024), "Small allocations and deallocations are grouped in thread local variable and tracked or profiled only when amount (in absolute value) becomes larger than specified value. If the value is higher than 'memory_profiler_step' it will be effectively lowered to 'memory_profiler_step'.", 0) \
M(UInt64, memory_profiler_step, 0, "Whenever query memory usage becomes larger than every next step in number of bytes the memory profiler will collect the allocating stack trace. Zero means disabled memory profiler. Values lower than a few megabytes will slow down query processing.", 0) \
M(UInt64, memory_profiler_step, (4 * 1024 * 1024), "Whenever query memory usage becomes larger than every next step in number of bytes the memory profiler will collect the allocating stack trace. Zero means disabled memory profiler. Values lower than a few megabytes will slow down query processing.", 0) \
M(Float, memory_profiler_sample_probability, 0., "Collect random allocations and deallocations and write them into system.trace_log with 'MemorySample' trace_type. The probability is for every alloc/free regardless to the size of the allocation. Note that sampling happens only when the amount of untracked memory exceeds 'max_untracked_memory'. You may want to set 'max_untracked_memory' to 0 for extra fine grained sampling.", 0) \
\
M(UInt64, max_network_bandwidth, 0, "The maximum speed of data exchange over the network in bytes per second for a query. Zero means unlimited.", 0) \
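The second change here is a default-value change: memory_profiler_step previously defaulted to 0 (memory profiler disabled) and now defaults to 4 MiB, so the allocating stack trace of a query is collected roughly every 4 MiB of memory-usage growth, matching the max_untracked_memory batching threshold just above it.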
@ -1,51 +0,0 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)

LIBRARY()

PEERDIR(
clickhouse/src/Common
contrib/libs/sparsehash
contrib/restricted/boost/libs
)

SRCS(
BackgroundSchedulePool.cpp
BaseSettings.cpp
Block.cpp
BlockInfo.cpp
ColumnWithTypeAndName.cpp
ExternalResultDescription.cpp
ExternalTable.cpp
Field.cpp
MySQL/Authentication.cpp
MySQL/IMySQLReadPacket.cpp
MySQL/IMySQLWritePacket.cpp
MySQL/MySQLClient.cpp
MySQL/MySQLGtid.cpp
MySQL/MySQLReplication.cpp
MySQL/PacketEndpoint.cpp
MySQL/PacketsConnection.cpp
MySQL/PacketsGeneric.cpp
MySQL/PacketsProtocolText.cpp
MySQL/PacketsReplication.cpp
NamesAndTypes.cpp
PostgreSQL/Connection.cpp
PostgreSQL/PoolWithFailover.cpp
PostgreSQL/Utils.cpp
PostgreSQL/insertPostgreSQLValue.cpp
PostgreSQLProtocol.cpp
QueryProcessingStage.cpp
ServerUUID.cpp
Settings.cpp
SettingsEnums.cpp
SettingsFields.cpp
SettingsQuirks.cpp
SortDescription.cpp
UUID.cpp
iostream_debug_helpers.cpp

)

END()
@ -1,16 +0,0 @@
OWNER(g:clickhouse)

LIBRARY()

PEERDIR(
clickhouse/src/Common
contrib/libs/sparsehash
contrib/restricted/boost/libs
)

SRCS(
<? find . -name '*.cpp' | grep -v -F tests | grep -v -F examples | grep -v -F fuzzers | sed 's/^\.\// /' | sort ?>
)

END()