mirror of
https://github.com/ClickHouse/ClickHouse.git
synced 2024-09-20 08:40:50 +00:00
Added draft texts [#METR-20000].
parent e4f23a53f1
commit c8aed08fdf
151
doc/drafts/build.txt
Normal file
@@ -0,0 +1,151 @@
# How to build ClickHouse
#
# The build should work on Ubuntu Linux 14.04 or newer.
# With appropriate changes, it should also work on any other Linux distribution.
# The build is not intended to work on Mac OS X.

sudo apt-get install git cmake

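# A minimal sketch of a pre-flight check for the tools installed above.
# The helper name missing_tools is hypothetical and not part of ClickHouse;
# it only relies on the standard "command -v" lookup.

```shell
# Print every command from the argument list that is not found on $PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the tools that the apt-get line above is expected to provide.
MISSING=$(missing_tools git cmake)
if [ -n "$MISSING" ]; then
    echo "Missing tools:" $MISSING >&2
fi
```

# The same helper can be reused later, for example: missing_tools gcc-5 g++-5
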
# Install GCC 5.
# There are several ways to do it.
#
# 1. If you run Ubuntu 15.10 or newer, just do
# sudo apt-get install g++-5
#
# 2. Install from the PPA package.

sudo apt-get install software-properties-common
sudo apt-add-repository ppa:ubuntu-toolchain-r/test
sudo apt-get update
sudo apt-get install gcc-5 g++-5

export THREADS=$(grep -c ^processor /proc/cpuinfo)

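# The THREADS line above depends on the layout of /proc/cpuinfo.
# A possible fallback chain, as a sketch: detect_threads is a hypothetical helper,
# and it assumes that nproc (coreutils) or getconf may be available.

```shell
# Print the number of online CPUs using the first method that is available.
detect_threads() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    elif [ -r /proc/cpuinfo ]; then
        grep -c ^processor /proc/cpuinfo
    else
        getconf _NPROCESSORS_ONLN 2>/dev/null
    fi
}

THREADS=$(detect_threads)
# Fall back to a single thread if the result is empty, zero or not a number.
case "$THREADS" in
    ''|*[!0-9]*|0) THREADS=1 ;;
esac
export THREADS
echo "Building with $THREADS threads"
```
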
# 3. Install GCC 5 from sources.
#
# Download gcc from https://gcc.gnu.org/mirrors.html
# Example:
# wget ftp://ftp.fu-berlin.de/unix/languages/gcc/releases/gcc-5.3.0/gcc-5.3.0.tar.bz2
# tar xf gcc-5.3.0.tar.bz2
# cd gcc-5.3.0
# ./contrib/download_prerequisites
# cd ..
# mkdir gcc-build
# cd gcc-build
# ../gcc-5.3.0/configure --enable-languages=c,c++
# make -j $THREADS
# sudo make install
# hash gcc g++
# gcc --version
# sudo ln -s /usr/local/bin/gcc /usr/local/bin/gcc-5
# sudo ln -s /usr/local/bin/g++ /usr/local/bin/g++-5
# sudo ln -s /usr/local/bin/gcc /usr/local/bin/cc
# sudo ln -s /usr/local/bin/g++ /usr/local/bin/c++
# /usr/local/bin/ should be in $PATH
#
# Note that these installation methods differ.
# When installing from the PPA, the "old C++ ABI" is used by default,
# and when installing from sources, the "new C++ ABI" is used.
# When switching to a different C++ ABI, you need to recompile all C++ libraries,
# otherwise the libraries will not link.
# ClickHouse works with both the old and the new C++ ABI,
# but production releases are built with the old C++ ABI.

export CC=gcc-5
export CXX=g++-5

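# To see which C++ ABI the selected compiler uses by default, one option is to
# inspect the libstdc++ macro _GLIBCXX_USE_CXX11_ABI. This is only a sketch:
# detect_cxx_abi is a hypothetical helper, and it reports "unknown" when the
# compiler is missing or the macro is not defined (for example, with a
# libstdc++ older than GCC 5 or with a different standard library).

```shell
# Print "new", "old" or "unknown" for the given compiler (default: $CXX, then g++).
detect_cxx_abi() {
    compiler=${1:-${CXX:-g++}}
    command -v "$compiler" >/dev/null 2>&1 || { echo unknown; return 0; }
    # libstdc++ defines _GLIBCXX_USE_CXX11_ABI once any of its headers is included;
    # -dM -E dumps all macros that are defined after preprocessing.
    value=$(echo '#include <string>' | "$compiler" -x c++ -dM -E - 2>/dev/null \
        | awk '$2 == "_GLIBCXX_USE_CXX11_ABI" { print $3 }')
    case "$value" in
        1) echo new ;;
        0) echo old ;;
        *) echo unknown ;;
    esac
}

echo "Default C++ ABI of ${CXX:-g++}: $(detect_cxx_abi)"
```
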
# Install required libraries from packages

sudo apt-get install libicu-dev libglib2.0-dev libreadline-dev libzookeeper-mt-dev libmysqlclient-dev libssl-dev unixodbc-dev

# Install a recent version of boost. Version 1.57 or newer is fine.

wget http://downloads.sourceforge.net/project/boost/boost/1.60.0/boost_1_60_0.tar.bz2
tar xf boost_1_60_0.tar.bz2
cd boost_1_60_0
./bootstrap.sh
./b2 --toolset=gcc-5 -j $THREADS
sudo ./b2 install --toolset=gcc-5 -j $THREADS
cd ..

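# A sketch of how the "1.57 or newer" requirement could be verified after installation.
# It reads the BOOST_VERSION macro from boost/version.hpp. The helper name
# boost_version_ok is hypothetical, and the default path assumes that
# "b2 install" used its default /usr/local prefix.

```shell
# Succeed if the given version.hpp declares Boost 1.57 or newer.
boost_version_ok() {
    header=${1:-/usr/local/include/boost/version.hpp}
    [ -r "$header" ] || return 1
    # BOOST_VERSION encodes major * 100000 + minor * 100 + patch,
    # so 1.57.0 is 105700 and 1.60.0 is 106000.
    version=$(awk '$1 == "#define" && $2 == "BOOST_VERSION" { print $3 }' "$header")
    [ -n "$version" ] && [ "$version" -ge 105700 ]
}

if boost_version_ok; then
    echo "Boost is recent enough"
else
    echo "Boost 1.57 or newer was not found in /usr/local/include" >&2
fi
```
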
# Install tcmalloc. The patch is important.

wget https://googledrive.com/host/0B6NtGsLhIcf7MWxMMF9JdTN3UVk/gperftools-2.4.tar.gz
tar -xf gperftools-2.4.tar.gz
cd gperftools-2.4
patch src/static_vars.cc <<END
103c103
< TCMallocGetenvSafe("TCMALLOC_AGGRESSIVE_DECOMMIT"), true);
---
> TCMallocGetenvSafe("TCMALLOC_AGGRESSIVE_DECOMMIT"), false);
END
./configure --enable-minimal
make -j $THREADS
sudo make install
cd ..

# Install mongoclient. This library is needed only for 'external dictionaries' with a MongoDB source. It is rarely used but enabled by default.

sudo apt-get install scons
git clone -b legacy https://github.com/mongodb/mongo-cxx-driver.git
cd mongo-cxx-driver
sudo scons --c++11 --release --cc=$CC --cxx=$CXX --disable-warnings-as-errors -j $THREADS --prefix=/usr/local install
cd ..

# Check out ClickHouse sources.

git clone git@███████████.yandex-team.ru:Metrika/ClickHouse.git # TODO Change path.
cd ClickHouse

# There are two build variants.
# 1. Build release package.

# Install prerequisites to build debian packages.
sudo apt-get install devscripts dupload fakeroot debhelper

# Install a recent version of clang. Clang is embedded into the ClickHouse package and used at runtime.

cd ..
sudo apt-get install subversion
mkdir llvm
cd llvm
svn co http://llvm.org/svn/llvm-project/llvm/tags/RELEASE_380/final llvm
cd llvm/tools
svn co http://llvm.org/svn/llvm-project/cfe/tags/RELEASE_380/final clang
cd ..
cd projects/
svn co http://llvm.org/svn/llvm-project/compiler-rt/tags/RELEASE_380/final compiler-rt
cd ../..
mkdir build
cd build/
cmake -D CMAKE_BUILD_TYPE:STRING=Release ../llvm
make -j $THREADS
sudo make install
hash clang

# You may also build ClickHouse with clang for development purposes.
# For production releases, GCC is used.

# Run release script.
# The clang build above ends in llvm/build, so return to the ClickHouse sources first.
cd ../../ClickHouse
rm -f ../clickhouse*.deb
./release

# debsign and dupload will not work by default.
# That is fine. You will find the built packages in the parent directory.
# ls -l ../clickhouse*.deb

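# A sketch of a check that the release step produced packages before they are installed.
# The helper name list_built_packages is hypothetical. It only looks for
# files matching clickhouse*.deb, as in the ls command above.

```shell
# Print the clickhouse*.deb files found in a directory (default: the parent
# directory) and return a non-zero status if there are none.
list_built_packages() {
    dir=${1:-..}
    status=1
    for pkg in "$dir"/clickhouse*.deb; do
        # An unmatched glob stays literal, so skip names that do not exist.
        [ -e "$pkg" ] || continue
        printf '%s\n' "$pkg"
        status=0
    done
    return "$status"
}

list_built_packages .. || echo "No clickhouse packages found in the parent directory" >&2
```
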
# Note that using debian packages is not required.
# ClickHouse has no runtime dependencies except libc,
# so it can work on almost any Linux.

# Install the freshly built packages on a development server.
sudo dpkg -i ../clickhouse*.deb
sudo service clickhouse-server start

# 2. Build for working with the code.
#
# mkdir build
# cd build
# cmake ..
# make -j $THREADS
# cd ..
85
doc/drafts/site.txt
Normal file
@@ -0,0 +1,85 @@
ClickHouse is a free column-oriented DBMS for big data.

ClickHouse powers Yandex.Metrica, the second largest web analytics system in the world.
In Yandex.Metrica, all incoming data is ingested into ClickHouse in real time (about 20 billion events each day).
Currently, Yandex.Metrica has more than 13 trillion records in its ClickHouse-powered database.
It is used for fully customizable reports that are generated on the fly, directly from non-aggregated data.
Yandex.Metrica allows customers to slice and dice data in every detail, even for high-traffic sites, with instant results.

ClickHouse is the only open-source system capable of doing this.


Big Data


Linearly scalable

ClickHouse allows you to add servers to a cluster when necessary.
For example, in Yandex.Metrica, the main cluster has grown from 60 to 394 servers in two years.
The servers are placed in six geographically distributed datacenters.

ClickHouse uses as much of the available hardware as it can to process queries as fast as possible.
We achieve peak performance of more than 2 terabytes per second for a single query (data after decompression, only used columns).

ClickHouse scales well both vertically and horizontally.
We have installations with more than two trillion rows per single node, and other installations with 100 TB of storage per single node.


Efficient use of hardware

ClickHouse is space and time efficient. All data is stored compressed. Compression works surprisingly well, thanks to the column store.
ClickHouse constantly maintains data locality for loaded data. This minimizes the number of seeks for range queries, so ClickHouse works fine on cheap rotational drives.
ClickHouse is also CPU efficient,
ClickHouse is using IO throughput in


Fast

We are proud of the high performance of ClickHouse. Query processing throughput on a single server usually ranges from hundreds of millions to more than a billion rows per second, and reaches tens of gigabytes per second or more. It is hard to believe that data can be processed at such rates. But you don't need to believe it, because ClickHouse actually does that.

In our performance tests, ClickHouse runs several times faster than the best available commercial column-oriented DBMS.


Fault tolerance


Feature rich


Simple and handy


Stable


Opens new possibilities


True column-oriented
Vectorized query execution
Data compression
Parallel and distributed query execution
Realtime data ingestion
On-disk data locality
Online query processing
Cross-datacenter replication
High availability
SQL support
Support for approximate query processing
Sketching data structures
Full support of IPv6
Features for web analytics
State-of-the-art algorithms
Clean documented code


Web and application analytics
Advertisement networks and RTB
Telecommunications analytics
E-commerce analytics
Analytics for information security
Monitoring and telemetry
Business intelligence
Analytics for Internet of Things