Updated "star_schema" dataset description (still not ready) [#CLICKHOUSE-3].

Alexey Milovidov 2017-04-12 05:00:18 +03:00
parent 6256a40199
commit 7b78f47774

@@ -1,8 +1,28 @@
git clone https://github.com/electrum/ssb-dbgen.git
In the `shared.h` file, change `MAXAGG_LEN` from 10 to 20.
In the makefile, change `MACHINE` to Linux.
Compile dbgen: https://github.com/vadimtk/ssb-dbgen
```
git clone git@github.com:vadimtk/ssb-dbgen.git
cd ssb-dbgen
make
```
You will see some warnings during compilation. This is OK.
Place `dbgen` and `dists.dss` in a directory with at least 200 GB of free disk space.
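Before generating data, it may be worth checking that the target directory really has the 200 GB mentioned above. A minimal sketch, assuming a POSIX `df` and `awk` are available; the `has_free_space` helper is hypothetical and not part of dbgen:

```shell
# Hypothetical helper: succeed if DIR has at least MIN_GB gigabytes free.
has_free_space() {
    dir=$1
    min_gb=$2
    # With -P and -k, df prints one line per filesystem in POSIX format;
    # column 4 is the available space in 1024-byte blocks.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Check the current directory against the 200 GB requirement.
if has_free_space . 200; then
    echo "enough space for the generated data"
else
    echo "less than 200 GB free here"
fi
```

Run it from the directory where `dbgen` and `dists.dss` were placed.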
Generate data:
```
./dbgen -s 1000 -T c
./dbgen -s 1000 -T l
```
Create tables in ClickHouse: https://github.com/alexey-milovidov/ssb-clickhouse/blob/cc8fd4d9b99859d12a6aaf46b5f1195c7a1034f9/create.sql
For a single-node setup, create only the MergeTree tables.
For a distributed setup, you must configure the cluster `perftest_3shards_1replicas` in the configuration file.
Then create the MergeTree tables on each node, followed by the Distributed tables.
Load the data (for a distributed setup, change `customer` to `customerd`):
```
cat customer.tbl | sed 's/$/2000-01-01/' | clickhouse-client --query "INSERT INTO customer FORMAT CSV"
```
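The `sed` expression in the load command appends a constant date to the end of every row. The sketch below previews that transformation on a made-up sample row, so it can be checked before loading the full file. It assumes each row ends with a trailing delimiter, so the appended value forms one extra column. The `append_date` and `target_table` helpers are hypothetical names used only for illustration:

```shell
# Append the constant date to every input row (same sed expression as above).
append_date() {
    sed 's/$/2000-01-01/'
}

# Hypothetical helper: choose the target table for the setup in use.
# "distributed" selects customerd; anything else selects customer.
target_table() {
    if [ "$1" = "distributed" ]; then
        echo "customerd"
    else
        echo "customer"
    fi
}

# Preview on a made-up row; real rows come from customer.tbl.
printf '1,Customer#000000001,\n' | append_date
# prints: 1,Customer#000000001,2000-01-01
```

With these helpers, the load step could be written as `append_date < customer.tbl | clickhouse-client --query "INSERT INTO $(target_table distributed) FORMAT CSV"`.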