From 7b78f47774ac47444ee87e5b9fbe5c4565fbe164 Mon Sep 17 00:00:00 2001
From: Alexey Milovidov
Date: Wed, 12 Apr 2017 05:00:18 +0300
Subject: [PATCH] Updated "star_schema" dataset description (still not ready) [#CLICKHOUSE-3].

---
 doc/example_datasets/star_schema.txt | 28 ++++++++++++++++++++++++----
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/doc/example_datasets/star_schema.txt b/doc/example_datasets/star_schema.txt
index 3a5224f782b..540ea924b36 100644
--- a/doc/example_datasets/star_schema.txt
+++ b/doc/example_datasets/star_schema.txt
@@ -1,8 +1,28 @@
-git clone https://github.com/electrum/ssb-dbgen.git
-
-In shared.h file change MAXAGG_LEN from 10 to 20.
-In makefile change MACHINE to Linux
+Compile dbgen: https://github.com/vadimtk/ssb-dbgen
 
+```
+git clone git@github.com:vadimtk/ssb-dbgen.git
+cd ssb-dbgen
 make
+```
 
+You will see some warnings. This is OK.
+Place `dbgen` and `dists.dss` in a directory with at least 200 GB of free space.
+Generate data:
+
+```
+./dbgen -s 1000 -T c
+./dbgen -s 1000 -T l
+```
+
+Create tables in ClickHouse: https://github.com/alexey-milovidov/ssb-clickhouse/blob/cc8fd4d9b99859d12a6aaf46b5f1195c7a1034f9/create.sql
+
+For a single-node setup, create only the MergeTree tables.
+For a Distributed setup, you must configure the cluster `perftest_3shards_1replicas` in the configuration file.
+Then create the MergeTree tables on each node, and then create the Distributed tables.
+
+Load data (change `customer` to `customerd` for the distributed setup):
+```
+cat customer.tbl | sed 's/$/2000-01-01/' | clickhouse-client --query "INSERT INTO customer FORMAT CSV"
+```
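
The `sed 's/$/2000-01-01/'` step in the load command appends a constant date to the end of every row before it reaches `clickhouse-client`. A minimal sketch of that step in isolation follows. The helper name `append_date` and the sample row are hypothetical, and the sketch assumes each row in `customer.tbl` ends with a trailing field separator, so the appended text becomes one extra CSV field.

```shell
# Hypothetical helper wrapping the sed expression from the load command.
append_date() {
    # '$' matches the end of each line, so the date is appended to every row.
    sed 's/$/2000-01-01/'
}

# Made-up sample row (not real dbgen output), ending with a trailing comma.
printf '1,Customer#000000001,\n' | append_date
```

Running the same expression on `head -n 3 customer.tbl` is a cheap way to inspect the transformed rows before piping the whole file into an `INSERT`.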