---
machine_translated: true
machine_translated_rev: 72537a2d527c63c07aa5d2361a8829f3895cf2bd
toc_priority: 17
toc_title: AMPLab Big Data Benchmark
---

# AMPLab Big Data Benchmark {#amplab-big-data-benchmark}

See https://amplab.cs.berkeley.edu/benchmark/

Sign up for a free account at https://aws.amazon.com. It requires a credit card, an email address, and a phone number. Get a new access key at https://console.aws.amazon.com/iam/home?nc2=h\_m\_sc\#security\_credential

Run the following in the console:

``` bash
$ sudo apt-get install s3cmd
$ mkdir tiny; cd tiny;
$ s3cmd sync s3://big-data-benchmark/pavlo/text-deflate/tiny/ .
$ cd ..
$ mkdir 1node; cd 1node;
$ s3cmd sync s3://big-data-benchmark/pavlo/text-deflate/1node/ .
$ cd ..
$ mkdir 5nodes; cd 5nodes;
$ s3cmd sync s3://big-data-benchmark/pavlo/text-deflate/5nodes/ .
$ cd ..
```
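
The three download steps differ only in the directory name, so they can be folded into a loop. The following is a minimal sketch rather than part of the original instructions: `sync_dataset` and `DRY_RUN` are hypothetical names introduced here, and the sketch assumes `s3cmd` is already installed and configured with your access key. With `DRY_RUN=1` (the default in this sketch) it only prints the commands it would run.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the original guide): fetch one dataset size
# into a directory of the same name, mirroring the manual steps.
DRY_RUN="${DRY_RUN:-1}"   # assumption: print commands by default instead of downloading

sync_dataset() {
    local size="$1"
    local src="s3://big-data-benchmark/pavlo/text-deflate/${size}/"
    if [ "$DRY_RUN" = "1" ]; then
        echo "mkdir -p ${size} && cd ${size} && s3cmd sync ${src} ."
    else
        # Run in a subshell so the caller's working directory is unchanged.
        mkdir -p "$size" && (cd "$size" && s3cmd sync "$src" .)
    fi
}

for size in tiny 1node 5nodes; do
    sync_dataset "$size"
done
```

Setting `DRY_RUN=0` before running the script performs the actual downloads.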

Run the following ClickHouse queries:

``` sql
CREATE TABLE rankings_tiny
(
    pageURL String,
    pageRank UInt32,
    avgDuration UInt32
) ENGINE = Log;

CREATE TABLE uservisits_tiny
(
    sourceIP String,
    destinationURL String,
    visitDate Date,
    adRevenue Float32,
    UserAgent String,
    cCode FixedString(3),
    lCode FixedString(6),
    searchWord String,
    duration UInt32
) ENGINE = MergeTree(visitDate, visitDate, 8192);

CREATE TABLE rankings_1node
(
    pageURL String,
    pageRank UInt32,
    avgDuration UInt32
) ENGINE = Log;

CREATE TABLE uservisits_1node
(
    sourceIP String,
    destinationURL String,
    visitDate Date,
    adRevenue Float32,
    UserAgent String,
    cCode FixedString(3),
    lCode FixedString(6),
    searchWord String,
    duration UInt32
) ENGINE = MergeTree(visitDate, visitDate, 8192);

CREATE TABLE rankings_5nodes_on_single
(
    pageURL String,
    pageRank UInt32,
    avgDuration UInt32
) ENGINE = Log;

CREATE TABLE uservisits_5nodes_on_single
(
    sourceIP String,
    destinationURL String,
    visitDate Date,
    adRevenue Float32,
    UserAgent String,
    cCode FixedString(3),
    lCode FixedString(6),
    searchWord String,
    duration UInt32
) ENGINE = MergeTree(visitDate, visitDate, 8192);
```

Go back to the console:

``` bash
$ for i in tiny/rankings/*.deflate; do echo "$i"; zlib-flate -uncompress < "$i" | clickhouse-client --host=example-perftest01j --query="INSERT INTO rankings_tiny FORMAT CSV"; done
$ for i in tiny/uservisits/*.deflate; do echo "$i"; zlib-flate -uncompress < "$i" | clickhouse-client --host=example-perftest01j --query="INSERT INTO uservisits_tiny FORMAT CSV"; done
$ for i in 1node/rankings/*.deflate; do echo "$i"; zlib-flate -uncompress < "$i" | clickhouse-client --host=example-perftest01j --query="INSERT INTO rankings_1node FORMAT CSV"; done
$ for i in 1node/uservisits/*.deflate; do echo "$i"; zlib-flate -uncompress < "$i" | clickhouse-client --host=example-perftest01j --query="INSERT INTO uservisits_1node FORMAT CSV"; done
$ for i in 5nodes/rankings/*.deflate; do echo "$i"; zlib-flate -uncompress < "$i" | clickhouse-client --host=example-perftest01j --query="INSERT INTO rankings_5nodes_on_single FORMAT CSV"; done
$ for i in 5nodes/uservisits/*.deflate; do echo "$i"; zlib-flate -uncompress < "$i" | clickhouse-client --host=example-perftest01j --query="INSERT INTO uservisits_5nodes_on_single FORMAT CSV"; done
```
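
The six import loops follow one pattern: decompress each `.deflate` file and pipe it into an `INSERT ... FORMAT CSV` query. As a hedged sketch, that pattern could be wrapped in a function. The names `table_for`, `import_dataset`, and `CH_HOST` are hypothetical and not part of the original guide; the host defaults to the `example-perftest01j` placeholder used in the commands, and `zlib-flate` and `clickhouse-client` are assumed to be on `PATH`.

```bash
#!/usr/bin/env bash
# Assumption: the server host is configurable; the default is the placeholder
# host name from the original commands.
CH_HOST="${CH_HOST:-example-perftest01j}"

# Map a dataset directory and a kind (rankings or uservisits) to its table name.
# The 5nodes data is loaded into the *_5nodes_on_single tables.
table_for() {
    case "$1" in
        5nodes) echo "${2}_5nodes_on_single" ;;
        *)      echo "${2}_${1}" ;;
    esac
}

# Load every .deflate file of one dataset/kind pair into its table.
import_dataset() {
    local size="$1" kind="$2" table f
    table="$(table_for "$size" "$kind")"
    for f in "$size/$kind"/*.deflate; do
        [ -e "$f" ] || continue   # skip when the glob matches no files
        echo "$f"
        zlib-flate -uncompress < "$f" |
            clickhouse-client --host="$CH_HOST" --query="INSERT INTO ${table} FORMAT CSV"
    done
}

for size in tiny 1node 5nodes; do
    for kind in rankings uservisits; do
        import_dataset "$size" "$kind"
    done
done
```

Keeping the table-name mapping in its own function isolates the one irregular case (`5nodes` maps to `*_5nodes_on_single`) from the loading loop.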

Queries for obtaining data samples:

``` sql
SELECT pageURL, pageRank FROM rankings_1node WHERE pageRank > 1000

SELECT substring(sourceIP, 1, 8), sum(adRevenue) FROM uservisits_1node GROUP BY substring(sourceIP, 1, 8)

SELECT
    sourceIP,
    sum(adRevenue) AS totalRevenue,
    avg(pageRank) AS pageRank
FROM rankings_1node ALL INNER JOIN
(
    SELECT
        sourceIP,
        destinationURL AS pageURL,
        adRevenue
    FROM uservisits_1node
    WHERE (visitDate > '1980-01-01') AND (visitDate < '1980-04-01')
) USING pageURL
GROUP BY sourceIP
ORDER BY totalRevenue DESC
LIMIT 1
```

[Original article](https://clickhouse.tech/docs/en/getting_started/example_datasets/amplab_benchmark/) <!--hide-->