请参考如下
http://www.tpc.org/information/current_specifications.asp
https://github.com/digoal/pg_tpch
下载tpch
git clone https://github.com/digoal/pg_tpch.git
安装tpch
$ cd gp_tpch
需要在Makefile中设置几个参数:
################
## CHANGE NAME OF ANSI COMPILER HERE
################
CC = gcc
# Current values for DATABASE are: INFORMIX, DB2, TDAT (Teradata)
# SQLSERVER, SYBASE, ORACLE, VECTORWISE
# Current values for MACHINE are: ATT, DOS, HP, IBM, ICL, MVS,
# SGI, SUN, U2200, VMS, LINUX, WIN32
# Current values for WORKLOAD are: TPCH
DATABASE=ORACLE
MACHINE = LINUX
WORKLOAD = TPCH
安装
$make
输出dbgen用法:
$./dbgen -h
TPC-H Population Generator (Version 2.17.0 build 0)
Copyright Transaction Processing Performance Council 1994 - 2010
USAGE:
dbgen [-{vf}][-T {pcsoPSOL}]
[-s <scale>][-C <procs>][-S <step>]
dbgen [-v] [-O m] [-s <scale>] [-U <updates>]
Basic Options
===========================
-C <n> -- separate data set into <n> chunks (requires -S, default: 1)
-f -- force. Overwrite existing files
-h -- display this message
-q -- enable QUIET mode
-s <n> -- set Scale Factor (SF) to <n> (default: 1)
-S <n> -- build the <n>th step of the data/update set (used with -C or -U)
-U <n> -- generate <n> update sets
-v -- enable VERBOSE mode
Advanced Options
===========================
-b <s> -- load distributions for <s> (default: dists.dss)
-d <n> -- split deletes between <n> files (requires -U)
-i <n> -- split inserts between <n> files (requires -U)
-T c -- generate cutomers ONLY
-T l -- generate nation/region ONLY
-T L -- generate lineitem ONLY
-T n -- generate nation ONLY
-T o -- generate orders/lineitem ONLY
-T O -- generate orders ONLY
-T p -- generate parts/partsupp ONLY
-T P -- generate parts ONLY
-T r -- generate region ONLY
-T s -- generate suppliers ONLY
-T S -- generate partsupp ONLY
To generate the SF=1 (1GB), validation database population, use:
dbgen -vf -s 1
To generate updates for a SF=1 (1GB), use:
dbgen -v -U 1 -s 1
使用dbgen产生一些测试数据, -s 表示scale(单位为GB),根据需要测试的数据量指定:
$./dbgen -s 100 -f
$ll -h *.tbl
-rw-r--r-- 1 digoal users 12G Nov 19 13:35 customer.tbl
-rw-r--r-- 1 digoal users 6.6G Nov 19 13:39 lineitem.tbl
-rw-r--r-- 1 digoal users 2.2K Nov 19 12:48 nation.tbl
-rw-r--r-- 1 digoal users 1.5G Nov 19 13:39 orders.tbl
-rw-r--r-- 1 digoal users 1.2G Nov 19 12:48 partsupp.tbl
-rw-r--r-- 1 digoal users 233M Nov 19 12:48 part.tbl
-rw-r--r-- 1 digoal users 389 Nov 19 12:48 region.tbl
-rw-r--r-- 1 digoal users 686M Nov 19 13:30 supplier.tbl
生成的测试数据格式如下:
$head -n 10 customer.tbl
1|Customer#000000001|IVhzIApeRb ot,c,E|15|25-989-741-2988|711.56|BUILDING|to the even, regular platelets. regular, ironic epitaphs nag e|
2|Customer#000000002|XSTf4,NCwDVaWNe6tEgvwfmRchLXak|13|23-768-687-3665|121.65|AUTOMOBILE|l accounts. blithely ironic theodolites integrate boldly: caref|
3|Customer#000000003|MG9kdTD2WBHm|1|11-719-748-3364|7498.12|AUTOMOBILE| deposits eat slyly ironic, even instructions. express foxes detect slyly. blithely even accounts abov|
4|Customer#000000004|XxVSJsLAGtn|4|14-128-190-5944|2866.83|MACHINERY| requests. final, regular ideas sleep final accou|
5|Customer#000000005|KvpyuHCplrB84WgAiGV6sYpZq7Tj|3|13-750-942-6364|794.47|HOUSEHOLD|n accounts will have to unwind. foxes cajole accor|
6|Customer#000000006|sKZz0CsnMD7mp4Xd0YrBvx,LREYKUWAh yVn|20|30-114-968-4951|7638.57|AUTOMOBILE|tions. even deposits boost according to the slyly bold packages. final accounts cajole requests. furious|
7|Customer#000000007|TcGe5gaZNgVePxU5kRrvXBfkasDTea|18|28-190-982-9759|9561.95|AUTOMOBILE|ainst the ironic, express theodolites. express, even pinto beans among the exp|
8|Customer#000000008|I0B10bB0AymmC, 0PrRYBCP1yGJ8xcBPmWhl5|17|27-147-574-9335|6819.74|BUILDING|among the slyly regular theodolites kindle blithely courts. carefully even theodolites haggle slyly along the ide|
9|Customer#000000009|xKiAFTjUsCuxfeleNqefumTrjS|8|18-338-906-3675|8324.07|FURNITURE|r theodolites according to the requests wake thinly excuses: pending requests haggle furiousl|
10|Customer#000000010|6LrEaV6KR6PLVcgl2ArL Q3rqzLzcT1 v2|5|15-741-346-9870|2753.54|HOUSEHOLD|es regular deposits haggle. fur|
将测试数据转换为postgresql识别的格式,删除末尾的分隔符|。
$for i in `ls *.tbl`; do sed 's/|$//' $i > ${i/tbl/csv}; done
$ll -rth *.csv
-rw-r--r-- 1 digoal users 12G Dec 1 13:32 customer.csv
-rw-r--r-- 1 digoal users 6.5G Dec 1 13:33 lineitem.csv
-rw-r--r-- 1 digoal users 2.2K Dec 1 13:33 nation.csv
-rw-r--r-- 1 digoal users 1.5G Dec 1 13:34 orders.csv
-rw-r--r-- 1 digoal users 1.2G Dec 1 13:34 partsupp.csv
-rw-r--r-- 1 digoal users 231M Dec 1 13:34 part.csv
-rw-r--r-- 1 digoal users 384 Dec 1 13:34 region.csv
-rw-r--r-- 1 digoal users 681M Dec 1 13:34 supplier.csv
把包含csv文件的目录,软链接到/tmp/dss-data。
$ ln -s `pwd` /tmp/dss-data
生成测试SQL , SF= dbgen指定的scale值
SF=?
mkdir dss/queries
for q in `seq 1 22`
do
DSS_QUERY=dss/templates ./qgen -s $SF $q > dss/queries/$q.sql
sed 's/^select/explain select/' dss/queries/$q.sql > dss/queries/$q.explain.sql
done
在greenplum数据库中创建数据库和用户(也可以不创建,只要有就可以)
psql
postgres=# create role digoal login;
CREATE ROLE
postgres=# create database postgres template template0 owner digoal;
CREATE DATABASE
配置pg_hba.conf
$ vi $MASTER_DATA_DIRECTORY/pg_hba.conf
host all all 127.0.0.1/32 trust
$ gpstop -u
设置几个参数:
gpconfig -c enable_nestloop -v off
gpconfig -c work_mem -v 256MB
gpstop -u
测试,使用digoal用户连接到postgres数据库,结果输出到./results目录:
自动创建表,加载数据。详见tpch.sh脚本
$ ./tpch.sh ./results ip port tpch-db tpch-user password {row|column}
如果要测试redshift可以这样
Redshift (copy by ssh):
// add manifest file to s3 first
// manifest file must in $S3/${table}.manifest
$ ./tpch.sh ./results ip port tpch-db tpch-user password redshift S3 EC2_ID EC2_KEY
可以在results中查看一些信息,例如每个SQL的explain,时间。
结束后,可以使用以下方法生成CSV报告。
$ php process.php ./results output.csv
测试结果
2015-12-08 10:40:23 [1449542423] : running TPC-H benchmark
2015-12-08 10:40:23 [1449542423] : running queries defined in TPC-H benchmark
2015-12-08 10:40:23 [1449542423] : running query 1
2015-12-08 10:40:43 [1449542443] : query 1 finished OK (10 seconds)
2015-12-08 10:40:43 [1449542443] : running query 2
2015-12-08 10:40:47 [1449542447] : query 2 finished OK (2 seconds)
2015-12-08 10:40:47 [1449542447] : running query 3
2015-12-08 10:40:54 [1449542454] : query 3 finished OK (4 seconds)
2015-12-08 10:40:54 [1449542454] : running query 4
2015-12-08 10:40:56 [1449542456] : query 4 finished OK (1 seconds)
2015-12-08 10:40:56 [1449542456] : running query 5
2015-12-08 10:40:58 [1449542458] : query 5 finished OK (1 seconds)
2015-12-08 10:40:58 [1449542458] : running query 6
2015-12-08 10:40:59 [1449542459] : query 6 finished OK (1 seconds)
2015-12-08 10:40:59 [1449542459] : running query 7
2015-12-08 10:41:10 [1449542470] : query 7 finished OK (6 seconds)
2015-12-08 10:41:10 [1449542470] : running query 8
2015-12-08 10:41:18 [1449542478] : query 8 finished OK (4 seconds)
2015-12-08 10:41:18 [1449542478] : running query 9
2015-12-08 10:41:54 [1449542514] : query 9 finished OK (18 seconds)
2015-12-08 10:41:54 [1449542514] : running query 10
2015-12-08 10:41:55 [1449542515] : query 10 finished OK (1 seconds)
2015-12-08 10:41:55 [1449542515] : running query 11
2015-12-08 10:41:57 [1449542517] : query 11 finished OK (1 seconds)
2015-12-08 10:41:57 [1449542517] : running query 12
2015-12-08 10:41:59 [1449542519] : query 12 finished OK (1 seconds)
2015-12-08 10:41:59 [1449542519] : running query 13
2015-12-08 10:42:04 [1449542524] : query 13 finished OK (3 seconds)
2015-12-08 10:42:04 [1449542524] : running query 14
2015-12-08 10:42:05 [1449542525] : query 14 finished OK (1 seconds)
2015-12-08 10:42:05 [1449542525] : running query 15
2015-12-08 10:42:07 [1449542527] : query 15 finished OK (1 seconds)
2015-12-08 10:42:07 [1449542527] : running query 16
2015-12-08 10:42:10 [1449542530] : query 16 finished OK (2 seconds)
2015-12-08 10:42:10 [1449542530] : running query 17
2015-12-08 10:42:26 [1449542546] : query 17 finished OK (8 seconds)
2015-12-08 10:42:26 [1449542546] : running query 18
2015-12-08 10:42:46 [1449542566] : query 18 finished OK (10 seconds)
2015-12-08 10:42:46 [1449542566] : running query 19
2015-12-08 10:42:49 [1449542569] : query 19 finished OK (2 seconds)
2015-12-08 10:42:49 [1449542569] : running query 20
2015-12-08 10:42:51 [1449542571] : query 20 finished OK (1 seconds)
2015-12-08 10:42:51 [1449542571] : running query 21
2015-12-08 10:43:06 [1449542586] : query 21 finished OK (8 seconds)
2015-12-08 10:43:06 [1449542586] : running query 22
2015-12-08 10:43:10 [1449542590] : query 22 finished OK (2 seconds)
2015-12-08 10:43:10 [1449542590] : finished TPC-H benchmark