背景
业界有一些通用的数据库性能测试模型,可以用来测试硬件在不同测试模型下的性能表现。
参考
http://www.tpc.org/
https://github.com/oltpbenchmark/oltpbench
http://oltpbenchmark.com/
本文主要以PostgreSQL为例,向大家介绍一下,如何使用PostgreSQL来测试硬件的性能。
PostgreSQL 的功能非常的强大,所以可以适用于几乎所有的测试模型,同时用户还可以根据自己的应用场景设计测试模型。
前面已经介绍了单机多实例的测试方法。
本文介绍的是单机单实例的测试方法。
一、单机io测试
使用fio测试磁盘或块设备的IO能力。
1. 安装libaio库
# yum install -y libaio libaio-devel
# mkdir -p /data01/digoal
# chown dege.zz /data01/digoal
2. 安装fio
$ git clone https://github.com/axboe/fio
$ cd fio
$ ./configure --prefix=/home/digoal/fiohome
$ make -j 32
$ make install
$ export PATH=/home/digoal/fiohome/bin:$PATH
3. 测试顺序读写,随机读写8K数据块。
fio -filename=/data01/digoal/testdir -direct=1 -thread -rw=write -ioengine=libaio -bs=8K -size=16G -numjobs=128 -runtime=60 -group_reporting -name=mytest >/tmp/fio_write.log 2>&1
fio -filename=/data01/digoal/testdir -direct=1 -thread -rw=read -ioengine=libaio -bs=8K -size=16G -numjobs=128 -runtime=60 -group_reporting -name=mytest >/tmp/fio_read.log 2>&1
fio -filename=/data01/digoal/testdir -direct=1 -thread -rw=randwrite -ioengine=libaio -bs=8K -size=16G -numjobs=128 -runtime=60 -group_reporting -name=mytest >/tmp/fio_randwrite.log 2>&1
fio -filename=/data01/digoal/testdir -direct=1 -thread -rw=randread -ioengine=libaio -bs=8K -size=16G -numjobs=128 -runtime=60 -group_reporting -name=mytest >/tmp/fio_randread.log 2>&1
测试数据关注
bps(MB/S)
lat(min,us)
lat(max,us)
lat(avg,us)
lat(stddev,us)
二、单机数据库准备
测试单实例,所以不使用cgroup
1. 配置环境变量
$ vi ~/envpg.sh
export PS1="$USER@`/bin/hostname -s`-> "
export PGPORT=1921
export PGDATA=/data02/digoal/pg_root_single
export LANG=en_US.utf8
export PGHOME=/home/digoal/pgsql9.6rc1
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib:$LD_LIBRARY_PATH
export DATE=`date +"%Y%m%d%H%M"`
export PATH=/home/fiohome/bin:$PGHOME/bin:$PATH:.
export MANPATH=/home/fiohome/man:$PGHOME/share/man:$MANPATH
export PGHOST=$PGDATA
export PGUSER=postgres
export PGDATABASE=postgres
alias rm='rm -i'
alias ll='ls -lh'
unalias vi
. ~/envpg.sh
2. 初始化数据库集群
$ initdb -D $PGDATA -U postgres -E UTF8 --locale=C -X /data01/digoal/pg_xlog_single
3. 配置数据库
$ cd $PGDATA
$ vi postgresql.conf
listen_addresses = '0.0.0.0' # what IP address(es) to listen on;
port = 1921 # (change requires restart)
max_connections = 300 # (change requires restart)
unix_socket_directories = '.' # comma-separated list of directories
shared_buffers = 96GB # min 128kB
maintenance_work_mem = 2GB # min 1MB
autovacuum_work_mem = 2GB # min 1MB, or -1 to use maintenance_work_mem
dynamic_shared_memory_type = posix # the default is the first option
bgwriter_delay = 10ms # 10-10000ms between rounds
bgwriter_lru_maxpages = 1000 # 0-1000 max buffers written/round
bgwriter_lru_multiplier = 10.0 # 0-10.0 multiplier on buffers scanned/round
wal_buffers = 128MB # min 32kB, -1 sets based on shared_buffers
wal_writer_flush_after = 0 # 0 disables
checkpoint_timeout = 55min # range 30s-1h
max_wal_size = 192GB
checkpoint_completion_target = 0.0 # checkpoint target duration, 0.0 - 1.0
random_page_cost = 1.0 # same scale as above
effective_cache_size = 480GB
constraint_exclusion = on # on, off, or partition
log_destination = 'csvlog' # Valid values are combinations of
logging_collector = on # Enable capturing of stderr and csvlog
log_truncate_on_rotation = on # If on, an existing log file with the
log_checkpoints = on
log_connections = on
log_disconnections = on
log_error_verbosity = verbose # terse, default, or verbose messages
log_timezone = 'PRC'
autovacuum_vacuum_scale_factor = 0.002 # fraction of table size before vacuum
datestyle = 'iso, mdy'
timezone = 'PRC'
lc_messages = 'C' # locale for system error message
lc_monetary = 'C' # locale for monetary formatting
lc_numeric = 'C' # locale for number formatting
lc_time = 'C' # locale for time formatting
default_text_search_config = 'pg_catalog.english'
synchronous_commit=on
full_page_writes=on
autovacuum_naptime=10s
autovacuum_max_workers=16
$ vi pg_hba.conf
host all all 0.0.0.0/0 trust
4. 启动数据库
$ pg_ctl start
三、单机数据库测试 - TPC-B
1. 初始化TPC-B数据
tpc-b (40亿记录)
准备数据
$ pgbench -i -s 40000
2. 持续测试15天(在客户端机器启动测试,假设客户端POSTGRESQL已部署)
$ vi test.sh
date +%F%T >/tmp/single.log
pgbench -M prepared -n -r -P 1 -c 128 -j 128 -T 1296000 -h 目标机器IP -p 数据库端口 -U postgres postgres >>/tmp/single.log 2>&1
date +%F%T >>/tmp/single.log
$ chmod 700 test.sh
$ nohup ./test.sh >/dev/null 2>&1 &
3. 测试结果
$ cat /tmp/single.log
$ head -n 30000 /tmp/single.log |tail -n 7200 > /tmp/1
$ cat /tmp/1|awk '{print $4 "," $7 "," $10}' >/tmp/2
输出TPS,RT,标准差。
TPS表示数据库视角的事务处理能力(也就是单个测试脚本的每秒调用次数)。
RT表示响应时间。
标准差可以用来表示抖动,通常应该在1以内(越大,说明抖动越厉害)。
4. 主机性能结果
$ sar -f ....
四、单机数据库测试 - 定制测试1
测试分区表upsert 即 insert on conflict
安装分区表插件
$ git clone https://github.com/postgrespro/pg_pathman
$ cd pg_pathman
$ export PATH=/home/digoal/pgsql9.6rc1/bin:$PATH
$ make USE_PGXS=1
$ make install USE_PGXS=1
$ vi $PGDATA/postgresql.conf
shared_preload_libraries='pg_pathman'
$ pg_ctl restart -m fast
$ psql
postgres=# create extension pg_pathman;
CREATE EXTENSION
1. 定制测试脚本
例如测试insert on conflict, 20亿分区表,单表2000万。
$ psql
postgres=# drop table test;
postgres=# create table test(id int primary key, info text, crt_time timestamptz);
CREATE TABLE
postgres=#
postgres=# select create_range_partitions('test'::regclass, 'id', 1, 20000000, 100, false);
create_range_partitions
-------------------------
100
(1 row)
postgres=# select disable_parent('test'::regclass);
disable_parent
----------------
(1 row)
reconnect
postgres=# explain select * from test where id=1;
QUERY PLAN
---------------------------------------------------------------------------------
Append (cost=0.15..2.17 rows=1 width=44)
-> Index Scan using test_1_pkey on test_1 (cost=0.15..2.17 rows=1 width=44)
Index Cond: (id = 1)
(3 rows)
新建的表如下
postgres=# \d+ test_1
Table "postgres.test_1"
Column | Type | Modifiers | Storage | Stats target | Description
----------+-----------------------------+-----------+----------+--------------+-------------
id | integer | not null | plain | |
info | text | | extended | |
crt_time | timestamp without time zone | | plain | |
Indexes:
"test_1_pkey" PRIMARY KEY, btree (id)
Check constraints:
"pathman_test_1_1_check" CHECK (id >= 1 AND id < 20000001)
Inherits: test
postgres=# \d+ test_100
Table "postgres.test_100"
Column | Type | Modifiers | Storage | Stats target | Description
----------+-----------------------------+-----------+----------+--------------+-------------
id | integer | not null | plain | |
info | text | | extended | |
crt_time | timestamp without time zone | | plain | |
Indexes:
"test_100_pkey" PRIMARY KEY, btree (id)
Check constraints:
"pathman_test_100_1_check" CHECK (id >= 1980000001 AND id < 2000000001)
Inherits: test
目前9.6在分区表执行on conflict有个BUG,已提交给社区。
insert into test(id,info,crt_time) values(:id, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
先使用function代替
create or replace function upsert_test(int,text,timestamptz) returns void as $$
declare
begin
update test set info=$2,crt_time=$3 where id=$1;
if not found then
insert into test(id,info,crt_time) values ($1,$2,$3);
end if;
exception when others then
return;
end;
$$ language plpgsql strict;
pgbench脚本
\set id random(1,2000000000)
select upsert_test(:id,md5(random()::text),now());
2. 测试
$ vi test.sh
date +%F%T >/tmp/single.log
pgbench -M prepared -f test.sql -n -r -P 1 -c 128 -j 128 -T 1296000 -h 目标机器IP -p 数据库端口 -U postgres postgres >>/tmp/single.log 2>&1
date +%F%T >>/tmp/single.log
$ chmod 700 test.sh
$ nohup ./test.sh >/dev/null 2>&1 &
五、单机数据库测试 - 定制测试2
测试单表upsert 即 insert on conflict
1. 定制测试脚本
例如测试insert on conflict, 单表20亿。
$ psql
postgres=# drop table test cascade;
postgres=# create table test(id int primary key, info text, crt_time timestamptz);
CREATE TABLE
pgbench脚本
vi test.sql
\set id random(1,2000000000)
insert into test(id,info,crt_time) values(:id, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
2. 测试
$ vi test.sh
date +%F%T >/tmp/single.log
pgbench -M prepared -f test.sql -n -r -P 1 -c 128 -j 128 -T 1296000 -h 目标机器IP -p 数据库端口 -U postgres postgres >>/tmp/single.log 2>&1
date +%F%T >>/tmp/single.log
$ chmod 700 test.sh
$ nohup ./test.sh >/dev/null 2>&1 &