注意这里所有的对比都是针对的PostgreSQL行存储, 没有使用cstore_fdw插件, 如果要在PG中使用列存储, 可以考虑一下.
参考 :
本文对比一下MonetDB和PostgreSQL定长字段, 宽表的查询性能和导入性能.
篇幅太长, 测试环境和测试方法见以下BLOG :
本文所述为PostgreSQL的性能, 因为目前PostgreSQL一条SQL只会用到1个核, 所以查询性能比MonetDB更差.
导入性能相比MonetDB也差3倍以上.
关联查询的性能差10倍以上.
COPY导入, 比MonetDB相差3倍以上(MonetDB约20秒导入一个文件).
INSERT INTO TBL SELECT * FROM TBL性能, 比MonetDB差8倍以上.(MonetDB约30秒导入一个文件)
不计索引, 容量比MonetDB大20%. (MonetDB占用11.7G一个表)
查询性能结果, 性能比MonetDB差 10倍-200倍.
接下来的话将对比一下PostgreSQL使用cstore_fdw和MonetDB的性能.
#!/bin/bash
psql <<EOF
\timing
copy t1 from '/data01/postgres/1.sql' with csv delimiter '|';
copy t1 from '/data01/postgres/2.sql' with csv delimiter '|';
copy t1 from '/data01/postgres/3.sql' with csv delimiter '|';
copy t1 from '/data01/postgres/4.sql' with csv delimiter '|';
copy t1 from '/data01/postgres/5.sql' with csv delimiter '|';
copy t1 from '/data01/postgres/6.sql' with csv delimiter '|';
copy t1 from '/data01/postgres/7.sql' with csv delimiter '|';
copy t1 from '/data01/postgres/8.sql' with csv delimiter '|';
copy t1 from '/data01/postgres/9.sql' with csv delimiter '|';
copy t1 from '/data01/postgres/10.sql' with csv delimiter '|';
copy t1 from '/data01/postgres/11.sql' with csv delimiter '|';
copy t1 from '/data01/postgres/12.sql' with csv delimiter '|';
copy t1 from '/data01/postgres/13.sql' with csv delimiter '|';
copy t1 from '/data01/postgres/14.sql' with csv delimiter '|';
copy t1 from '/data01/postgres/15.sql' with csv delimiter '|';
copy t1 from '/data01/postgres/16.sql' with csv delimiter '|';
insert into t2 select * from t1;
insert into t3 select * from t1;
insert into t4 select * from t1;
insert into t5 select * from t1;
insert into t6 select * from t1;
insert into t7 select * from t1;
insert into t8 select * from t1;
insert into t9 select * from t1;
insert into t10 select * from t1;
insert into t11 select * from t1;
insert into t12 select * from t1;
insert into t13 select * from t1;
insert into t14 select * from t1;
insert into t15 select * from t1;
insert into t16 select * from t1;
insert into t17 select * from t1;
\q
EOF
COPY 3125000
Time: 68897.498 ms
COPY 3125000
Time: 69565.301 ms
COPY 3125000
Time: 68971.917 ms
COPY 3125000
Time: 69336.123 ms
COPY 3125000
Time: 69610.727 ms
COPY 3125000
Time: 69873.498 ms
COPY 3125000
Time: 69081.138 ms
COPY 3125000
Time: 69257.640 ms
COPY 3125000
Time: 69753.672 ms
COPY 3125000
Time: 69835.120 ms
COPY 3125000
Time: 69432.020 ms
COPY 3125000
Time: 70308.638 ms
COPY 3125000
Time: 69506.058 ms
COPY 3125000
Time: 69737.094 ms
COPY 3125000
Time: 70411.739 ms
COPY 3125000
Time: 70517.957 ms
INSERT INTO TBL SELECT * FROM TBL性能, 比MonetDB差8倍以上.(MonetDB约30秒导入一个文件)
INSERT 0 50000000
Time: 289370.747 ms
INSERT 0 50000000
Time: 257491.025 ms
INSERT 0 50000000
Time: 257894.757 ms
INSERT 0 50000000
Time: 259908.577 ms
INSERT 0 50000000
Time: 260986.957 ms
INSERT 0 50000000
Time: 261555.060 ms
INSERT 0 50000000
Time: 256580.829 ms
INSERT 0 50000000
Time: 257066.953 ms
INSERT 0 50000000
Time: 259537.231 ms
INSERT 0 50000000
Time: 257184.911 ms
INSERT 0 50000000
Time: 260310.793 ms
INSERT 0 50000000
Time: 261124.638 ms
INSERT 0 50000000
Time: 261805.074 ms
INSERT 0 50000000
Time: 270812.470 ms
INSERT 0 50000000
Time: 274241.552 ms
不计索引, 容量比MonetDB大20%. (MonetDB占用11.7G一个表)
postgres=# \dt+
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------+-------+----------+---------+-------------
public | t1 | table | postgres | 13 GB |
public | t10 | table | postgres | 13 GB |
public | t11 | table | postgres | 13 GB |
public | t12 | table | postgres | 13 GB |
public | t13 | table | postgres | 13 GB |
public | t14 | table | postgres | 13 GB |
public | t15 | table | postgres | 13 GB |
public | t16 | table | postgres | 13 GB |
public | t2 | table | postgres | 13 GB |
public | t3 | table | postgres | 13 GB |
public | t4 | table | postgres | 13 GB |
public | t5 | table | postgres | 13 GB |
public | t6 | table | postgres | 13 GB |
public | t7 | table | postgres | 13 GB |
public | t8 | table | postgres | 13 GB |
public | t9 | table | postgres | 13 GB |
(17 rows)
查询性能结果, 性能比MonetDB差 10倍-200倍.
postgres=# select count(distinct c1) from t1;
count
----------
50000000
(1 row)
Time: 60618.999 ms -- MonetDB 1800毫秒, 相差34倍
postgres=# select count(distinct c1) from t2;
count
----------
50000000
(1 row)
Time: 58741.355 ms -- MonetDB 1800毫秒, 相差33倍
执行计划一致
postgres=# explain select count(distinct c1) from t2;
QUERY PLAN
---------------------------------------------------------------------
Aggregate (cost=2291667.10..2291667.11 rows=1 width=4)
-> Seq Scan on t2 (cost=0.00..2166667.08 rows=50000008 width=4)
(2 rows)
Time: 0.648 ms
postgres=# explain select count(distinct c1) from t1;
QUERY PLAN
---------------------------------------------------------------------
Aggregate (cost=2153595.00..2153595.01 rows=1 width=4)
-> Seq Scan on t1 (cost=0.00..2056209.40 rows=38954240 width=4)
(2 rows)
Time: 0.583 ms
postgres=# select count(*) from (select c1,c11,c21,c31,c41,c51,c60 from t1 group by c1,c11,c21,c31,c41,c51,c60) as t;
count
----------
50000000
(1 row)
Time: 112570.233 ms -- MonetDB 2300毫秒, 相差49倍
postgres=# select count(*) from (select t1.c1 from t1 join t2 on t1.c2=t2.c3) as t;
count
----------
49996801
(1 row)
Time: 328024.604 ms -- MonetDB 27500毫秒, 相差12倍
postgres=# select count(*) from (select t1.c1 from t1 join t2 on t1.c2=t2.c3 join t3 on t1.c3=t3.c5) as t;
count
----------
49995545
(1 row)
Time: 736429.344 ms -- MonetDB 72000毫秒, 相差10倍
postgres=# select count(*) from (select t1.c1 from t1 join t2 on t1.c2=t2.c3 join t3 on t1.c3=t3.c5 join t4 on t3.c5=t4.c6) as t;
count
----------
50001556
(1 row)
Time: 923087.347 ms -- MonetDB 103000毫秒, 相差9倍
select count(*) from (
select t1.c1 from t1
join t2 on t1.c2=t2.c3
join t3 on t2.c3=t3.c4
join t4 on t3.c4=t4.c5
join t5 on t4.c5=t5.c6
) as t;
count
----------
50024539
(1 row)
Time: 980814.633 ms -- MonetDB 125000毫秒, 相差8倍
select count(*) from (
select t1.c1 from t1
join t2 on t1.c2=t2.c3
join t3 on t2.c3=t3.c4
join t4 on t3.c4=t4.c5
join t5 on t4.c5=t5.c6
where t1.c60 < 100000
) as t;
count
-------
98744
(1 row)
Time: 212111.783 ms -- MonetDB 12800毫秒, 相差17倍
因为空间不够了, 所以t17表没有用到, 所以PostgreSQL只关联了16个表.
select count(*) from (
select t1.c1 from t1
join t2 on t1.c2=t2.c3
join t3 on t2.c3=t3.c4
join t4 on t3.c4=t4.c5
join t5 on t4.c5=t5.c6
join t6 on t5.c5=t6.c6
join t7 on t6.c5=t7.c6
join t8 on t7.c5=t8.c6
join t9 on t8.c5=t9.c6
join t10 on t9.c5=t10.c6
join t11 on t10.c5=t11.c6
join t12 on t11.c5=t12.c6
join t13 on t12.c5=t13.c6
join t14 on t13.c5=t14.c6
join t15 on t14.c5=t15.c6
join t16 on t15.c5=t16.c6
) as t;
count
----------
50304982
(1 row)
Time: 5306055.745 ms -- MonetDB 310000毫秒, 相差17倍
select count(*) from (
select t1.c1 from t1
join t2 on t1.c2=t2.c3
join t3 on t2.c3=t3.c4
join t4 on t3.c4=t4.c5
join t5 on t4.c5=t5.c6
join t6 on t5.c5=t6.c6
join t7 on t6.c5=t7.c6
join t8 on t7.c5=t8.c6
join t9 on t8.c5=t9.c6
join t10 on t9.c5=t10.c6
join t11 on t10.c5=t11.c6
join t12 on t11.c5=t12.c6
join t13 on t12.c5=t13.c6
join t14 on t13.c5=t14.c6
join t15 on t14.c5=t15.c6
join t16 on t15.c5=t16.c6
where t1.c60 < 100000
) as t;
count
-------
93771
(1 row)
Time: 2198719.164 ms -- MonetDB 45600毫秒, 相差48倍
select count(*) from (
select t1.c1 from t1
join t2 on t1.c1=t2.c3
join t3 on t2.c3=t3.c4
join t4 on t3.c4=t4.c5
join t5 on t4.c5=t5.c6
join t6 on t5.c5=t6.c6
join t7 on t6.c5=t7.c6
join t8 on t7.c5=t8.c6
join t9 on t8.c5=t9.c6
join t10 on t9.c5=t10.c6
join t11 on t10.c5=t11.c6
join t12 on t11.c5=t12.c6
join t13 on t12.c5=t13.c6
join t14 on t13.c5=t14.c6
join t15 on t14.c5=t15.c6
join t16 on t15.c5=t16.c6
) as t;
count
----------
50183000
(1 row)
Time: 3980422.581 ms -- MonetDB 262000毫秒, 相差15倍
接下来的话将对比一下PostgreSQL使用cstore_fdw和MonetDB的性能.
[参考]