MonetDB bulk load performance & fixed length type performance & JOIN performance

本文涉及的产品
云原生数据库 PolarDB MySQL 版,Serverless 5000PCU 100GB
简介:
之前写过一篇关于PostgreSQL使用UUID作为PK, 造成写入性能很差的分析文章, 有兴趣的童鞋可参考如下 : 

本文主要测试一下几个因素对导入速度的影响.
1. 索引列无序导入
2. 索引列顺序导入
3. 无索引无序导入(指某列无序)
4. 无索引顺序导入 (指某列顺序)
测试数据使用的是前面一篇BLOG的测试数据, 60个字段, 第一个字段是INT, 主键, 其他字段为字符串, 全随机, 字符串没有索引.

测试结果
1. 索引列无序导入 ( 以16为步长的数据. 单个文件顺序连续, 多个文件不连续, 非流式数据 )
sql>copy 3125000 records into b from '/opt/testsql/1.sql';
3125000 affected row (43.3s)
sql>copy 3125000 records into b from '/opt/testsql/2.sql';
3125000 affected row (2m 19s)
sql>copy 3125000 records into b from '/opt/testsql/3.sql';
3125000 affected row (2m 56s)
sql>copy 3125000 records into b from '/opt/testsql/4.sql';
3125000 affected row (3m 11s)
sql>copy 3125000 records into b from '/opt/testsql/5.sql';
3125000 affected row (3m 18s)
sql>copy 3125000 records into b from '/opt/testsql/6.sql';
3125000 affected row (3m 27s)
sql>copy 3125000 records into b from '/opt/testsql/7.sql';
3125000 affected row (3m 22s)
sql>copy 3125000 records into b from '/opt/testsql/8.sql';
3125000 affected row (3m 15s)
sql>copy 3125000 records into b from '/opt/testsql/9.sql';
3125000 affected row (3m 11s)
sql>copy 3125000 records into b from '/opt/testsql/10.sql';
3125000 affected row (3m 10s)
sql>copy 3125000 records into b from '/opt/testsql/11.sql';
3125000 affected row (3m 13s)
sql>copy 3125000 records into b from '/opt/testsql/12.sql';
3125000 affected row (3m 14s)
sql>copy 3125000 records into b from '/opt/testsql/13.sql';
3125000 affected row (3m 15s)
sql>copy 3125000 records into b from '/opt/testsql/14.sql';
3125000 affected row (3m 20s)
sql>copy 3125000 records into b from '/opt/testsql/15.sql';
3125000 affected row (3m 38s)
sql>copy 3125000 records into b from '/opt/testsql/16.sql';
3125000 affected row (3m 38s)

PostgreSQL 每个导入都是230秒左右, 表里面有数据和没有数据导入性能都一样.
但是MonetDB不一样, 当表为空时, 导入速度很快, 有数据时导入速度变慢. 但是导入速度和PG比还是略快.

2. 索引列顺序导入 (需要重新生成测试数据, 参考 http://blog.163.com/digoal@126/blog/static/1638770402014714104326879/, 修改generate_series(1,50000000,16)至 generate_series(1,3125000 ), generate_series(6250000, 6250001 ), ...  generate_series(4687500 1, 50000000 )
导入性能
sql>create table c (like b);
sql>alter table c add constraint pk_c_c1 primary key (c1);
sql>copy 3125000 records into c from '/opt/testsql1/1.sql';
3125000 affected row (1m 16s)
sql>copy 3125000 records into c from '/opt/testsql1/2.sql';
3125000 affected row (3m 3s)
sql>copy 3125000 records into c from '/opt/testsql1/3.sql';
3125000 affected row (3m 3s)
sql>copy 3125000 records into c from '/opt/testsql1/4.sql';
3125000 affected row (3m 16s)
sql>copy 3125000 records into c from '/opt/testsql1/5.sql';
3125000 affected row (3m 18s)
...

导入速度和PK顺序没有关系.

3. 无索引无序导入(指某列无序)
性能同上

4. 无索引顺序导入 (指某列顺序)
性能同上

说明性能问题和PK没什么关系, 和值分布也没什么关系, 主要还是在字符串上.
字符串存储用到了heap, 而定长的int型没有用到heap, 本文用到的测试数据, 在空间比字符串节约12倍.

接下来我们使用数字类型导入看看.
与PostgreSQL行存储的性能对比参考
使用同样的方法产生随机整型数据, 即60个字段全部使用整型 :
psql -c 'copy (
select generate_series(1,3125000) as c1,
trunc(50000000*random()) as c2,
trunc(50000000*random()) as c3,
trunc(50000000*random()) as c4,
trunc(50000000*random()) as c5,
trunc(50000000*random()) as c6,
trunc(50000000*random()) as c7,
trunc(50000000*random()) as c8,
trunc(50000000*random()) as c9,
trunc(50000000*random()) as c10,
trunc(50000000*random()) as c11,
trunc(50000000*random()) as c12,
trunc(50000000*random()) as c13,
trunc(50000000*random()) as c14,
trunc(50000000*random()) as c15,
trunc(50000000*random()) as c16,
trunc(50000000*random()) as c17,
trunc(50000000*random()) as c18,
trunc(50000000*random()) as c19,
trunc(50000000*random()) as c20,
trunc(50000000*random()) as c21,
trunc(50000000*random()) as c22,
trunc(50000000*random()) as c23,
trunc(50000000*random()) as c24,
trunc(50000000*random()) as c25,
trunc(50000000*random()) as c26,
trunc(50000000*random()) as c27,
trunc(50000000*random()) as c28,
trunc(50000000*random()) as c29,
trunc(50000000*random()) as c30,
trunc(50000000*random()) as c31,
trunc(50000000*random()) as c32,
trunc(50000000*random()) as c33,
trunc(50000000*random()) as c34,
trunc(50000000*random()) as c35,
trunc(50000000*random()) as c36,
trunc(50000000*random()) as c37,
trunc(50000000*random()) as c38,
trunc(50000000*random()) as c39,
trunc(50000000*random()) as c40,
trunc(50000000*random()) as c41,
trunc(50000000*random()) as c42,
trunc(50000000*random()) as c43,
trunc(50000000*random()) as c44,
trunc(50000000*random()) as c45,
trunc(50000000*random()) as c46,
trunc(50000000*random()) as c47,
trunc(50000000*random()) as c48,
trunc(50000000*random()) as c49,
trunc(50000000*random()) as c50,
trunc(50000000*random()) as c51,
trunc(50000000*random()) as c52,
trunc(50000000*random()) as c53,
trunc(50000000*random()) as c54,
trunc(50000000*random()) as c55,
trunc(50000000*random()) as c56,
trunc(50000000*random()) as c57,
trunc(50000000*random()) as c58,
trunc(50000000*random()) as c59,
trunc(50000000*random()) as c60
) to $$/data01/postgres/1.sql$$ with 
(format csv, 
delimiter $$|$$, 
quote $$"$$
)'

.... 一共16个文件. 其中c1字段唯一.
MonetDB表结构.
create table t1(c1 int primary key,
c2 int,
c3 int,
c4 int,
c5 int,
c6 int,
c7 int,
c8 int,
c9 int,
c10 int,
c11 int,
c12 int,
c13 int,
c14 int,
c15 int,
c16 int,
c17 int,
c18 int,
c19 int,
c20 int,
c21 int,
c22 int,
c23 int,
c24 int,
c25 int,
c26 int,
c27 int,
c28 int,
c29 int,
c30 int,
c31 int,
c32 int,
c33 int,
c34 int,
c35 int,
c36 int,
c37 int,
c38 int,
c39 int,
c40 int,
c41 int,
c42 int,
c43 int,
c44 int,
c45 int,
c46 int,
c47 int,
c48 int,
c49 int,
c50 int,
c51 int,
c52 int,
c53 int,
c54 int,
c55 int,
c56 int,
c57 int,
c58 int,
c59 int,
c60 int);

导入速度, 相比字符串表快很多. 几乎差10倍的时间.
sql>copy 3125000 records into t1 from '/opt/testsql2/1.sql';
3125000 affected row (11.7s)
sql>copy 3125000 records into t1 from '/opt/testsql2/2.sql';
3125000 affected row (16.2s)
sql>copy 3125000 records into t1 from '/opt/testsql2/3.sql';
3125000 affected row (16.9s)
sql>copy 3125000 records into t1 from '/opt/testsql2/4.sql';
3125000 affected row (24.6s)
sql>copy 3125000 records into t1 from '/opt/testsql2/5.sql';
3125000 affected row (32.4s)
sql>copy 3125000 records into t1 from '/opt/testsql2/6.sql';
3125000 affected row (21.6s)
...
略


存储空间 :  int字段heapsize=0, 所以占用空间小了很多. 差12倍.
sql>select * from "sys"."storage" where "table"='t1';
+--------+-------+------------+------+----------+----------+-----------+------------+----------+---------+--------+
| schema | table | column     | type | location | count    | typewidth | columnsize | heapsize | indices | sorted |
+========+=======+============+======+==========+==========+===========+============+==========+=========+========+
| sys    | t1    | c1         | int  | 10/1043  | 50000000 |         4 |  200000000 |        0 |       0 | true   |
| sys    | t1    | c2         | int  | 33/3314  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c3         | int  | 05/524   | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c4         | int  | 06/627   | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c5         | int  | 07/730   | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c6         | int  | 10/1052  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c7         | int  | 11/1176  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c8         | int  | 12/1277  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c9         | int  | 14/1400  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c10        | int  | 15/1565  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c11        | int  | 16/1665  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c12        | int  | 17/1765  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c13        | int  | 20/2065  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c14        | int  | 21/2165  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c15        | int  | 22/2265  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c16        | int  | 23/2365  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c17        | int  | 24/2465  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c18        | int  | 25/2565  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c19        | int  | 26/2665  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c20        | int  | 27/2765  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c21        | int  | 35/3556  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c22        | int  | 35/3557  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c23        | int  | 35/3560  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c24        | int  | 35/3561  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c25        | int  | 35/3562  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c26        | int  | 35/3563  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c27        | int  | 35/3564  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c28        | int  | 35/3565  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c29        | int  | 35/3566  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c30        | int  | 35/3567  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c31        | int  | 35/3570  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c32        | int  | 35/3571  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c33        | int  | 35/3572  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c34        | int  | 35/3573  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c35        | int  | 35/3574  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c36        | int  | 35/3575  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c37        | int  | 35/3576  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c38        | int  | 35/3577  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c39        | int  | 36/3600  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c40        | int  | 36/3601  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c41        | int  | 36/3602  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c42        | int  | 36/3603  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c43        | int  | 36/3604  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c44        | int  | 36/3605  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c45        | int  | 36/3606  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c46        | int  | 36/3607  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c47        | int  | 36/3610  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c48        | int  | 36/3611  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c49        | int  | 36/3612  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c50        | int  | 36/3613  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c51        | int  | 36/3614  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c52        | int  | 36/3615  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c53        | int  | 36/3616  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c54        | int  | 36/3617  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c55        | int  | 36/3620  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c56        | int  | 36/3621  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c57        | int  | 36/3622  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c58        | int  | 36/3623  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c59        | int  | 36/3624  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | c60        | int  | 36/3625  | 50000000 |         4 |  200000000 |        0 |       0 | false  |
| sys    | t1    | t1_c1_pkey | oid  | 34/3427  |        0 |         8 |          0 |        0 |       0 | true   |
+--------+-------+------------+------+----------+----------+-----------+------------+----------+---------+--------+
61 tuples (5.930ms)


创建16个表, 并将T1表的数据拷贝到其他16个表. 可以看到拷贝速度很快.
带PK约束对导入速度没有太大影响
sql>create table t2 (like t1);
operation successful (15.199ms)
sql>insert into t2 select * from t1;
50000000 affected row (29.7s)
sql>create table t3 (like t1);
operation successful (24.444ms)
sql>insert into t3 select * from t1;
50000000 affected row (29.2s)
sql>create table t4 (like t1);
operation successful (24.291ms)
sql>insert into t4 select * from t1;
50000000 affected row (32.5s)
sql>create table t5 (like t1);
operation successful (17.139ms)
sql>insert into t5 select * from t1;
50000000 affected row (28.0s)
sql>create table t6 (like t1);
operation successful (17.419ms)
sql>insert into t6 select * from t1;
50000000 affected row (26.1s)
sql>create table t7 (like t1);
operation successful (19.314ms)
sql>insert into t7 select * from t1;
50000000 affected row (41.8s)
sql>create table t8 (like t1);
operation successful (23.185ms)
sql>insert into t8 select * from t1;
50000000 affected row (51.2s)
sql>create table t9 (like t1);
operation successful (17.931ms)
sql>insert into t9 select * from t1;
50000000 affected row (32.8s)
sql>create table t10 (like t1);
operation successful (19.037ms)
sql>insert into t10 select * from t1;
50000000 affected row (31.4s)
sql>create table t11 (like t1);
operation successful (27.617ms)
sql>insert into t11 select * from t1;
50000000 affected row (31.9s)
sql>create table t12 (like t1);
operation successful (17.489ms)
sql>insert into t12 select * from t1;
50000000 affected row (31.9s)
sql>create table t13 (like t1);
operation successful (17.383ms)
sql>insert into t13 select * from t1;
50000000 affected row (51.2s)
sql>create table t14 (like t1);
operation successful (18.150ms)
sql>insert into t14 select * from t1;
50000000 affected row (33.2s)
sql>create table t15 (like t1);
operation successful (34.140ms)
sql>insert into t15 select * from t1;
50000000 affected row (32.1s)
sql>create table t16 (like t1);
operation successful (18.520ms)
sql>insert into t16 select * from t1;
50000000 affected row (31.4s)

-- 带PK的表导入
sql>create table t17 (like t1);
operation successful (22.474ms)
sql>alter table t17 add constraint pk_t17_c1 primary key(c1);
operation successful (19.235ms)
sql>insert into t17 select * from t1;
50000000 affected row (35.0s)


查询效率 : 
有PK和没有PK的count(distinct pk)查询效率一样. 使用INT类型性能提升明显.
以下查询在文章开头使用的59个字符串字段的表上需要50秒以上.
sql>select count(distinct c1) from t1;
+----------+
| L1       |
+==========+
| 50000000 |
+----------+
1 tuple (1.8s)
sql>select count(distinct c1) from t2;
+----------+
| L1       |
+==========+
| 50000000 |
+----------+
1 tuple (1.8s)
查看执行计划, 发现一样. 力量上t1.c1是pk, 应该不需要全表扫描. index only scan更快.
sql>explain select count(distinct c1) from t1;
sql>explain select count(distinct c1) from t2;


分组查询效率, 以下查询 在59个字符串字段的表上需要54秒.  如果纯字符串分组查询则需要200秒以上.
sql>select count(*) from (select c1,c11,c21,c31,c41,c51,c60 from t1 group by c1,c11,c21,c31,c41,c51,c60) as t;
+----------+
| L1       |
+==========+
| 50000000 |
+----------+
1 tuple (2.3s)


join的性能,
2个表join, 不使用PK作为关联列
sql>select count(*) from (select t1.c1 from t1 join t2 on t1.c2=t2.c3) as t;
+----------+
| L1       |
+==========+
| 49996801 |
+----------+
1 tuple (27.5s)

3个表join , 不使用PK作为关联列
sql>select count(*) from (select t1.c1 from t1 join t2 on t1.c2=t2.c3 join t3 on t1.c3=t3.c5) as t;
+----------+
| L1       |
+==========+
| 49995545 |
+----------+
1 tuple (1m 12s)
4个表join, 不使用PK作为关联列
sql>select count(*) from (select t1.c1 from t1 join t2 on t1.c2=t2.c3 join t3 on t1.c3=t3.c5 join t4 on t3.c5=t4.c6) as t;
+----------+
| L1       |
+==========+
| 50001556 |
+----------+
1 tuple (1m 43s)

5个表join , 不使用PK作为关联列
sql>select count(*) from (
more>select t1.c1 from t1 
more>join t2 on t1.c2=t2.c3 
more>join t3 on t2.c3=t3.c4 
more>join t4 on t3.c4=t4.c5
more>join t5 on t4.c5=t5.c6
more>) as t;
+----------+
| L1       |
+==========+
| 50024539 |
+----------+
1 tuple (2m 5s)

5个表join,  不使用PK作为关联列,  加个限制条件, 减少关联集合.
select count(*) from (
select t1.c1 from t1 
join t2 on t1.c2=t2.c3 
join t3 on t2.c3=t3.c4 
join t4 on t3.c4=t4.c5
join t5 on t4.c5=t5.c6
where t1.c60 < 100000
) as t;
+-------+
| L1    |
+=======+
| 98744 |
+-------+
1 tuple (12.8s)

17个表join , 不使用PK作为关联列
sql>select count(*) from (
more>select t1.c1 from t1 
more>join t2 on t1.c2=t2.c3 
more>join t3 on t2.c3=t3.c4 
more>join t4 on t3.c4=t4.c5
more>join t5 on t4.c5=t5.c6
more>join t6 on t5.c5=t6.c6
more>join t7 on t6.c5=t7.c6
more>join t8 on t7.c5=t8.c6
more>join t9 on t8.c5=t9.c6
more>join t10 on t9.c5=t10.c6
more>join t11 on t10.c5=t11.c6
more>join t12 on t11.c5=t12.c6
more>join t13 on t12.c5=t13.c6
more>join t14 on t13.c5=t14.c6
more>join t15 on t14.c5=t15.c6
more>join t16 on t15.c5=t16.c6
more>join t17 on t16.c5=t17.c6
more>) as t;
+----------+
| L1       |
+==========+
| 50338741 |
+----------+
1 tuple (5m 10s)

17个表join , 不使用PK作为关联列 , 加个限制条件, 减少关联集合.
select count(*) from (
select t1.c1 from t1 
join t2 on t1.c2=t2.c3 
join t3 on t2.c3=t3.c4 
join t4 on t3.c4=t4.c5
join t5 on t4.c5=t5.c6
join t6 on t5.c5=t6.c6
join t7 on t6.c5=t7.c6
join t8 on t7.c5=t8.c6
join t9 on t8.c5=t9.c6
join t10 on t9.c5=t10.c6
join t11 on t10.c5=t11.c6
join t12 on t11.c5=t12.c6
join t13 on t12.c5=t13.c6
join t14 on t13.c5=t14.c6
join t15 on t14.c5=t15.c6
join t16 on t15.c5=t16.c6
join t17 on t16.c5=t17.c6
where t1.c60 < 100000
) as t;
+-------+
| L1    |
+=======+
| 94635 |
+-------+
1 tuple (45.6s)

17个表join , 使用PK作为关联列 (t1.c1=t2.c3)
select count(*) from (
select t1.c1 from t1 
join t2 on t1.c1=t2.c3 
join t3 on t2.c3=t3.c4 
join t4 on t3.c4=t4.c5
join t5 on t4.c5=t5.c6
join t6 on t5.c5=t6.c6
join t7 on t6.c5=t7.c6
join t8 on t7.c5=t8.c6
join t9 on t8.c5=t9.c6
join t10 on t9.c5=t10.c6
join t11 on t10.c5=t11.c6
join t12 on t11.c5=t12.c6
join t13 on t12.c5=t13.c6
join t14 on t13.c5=t14.c6
join t15 on t14.c5=t15.c6
join t16 on t15.c5=t16.c6
join t17 on t16.c5=t17.c6
) as t;
+----------+
| L1       |
+==========+
| 50222475 |
+----------+
1 tuple (4m 22s)


执行计划, SQL转换成MAL(monetdb assemble language) 
sql>explain select count(*) from (
more>select t1.c1 from t1 
more>join t2 on t1.c1=t2.c3 
more>join t3 on t2.c3=t3.c4 
more>join t4 on t3.c4=t4.c5
more>join t5 on t4.c5=t5.c6
more>join t6 on t5.c5=t6.c6
more>join t7 on t6.c5=t7.c6
more>join t8 on t7.c5=t8.c6
more>join t9 on t8.c5=t9.c6
more>join t10 on t9.c5=t10.c6
more>join t11 on t10.c5=t11.c6
more>join t12 on t11.c5=t12.c6
more>join t13 on t12.c5=t13.c6
more>join t14 on t13.c5=t14.c6
more>join t15 on t14.c5=t15.c6
more>join t16 on t15.c5=t16.c6
more>join t17 on t16.c5=t17.c6
more>) as t;
+----------------------------------------------------------------------------------------------------------------------------------+
| mal                                                                                                                              |
+==================================================================================================================================+
| function user.s2_2{autoCommit=true}():void;                                                                                      |
| barrier X_1380 := language.dataflow();                                                                                           |
|     X_2 := sql.mvc();                                                                                                            |
|     X_3:bat[:oid,:oid]  := sql.tid(X_2,"sys","t1");                                                                              |
|     X_6 := sql.bind(X_2,"sys","t1","c1",0);                                                                                      |
|     (X_9,r1_9) := sql.bind(X_2,"sys","t1","c1",2);                                                                               |
|     X_12 := sql.bind(X_2,"sys","t1","c1",1);                                                                                     |
|     X_14 := sql.delta(X_6,X_9,r1_9,X_12);                                                                                        |
---------------省略--------------
|     X_236 := algebra.leftfetchjoin(X_226,X_235);                                                                                 |
|     (X_237,r1_246) := algebra.join(X_225,X_236);                                                                                 |
|     X_239 := sql.bind(X_2,"sys","t14","c5",0);                                                                                   |
|     (X_242,r1_251) := sql.bind(X_2,"sys","t14","c5",2);                                                                          |
|     X_244 := sql.bind(X_2,"sys","t14","c5",1);                                                                                   |
|     X_245 := sql.delta(X_239,X_242,r1_251,X_244);                                                                                |
|     X_246:bat[:oid,:int]  := algebra.leftfetchjoinPath(r1_246,X_226,X_245);                                                      |
|     X_247:bat[:oid,:oid]  := sql.tid(X_2,"sys","t15");                                                                           |
|     X_249 := sql.bind(X_2,"sys","t15","c6",0);                                                                                   |
|     (X_251,r1_261) := sql.bind(X_2,"sys","t15","c6",2);                                                                          |
|     X_253 := sql.bind(X_2,"sys","t15","c6",1);                                                                                   |
|     X_254 := sql.delta(X_249,X_251,r1_261,X_253);                                                                                |
|     X_255 := algebra.leftfetchjoin(X_247,X_254);                                                                                 |
|     (X_256,r1_266) := algebra.join(X_246,X_255);                                                                                 |
|     X_258 := sql.bind(X_2,"sys","t15","c5",0);                                                                                   |
|     (X_259,r1_269) := sql.bind(X_2,"sys","t15","c5",2);                                                                          |
|     X_261 := sql.bind(X_2,"sys","t15","c5",1);                                                                                   |
|     X_262 := sql.delta(X_258,X_259,r1_269,X_261);                                                                                |
|     X_263:bat[:oid,:int]  := algebra.leftfetchjoinPath(r1_266,X_247,X_262);                                                      |
|     X_264:bat[:oid,:oid]  := sql.tid(X_2,"sys","t16");                                                                           |
|     X_266 := sql.bind(X_2,"sys","t16","c6",0);                                                                                   |
|     (X_268,r1_279) := sql.bind(X_2,"sys","t16","c6",2);                                                                          |
|     X_271 := sql.bind(X_2,"sys","t16","c6",1);                                                                                   |
|     X_274 := sql.delta(X_266,X_268,r1_279,X_271);                                                                                |
|     X_275 := algebra.leftfetchjoin(X_264,X_274);                                                                                 |
|     (X_276,r1_287) := algebra.join(X_263,X_275);                                                                                 |
|     X_278 := sql.bind(X_2,"sys","t16","c5",0);                                                                                   |
|     (X_280,r1_291) := sql.bind(X_2,"sys","t16","c5",2);                                                                          |
|     X_282 := sql.bind(X_2,"sys","t16","c5",1);                                                                                   |
|     X_283 := sql.delta(X_278,X_280,r1_291,X_282);                                                                                |
|     X_284:bat[:oid,:int]  := algebra.leftfetchjoinPath(r1_287,X_264,X_283);                                                      |
|     X_342:bat[:oid,:oid]  := sql.tid(X_2,"sys","t17",0,48);                                                                      |
|     X_441:bat[:oid,:int]  := sql.bind(X_2,"sys","t17","c6",0,0,48);                                                              |
|     (X_540:bat[:oid,:oid] ,X_541:bat[:oid,:int] ) := sql.bind(X_2,"sys","t17","c6",2,0,48);                                      |
|     X_879 := sql.delta(X_441,X_540,X_541);                                                                                       |
|     X_975 := algebra.leftfetchjoin(X_342,X_879);                                                                                 |
|     (X_1071,X_1072) := algebra.join(X_284,X_975);                                                                                |
|     X_1361 := algebra.leftfetchjoin(X_256,X_237);                                                                                |
|     X_288:bat[:oid,:int]  := algebra.leftfetchjoinPath(X_276,X_1361,X_218,X_199,X_180,X_161,X_142,X_123,X_103,X_85,X_65,X_53,X_3 |
: 7,X_25,X_15);                                                                                                                    :
|     X_1215 := algebra.leftfetchjoin(X_1071,X_288);                                                                               |
|     X_1312 := aggr.count(X_1215);                                                                                                |
|     X_344:bat[:oid,:oid]  := sql.tid(X_2,"sys","t17",1,48);                                                                      |
|     X_443:bat[:oid,:int]  := sql.bind(X_2,"sys","t17","c6",0,1,48);                                                              |
|     (X_543:bat[:oid,:oid] ,X_544:bat[:oid,:int] ) := sql.bind(X_2,"sys","t17","c6",2,1,48);                                      |
|     X_881 := sql.delta(X_443,X_543,X_544);                                                                                       |
|     X_977 := algebra.leftfetchjoin(X_344,X_881);                                                                                 |
|     (X_1074,X_1075) := algebra.join(X_284,X_977);                                                                                |
|     X_1217 := algebra.leftfetchjoin(X_1074,X_288);                                                                               |
|     X_1313 := aggr.count(X_1217);                                                                                                |
|     X_346:bat[:oid,:oid]  := sql.tid(X_2,"sys","t17",2,48);                                                                      |
|     X_445:bat[:oid,:int]  := sql.bind(X_2,"sys","t17","c6",0,2,48);                                                              |
--------------省略---------------------------

| exit X_1380;                                                                                                                     |
|     sql.exportValue(1,"sys.L1","L1","wrd",64,0,6,X_289,"");                                                                      |
| end s2_2;                                                                                                                        |
| # querylog.define("explain select count(*) from (\nselect t1.c1 from t1 \njoin t2 on t1.c1=t2.c3 \njoin t3 on t2.c3=t3.c4 \njoin |
:  t4 on t3.c4=t4.c5\njoin t5 on t4.c5=t5.c6\njoin t6 on t5.c5=t6.c6\njoin t7 on t6.c5=t7.c6\njoin t8 on t7.c5=t8.c6\njoin t9 on t :
: 8.c5=t9.c6\njoin t10 on t9.c5=t10.c6\njoin t11 on t10.c5=t11.c6\njoin t12 on t11.c5=t12.c6\njoin t13 on t12.c5=t13.c6\njoin t14  :
: on t13.c5=t14.c6\njoin t15 on t14.c5=t15.c6\njoin t16 on t15.c5=t16.c6\njoin t17 on t16.c5=t17.c6\n) as t;","default_pipe")      :
+----------------------------------------------------------------------------------------------------------------------------------+
589 tuples (27.878ms)

MonetDB定长类型效率确实很不错. 这点和PostgreSQL cstore 列存储插件很相似.

[参考 ]
相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
相关文章
|
SQL 数据库
Database specific hint in One order search
Database specific hint in One order search
115 0
Database specific hint in One order search
|
SQL Oracle 算法
Adaptive and Big Data Scale Parallel Execution in Oracle
在上篇文章中,主要讨论了SQL Server的MPP数仓系统PDW的分布式优化过程,PolarDB的并行优化从中有所借鉴,本篇文章主要看下这篇介绍Oracle并行执行策略的paper,因为在PolarDB的分布式执行策略中,有很多与其有所重叠。
186 0
Adaptive and Big Data Scale Parallel Execution in Oracle
|
SQL Oracle 算法
Cost-based query transformation in Oracle
这篇paper主要介绍了Oracle从10g开始引入的CBQT(Cost Based Query Transformation)框架。虽然以Oracle历来的风格,无法期待它在paper中讨论很多细节,不过这篇还是可以让我们一窥Oracle对于query rewrite的处理思路和很多非常实用的query rewrite模式,对于开发优化器的同学很有参考意义。 值得一提的是,PolarDB目前也在做这方面的工作,而主要的参考正是这篇paper。此外这篇paper的思路和MemSQL optimizer中对query rewrite的处理思路非常接近,关于MemSQL optimizer的介绍可
267 0
Cost-based query transformation in Oracle
|
SQL 关系型数据库 MySQL
Accelerating Queries with Group-By and Join By Groupjoin
这篇paper介绍了HyPer中引入的groupjoin算子,针对 join + group by这种query,可以在某些前提条件下,在join的过程中同时完成grouping+agg的计算。 比如用hash table来实现hash join和group by,就可以避免再创建一个hash table,尤其当join的数据量很大,产生的group结果又较少时,可以很好的提升执行效率。
322 0
Accelerating Queries with Group-By and Join By Groupjoin
|
SQL 存储 分布式计算
F1 Query: Declarative Querying at Scale
2013 年的 F1 是基于 Spanner,主要提供 OLTP 服务,而新的 F1 则定位则是大一统:旨在处理 OLTP/OLAP/ETL 等多种不同的 workload。但是这篇新的 F1 论文对 OLTP 的讨论则是少之又少,据八卦是 Spanner 开始原生支持之前 F1 的部分功能,导致 F1 对 OLTP 的领地被吞并了。
F1 Query: Declarative Querying at Scale
|
SQL Oracle 算法
PostgreSQL 12 preview - plan_cache_mode参数控制强制使用plan cache或强制custom plan (force_custom_plan and force_generic_plan)
标签 PostgreSQL , plan_cache_mode 背景 plan cache在OLTP中,可以大幅降低生成sql parser, 执行计划的开销。 但是在某些场景中,plan cache可能成为问题,比如AP类型的场景中,由于SQL 输入条件的变化(通常AP业务涉及的条件可能比较容易出现这样的问题),可能导致plan cache并不是最佳的执行计划。
1267 0
|
容器 安全 物联网
Speed Matters: How To Process Big Data Securely For Real-time Applications
Big Data processing has stepped up to provide organizations with new tools and technologies to improve business efficiency and competitive advantage.
1264 0
Speed Matters: How To Process Big Data Securely For Real-time Applications
|
SQL 分布式计算 大数据