标签
PostgreSQL , Greenplum , 分布式查询 , 分布式执行计划 , 点查
背景
Greenplum是分布式数据库,在建表时,可以指定随机分布、或按指定字段或多个字段进行分布。
因此在做点查时,Greenplum可以根据WHERE条件来判断是需要在所有节点查询,还是只需要到若干节点查询。
假设tbl按id分布,那么下面几种情况应该到哪个节点查询呢:
-- 到单个节点查询
select * from tbl where id=?
-- 到若干个节点查询
select * from tbl where id in (?,?,?)
-- 到所有节点查询
select * from tbl where id >= ? and id < ?;
-- 到所有节点查询
select * from tbl where col1 ...;
到多个节点查询和到一个节点查询,性能是不一样的。因此我们就有了优化的方法。
点查的优化与分布式QUERY
测试
1、构建测试表和数据
create table tbl2 (id int, c1 int) DISTRIBUTED BY (id);
insert into tbl2 select id,id from generate_series(1,10000000) t(id);
2、创建索引
create index idx_tbl2_1 on tbl2(id);
create index idx_tbl2_2 on tbl2(c1);
3、按分布键,点查,只需要查询一个节点。
postgres=# explain analyze select * from tbl2 where id=1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Gather Motion 1:1 (slice1; segments: 1) (cost=0.00..200.28 rows=1 width=8)
Rows out: 1 rows at destination with 2.421 ms to first row, 2.422 ms to end, start offset by 0.212 ms.
-> Index Scan using idx_tbl2_1 on tbl2 (cost=0.00..200.28 rows=1 width=8)
Index Cond: id = 1
Rows out: 1 rows with 0.017 ms to first row, 0.019 ms to end, start offset by 2.576 ms.
Slice statistics:
(slice0) Executor memory: 226K bytes.
(slice1) Executor memory: 172K bytes (seg42).
Statement statistics:
Memory used: 128000K bytes
Settings: enable_bitmapscan=off; enable_seqscan=off; optimizer=off
Optimizer status: legacy query optimizer
Total runtime: 2.732 ms
(13 rows)
4、按非分布键查询,需要查询所有节点。
postgres=# explain analyze select * from tbl2 where c1=1;
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Gather Motion 48:1 (slice1; segments: 48) (cost=0.00..200.28 rows=1 width=8)
Rows out: 1 rows at destination with 58 ms to first row, 74 ms to end, start offset by 0.341 ms.
-> Index Scan using idx_tbl2_2 on tbl2 (cost=0.00..200.28 rows=1 width=8)
Index Cond: c1 = 1
Rows out: 1 rows (seg42) with 0.294 ms to first row, 0.297 ms to end, start offset by 58 ms.
Slice statistics:
(slice0) Executor memory: 226K bytes.
(slice1) Executor memory: 172K bytes avg x 48 workers, 172K bytes max (seg0).
Statement statistics:
Memory used: 128000K bytes
Settings: enable_bitmapscan=off; enable_seqscan=off; optimizer=off
Optimizer status: legacy query optimizer
Total runtime: 74.553 ms
(13 rows)
分析
在单个节点执行和在多个节点执行性能完全不一样对吧,从现象来看:
1、首先master需要生成分布式执行计划,耗费若干毫秒。
2、建立master与segment的连接,这一步可能是串行的,节点越多,建立连接耗时越久。
这个是在所有节点执行的,可以看到offset 58 ms,看起来是串行带来的问题。
Rows out: 1 rows (seg42) with 0.294 ms to first row, 0.297 ms to end, start offset by 58 ms.
3、执行,执行实际上是并行的,而且也没有花多少时间。
优化
1、如果需要经常点查,并且条件字段分布均匀的话。可以选择这个字段作为分布键,提高查询效率。(并且这个表的UK或PK也是这个列,或者没有UK或者PK列的需求时。)
2、如果不能做到按查询条件字段进行分布,参数可以优化(效果不明显)。
gp_cached_segworkers_threshold
When a user starts a session with Greenplum Database and issues a query, the system creates groups or
'gangs' of worker processes on each segment to do the work. After the work is done, the segment worker
processes are destroyed except for a cached number which is set by this parameter. A lower setting
conserves system resources on the segment hosts, but a higher setting may improve performance for
power-users that want to issue many complex queries in a row.
gp_interconnect_type
Sets the networking protocol used for Greenplum Database interconnect traffic. With the TCP protocol,
Greenplum Database has an upper limit of 1000 segment instances - less than that if the query workload
involves complex, multi-slice queries.
The UDP protocol allows for greater interconnect scalability. Note that the Greenplum Database software
does the additional packet verification and checking not performed by UDP, so reliability and performance
is equivalent to TCP.
UDPIFC specifies using UDP with flow control for interconnect traffic. Specify the interconnect flow control
method with gp_interconnect_fc_method.
Note: The Greenplum Database interconnect types TCP and UDP are deprecated. In the next
major release, only the UDPIFC interconnect type will be supported by Greenplum Database.
3、在master与segment之间使用连接池,也可能是一种优化方法。
https://greenplum.org/docs/admin_guide/access_db/topics/pgbouncer.html