PostgreSQL 11 preview - 多阶段并行聚合array_agg, string

PostgreSQL 11 preview - 多阶段并行聚合array_agg, string_agg

2018-04-18 2998

版权

本文内容由阿里云实名注册用户自发贡献，版权归原作者所有，阿里云开发者社区不拥有其著作权，亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容，填写侵权投诉表单进行举报，一经查实，本社区将立刻删除涉嫌侵权内容。

本文涉及的产品

RDS SQL Server Serverless，2-4RCU 50GB 3个月

云数据库 RDS MySQL，集群系列 2核4GB

云原生数据库 PolarDB 分布式版，标准版 2核8GB

简介：

背景

并行聚合原理请参考：

《PostgreSQL 10 自定义并行计算聚合函数的原理与实践 - (含array_agg合并多个数组为单个一元数组的例子)》

实际上PostgreSQL支持并行计算后，聚合就分为多阶段聚合与原始的一阶段聚合两种玩法。

多阶段聚合会将聚合任务分配给所有的WORKER执行，然后再将聚合的中间结果合并。

postgres=# explain select count(id) from generate_series(1,100) id;  
                                      QUERY PLAN                                         
---------------------------------------------------------------------------------------  
 Gather  (cost=1012.50..1012.61 rows=1 width=8)  
   Workers Planned: 1  
   Single Copy: true  
   ->  Aggregate  (cost=12.50..12.51 rows=1 width=8)  
         ->  Function Scan on generate_series id  (cost=0.00..10.00 rows=1000 width=4)  
(5 rows)

而单阶段聚合，并行不可能出现在聚合节点，聚合一定要等前面节点完成所有工作后才能开始。

postgres=# explain select string_agg(id::text,',') from t_only;  
                                  QUERY PLAN                                    
------------------------------------------------------------------------------  
 Gather  (cost=230056.09..230056.20 rows=1 width=32)  
   Workers Planned: 1  
   Single Copy: true  
   ->  Aggregate  (cost=229056.09..229056.10 rows=1 width=32)  
         ->  Seq Scan on t_only  (cost=0.00..154055.62 rows=10000062 width=4)  
(5 rows)  
  
postgres=# explain select array_agg(id) from t_only;  
                                  QUERY PLAN                                    
------------------------------------------------------------------------------  
 Gather  (cost=180055.78..180055.89 rows=1 width=32)  
   Workers Planned: 1  
   Single Copy: true  
   ->  Aggregate  (cost=179055.78..179055.79 rows=1 width=32)  
         ->  Seq Scan on t_only  (cost=0.00..154055.62 rows=10000062 width=4)  
(5 rows)

多阶段聚合需要满足一个条件，即可以通过中间结果合并出最终的聚合结果，在定义聚合时，指定合并用到的combinefunction。

在MPP数据库中，为了达到高性能的效果，大部分聚合都是多阶段聚合。如果是一阶段聚合，效果是不好的。

《HybridDB PostgreSQL "Sort、Group、distinct 聚合、JOIN" 不惧怕数据倾斜的黑科技和原理 - 多阶段聚合》

《Postgres-XC customized aggregate introduction》

《Greenplum 最佳实践 - 估值插件hll的使用(以及hll分式聚合函数优化)》

在pipelinedb分布式流计算中，对于估值聚合，也使用了多阶段聚合，原因是这些估值类型都支持合并。因此能够让WORKER并行计算，最后合并WORKER的聚合中间结果，达到并行目的。

以下patch针对array_agg, string_agg实现了combinefunction，可以支持多阶段并行聚合。

Hi,  
  
While working on partial aggregation a few years ago, I didn't really  
think it was worthwhile allowing partial aggregation of string_agg and  
array_agg. I soon realised that I was wrong about that and allowing  
parallelisation of these aggregates still could be very useful when  
many rows are filtered out during the scan.  
  
Some benchmarks that I've done locally show that parallel string_agg  
and array_agg do actually perform better, despite the fact that the  
aggregate state grows linearly with each aggregated item. Obviously,  
the performance will get even better when workers are filtering out  
rows before aggregation takes place, so it seems worthwhile doing  
this. However, the main reason that I'm motivated to do this is that  
there are more uses for partial aggregation other than just parallel  
aggregation, and it seems a shame to disable all these features if a  
single aggregate does not support partial mode.  
  
I've attached a patch which implements all this. I've had most of it  
stashed away for a while now, but I managed to get some time this  
weekend to get it into a more completed state.  
  
Things are now looking pretty good for the number of aggregates that  
support partial mode.  
  
Just a handful of aggregates now don't support partial aggregation;  
  
postgres=# select aggfnoid from pg_aggregate where aggcombinefn=0 and  
aggkind='n';  
     aggfnoid  
------------------  
 xmlagg  
 json_agg  
 json_object_agg  
 jsonb_agg  
 jsonb_object_agg  
(5 rows)  
  
... and a good number do support it;  
  
postgres=# select count(*) from pg_aggregate where aggcombinefn<>0 and  
aggkind='n';  
 count  
-------  
   122  
(1 row)  
  
There's probably no reason why the last 5 of those couldn't be done  
either, it might just require shifting a bit more work into the final  
functions, although, I'm not planning on that for this patch.  
  
As for the patch; there's a bit of a quirk in the implementation of  
string_agg. We previously always threw away the delimiter that belongs  
to the first aggregated value, but we do now need to keep that around  
so we can put it in between two states in the combine function. I  
decided the path of least resistance to do this was just to borrow  
StringInfo's cursor variable to use as a pointer to the state of the  
first value and put the first delimiter before that. Both the  
string_agg(text) and string_agg(bytea) already have a final function,  
so we just need to skip over the bytes up until the cursor position to  
get rid of the first delimiter. I could go and invent some new state  
type to do the same, but I don't really see the trouble with what I've  
done with StringInfo, but I'll certainly listen if someone else thinks  
this is wrong.  
  
Another thing that I might review later about this is seeing about  
getting rid of some of the code duplication between  
array_agg_array_combine and accumArrayResultArr.  
  
I'm going to add this to PG11's final commitfest rather than the  
January 'fest as it seems more like a final commitfest type of patch.  
  
--   
 David Rowley                   http://www.2ndQuadrant.com/  
 PostgreSQL Development, 24x7 Support, Training & Services