开源数据库PostgreSQL攻克并行计算难题

本文涉及的产品
RDS SQL Server Serverless,2-4RCU 50GB 3个月
推荐场景:
云数据库 RDS SQL Server,基础系列 2核4GB
RDS MySQL Serverless 基础系列,0.5-2RCU 50GB
简介: PostgreSQL 9.6的并行复制一发,相信已经有很多小伙伴已经开始测试了,我昨晚测试了一个场景是标签系统类应用的比特位运算,昨天测试发现性能相比非并行已经提升了7倍。昨天没有仔细研究代码,发现怎么测都只能用8个并行,今天看了一下代码,终于找到端倪了,其实并行度是由几个方面决定d , 决定并行.

经过多年的酝酿(从支持work process到支持动态fork共享内存,再到内核层面支持并行计算),PostgreSQL 的并行计算功能终于来了,为PG的scale up能力再次拔高一个台阶,标志着开源数据库已经攻克了并行计算的难题。


相信有很多小伙伴已经开始测试了,我也测试了一个场景是标签系统类应用的比特位运算,昨天测试发现性能相比非并行已经提升了7倍。

调整并行度,在32个核的虚拟机上测试,性能提升了约10多倍。
但是实际上并没有到32倍,不考虑内存和IO的瓶颈,是有优化空间。
注意不同的并行度,效果不一样,目前来看并不是最大并行度就能发挥最好的性能,还需要考虑锁竞争的问题。
把测试表的数据量加载到16亿,共90GB。

postgres=# \dt+
                    List of relations
 Schema |  Name  | Type  |  Owner   | Size  | Description 
--------+--------+-------+----------+-------+-------------
 public | t_bit2 | table | postgres | 90 GB | 
(1 row)

不使用并行的性能如下,耗时 141377.100 毫秒。

postgres=# alter table t_bit2 set (parallel_degree=0);
ALTER TABLE
Time: 0.335 ms
postgres=# select count(*) from t_bit2 ;
   count    
------------
 1600000000
(1 row)
Time: 141377.100 ms

使用17个并行,获得了最好的性能, 耗时9423.257 毫秒。

postgres=# alter table t_bit2 set (parallel_degree=17);
ALTER TABLE
Time: 0.287 ms
postgres=# select count(*) from t_bit2 ;
   count    
------------
 1600000000
(1 row)

Time: 9423.257 ms

并行度为17时,每秒处理的数据量已经达到9.55GB。
与非并行相比,性能达到了15倍,基本上是线性的。
但是可能由于NUMA的原因(并行度增加时, 读数据操作可能会引入较多的__mutex_lock_slowpath, _spin_lock),并行度再加上来性能并不能再线性提升,而是会往下走。
_
另一组测试数据,加入了BIT计算。
32个并行度时,可以获得最好的性能提升,同样也和NUMA有关,为什么并行度能更高呢,因为计算量更大了,扫描冲突可以分担掉。
同样性能比达到了30.9倍,也基本上是线性的。
_

postgres=# alter table t_bit2 set (parallel_degree=32);
ALTER TABLE
Time: 0.341 ms
postgres=# select count(*) from t_bit2 where bitand(id, '10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010')=B'10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010';
   count    
------------
 1600000000
(1 row)

Time: 15836.064 ms
postgres=# alter table t_bit2 set (parallel_degree=0);
ALTER TABLE
Time: 0.368 ms
postgres=# select count(*) from t_bit2 where bitand(id, '10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010')=B'10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010';
   count    
------------
 1600000000
(1 row)

Time: 488459.158 ms
postgres=# select 488459.158 /15826.358;
      ?column?       
---------------------
 30.8636489835501004
(1 row)

Time: 2.919 ms

后面会再提供tpc-h的测试数据。


那么如何设置并行度呢?决定并行度的几个参数如下
.1. 最大允许的并行度
max_parallel_degree


.2. 表设置的并行度(create table或alter table设置)
parallel_degree
如果设置了表的并行度,则最终并行度取min(max_parallel_degree , parallel_degree )

                /*
                 * Use the table parallel_degree, but don't go further than
                 * max_parallel_degree.
                 */
                parallel_degree = Min(rel->rel_parallel_degree, max_parallel_degree);


.3. 如果表没有设置并行度parallel_degree ,则根据表的大小 和 parallel_threshold 这个硬编码值决定,计算得出(见函数create_plain_partial_paths)
然后依旧受到max_parallel_degree 参数的限制,不能大于它。
代码如下

src/backend/optimizer/util/plancat.c
void
get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
                                  RelOptInfo *rel)
{
...
        /* Retrive the parallel_degree reloption, if set. */
        rel->rel_parallel_degree = RelationGetParallelDegree(relation, -1);
...


src/include/utils/rel.h
/*
 * RelationGetParallelDegree
 *              Returns the relation's parallel_degree.  Note multiple eval of argument!
 */
#define RelationGetParallelDegree(relation, defaultpd) \
        ((relation)->rd_options ? \
         ((StdRdOptions *) (relation)->rd_options)->parallel_degree : (defaultpd))


src/backend/optimizer/path/allpaths.c
/*
 * create_plain_partial_paths
 *        Build partial access paths for parallel scan of a plain relation
 */
static void
create_plain_partial_paths(PlannerInfo *root, RelOptInfo *rel)
{
        int                     parallel_degree = 1;

        /*
         * If the user has set the parallel_degree reloption, we decide what to do
         * based on the value of that option.  Otherwise, we estimate a value.
         */
        if (rel->rel_parallel_degree != -1)
        {
                /*
                 * If parallel_degree = 0 is set for this relation, bail out.  The
                 * user does not want a parallel path for this relation.
                 */
                if (rel->rel_parallel_degree == 0)
                        return;

                /*
                 * Use the table parallel_degree, but don't go further than
                 * max_parallel_degree.
                 */
                parallel_degree = Min(rel->rel_parallel_degree, max_parallel_degree);
        }
        else
        {
                int                     parallel_threshold = 1000;

                /*
                 * If this relation is too small to be worth a parallel scan, just
                 * return without doing anything ... unless it's an inheritance child.
                 * In that case, we want to generate a parallel path here anyway.  It
                 * might not be worthwhile just for this relation, but when combined
                 * with all of its inheritance siblings it may well pay off.
                 */
                if (rel->pages < parallel_threshold &&
                        rel->reloptkind == RELOPT_BASEREL)
                        return;
// 表级并行度没有设置时,通过表的大小和parallel_threshold 计算并行度  
                /*
                 * Limit the degree of parallelism logarithmically based on the size
                 * of the relation.  This probably needs to be a good deal more
                 * sophisticated, but we need something here for now.
                 */
                while (rel->pages > parallel_threshold * 3 &&
                           parallel_degree < max_parallel_degree)
                {
                        parallel_degree++;
                        parallel_threshold *= 3;
                        if (parallel_threshold >= PG_INT32_MAX / 3)
                                break;
                }
        }

        /* Add an unordered partial path based on a parallel sequential scan. */
        add_partial_path(rel, create_seqscan_path(root, rel, NULL, parallel_degree));
}


其他测试数据:

增加到32个并行,和硬件有关,并不一定是并行度最高时,性能就最好,前面已经分析了,一定要找到每个查询的拐点。  
postgres=# alter table t_bit2 set (parallel_degree =32);

postgres=# explain (analyze,verbose,timing,costs,buffers) select count(*) from t_bit2 where bitand(id, '10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010')=B'10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010';
                                                                                                                                                                                                                                        QUERY
 PLAN                                                                                                                                                                                                                                        
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=1551053.25..1551053.26 rows=1 width=8) (actual time=31092.551..31092.552 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=1473213
   ->  Gather  (cost=1551049.96..1551053.17 rows=32 width=8) (actual time=31060.939..31092.469 rows=33 loops=1)
         Output: (PARTIAL count(*))
         Workers Planned: 32
         Workers Launched: 32
         Buffers: shared hit=1473213
         ->  Partial Aggregate  (cost=1550049.96..1550049.97 rows=1 width=8) (actual time=31047.074..31047.075 rows=1 loops=33)
               Output: PARTIAL count(*)
               Buffers: shared hit=1470589
               Worker 0: actual time=31037.287..31037.288 rows=1 loops=1
                 Buffers: shared hit=43483
               Worker 1: actual time=31035.803..31035.804 rows=1 loops=1
                 Buffers: shared hit=45112
......
               Worker 31: actual time=31055.871..31055.876 rows=1 loops=1
                 Buffers: shared hit=46439
               ->  Parallel Seq Scan on public.t_bit2  (cost=0.00..1549983.80 rows=26465 width=0) (actual time=0.040..17244.827 rows=6060606 loops=33)
                     Output: id
                     Filter: (bitand(t_bit2.id, B'1010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101
0101010101010'::"bit") = B'10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010'::"bit")
                     Buffers: shared hit=1470589
                     Worker 0: actual time=0.035..17314.296 rows=5913688 loops=1
                       Buffers: shared hit=43483
                     Worker 1: actual time=0.030..16965.158 rows=6135232 loops=1
                       Buffers: shared hit=45112
......
                     Worker 31: actual time=0.031..17580.908 rows=6315704 loops=1
                       Buffers: shared hit=46439
 Planning time: 0.354 ms
 Execution time: 31157.006 ms
(145 rows)

比特位运算  
postgres=# select count(*) from t_bit2 where bitand(id, '10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010')=B'10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010';
   count   
-----------
 200000000
(1 row)
Time: 4320.931 ms

COUNT  
postgres=# select count(*) from t_bit2;
   count   
-----------
 200000000
(1 row)
Time: 1896.647 ms

关闭并行的查询效率    
postgres=# set force_parallel_mode =off;
SET
postgres=# alter table t_bit2 set (parallel_degree =0);
ALTER TABLE
postgres=# \timing
Timing is on.
postgres=# select count(*) from t_bit2 where bitand(id, '10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010')=B'10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010';
   count   
-----------
 200000000
(1 row)
Time: 53098.480 ms
postgres=# select count(*) from t_bit2;
   count   
-----------
 200000000
(1 row)
Time: 18504.679 ms

表大小  
postgres=# \dt+ t_bit2
                    List of relations
 Schema |  Name  | Type  |  Owner   | Size  | Description 
--------+--------+-------+----------+-------+-------------
 public | t_bit2 | table | postgres | 11 GB | 
(1 row)


参考信息
http://www.postgresql.org/docs/9.6/static/sql-createtable.html

parallel_degree (integer)
The parallel degree for a table is the number of workers that should be used to assist a parallel scan of that table. If not set, the system will determine a value based on the relation size. The actual number of workers chosen by the planner may be less, for example due to the setting of max_parallel_degree.

http://www.postgresql.org/docs/9.6/static/runtime-config-query.html#RUNTIME-CONFIG-QUERY-OTHER

force_parallel_mode (enum)
Allows the use of parallel queries for testing purposes even in cases where no performance benefit is expected. The allowed values of force_parallel_mode are off (use parallel mode only when it is expected to improve performance), on (force parallel query for all queries for which it is thought to be safe), and regress (like on, but with additional behavior changes as explained below).

More specifically, setting this value to on will add a Gather node to the top of any query plan for which this appears to be safe, so that the query runs inside of a parallel worker. Even when a parallel worker is not available or cannot be used, operations such as starting a subtransaction that would be prohibited in a parallel query context will be prohibited unless the planner believes that this will cause the query to fail. If failures or unexpected results occur when this option is set, some functions used by the query may need to be marked PARALLEL UNSAFE (or, possibly, PARALLEL RESTRICTED).

Setting this value to regress has all of the same effects as setting it to on plus some additional effects that are intended to facilitate automated regression testing. Normally, messages from a parallel worker include a context line indicating that, but a setting of regress suppresses this line so that the output is the same as in non-parallel execution. Also, the Gather nodes added to plans by this setting are hidden in EXPLAIN output so that the output matches what would be obtained if this setting were turned off.

http://www.postgresql.org/docs/9.6/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-ASYNC-BEHAVIOR

max_parallel_degree (integer)
Sets the maximum number of workers that can be started for an individual parallel operation. Parallel workers are taken from the pool of processes established by max_worker_processes. Note that the requested number of workers may not actually be available at runtime. If this occurs, the plan will run with fewer workers than expected, which may be inefficient. The default value is 2. Setting this value to 0 disables parallel query execution.

http://www.postgresql.org/docs/9.6/static/runtime-config-query.html#RUNTIME-CONFIG-QUERY-CONSTANTS

parallel_setup_cost (floating point)
Sets the planner's estimate of the cost of launching parallel worker processes. The default is 1000.
parallel_tuple_cost (floating point)
Sets the planner's estimate of the cost of transferring one tuple from a parallel worker process to another process. The default is 0.1.
相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
目录
相关文章
|
19天前
|
NoSQL 关系型数据库 MySQL
微服务架构下的数据库选择:MySQL、PostgreSQL 还是 NoSQL?
在微服务架构中,数据库的选择至关重要。不同类型的数据库适用于不同的需求和场景。在本文章中,我们将深入探讨传统的关系型数据库(如 MySQL 和 PostgreSQL)与现代 NoSQL 数据库的优劣势,并分析在微服务架构下的最佳实践。
|
12天前
|
SQL 关系型数据库 数据库
PostgreSQL数据库报错 ERROR: multiple default values specified for column "" of table "" 如何解决?
PostgreSQL数据库报错 ERROR: multiple default values specified for column "" of table "" 如何解决?
167 59
|
6天前
|
Oracle NoSQL 关系型数据库
主流数据库对比:MySQL、PostgreSQL、Oracle和Redis的优缺点分析
主流数据库对比:MySQL、PostgreSQL、Oracle和Redis的优缺点分析
17 2
|
16天前
|
缓存 关系型数据库 数据库
如何优化 PostgreSQL 数据库性能?
如何优化 PostgreSQL 数据库性能?
19 2
|
1月前
|
关系型数据库 分布式数据库 数据库
开源云原生数据库PolarDB PostgreSQL 15兼容版本正式发布
PolarDB进行了深度的内核优化,从而实现以更低的成本提供商业数据库的性能。
|
2月前
|
SQL Oracle 关系型数据库
DBeaver,一款好用的开源数据库管理软件
DBeaver,一款好用的开源数据库管理软件
|
28天前
|
关系型数据库 数据库 网络虚拟化
Docker环境下重启PostgreSQL数据库服务的全面指南与代码示例
由于时间和空间限制,我将在后续的回答中分别涉及到“Python中采用lasso、SCAD、LARS技术分析棒球运动员薪资的案例集锦”以及“Docker环境下重启PostgreSQL数据库服务的全面指南与代码示例”。如果你有任何一个问题的优先顺序或需要立即回答的,请告知。
47 0
|
2月前
|
SQL 关系型数据库 MySQL
SQL Server、MySQL、PostgreSQL:主流数据库SQL语法异同比较——深入探讨数据类型、分页查询、表创建与数据插入、函数和索引等关键语法差异,为跨数据库开发提供实用指导
【8月更文挑战第31天】SQL Server、MySQL和PostgreSQL是当今最流行的关系型数据库管理系统,均使用SQL作为查询语言,但在语法和功能实现上存在差异。本文将比较它们在数据类型、分页查询、创建和插入数据以及函数和索引等方面的异同,帮助开发者更好地理解和使用这些数据库。尽管它们共用SQL语言,但每个系统都有独特的语法规则,了解这些差异有助于提升开发效率和项目成功率。
133 0
|
关系型数据库 分布式数据库 PolarDB
《阿里云产品手册2022-2023 版》——PolarDB for PostgreSQL
《阿里云产品手册2022-2023 版》——PolarDB for PostgreSQL
348 0
|
存储 缓存 关系型数据库

相关产品

  • 云原生数据库 PolarDB
  • 云数据库 RDS PostgreSQL 版
  • 下一篇
    无影云桌面