A Hands-on Test of PostgreSQL Asynchronous I/O


Tags

PostgreSQL, effective_io_concurrency, asynchronous I/O


Background

The purpose of asynchronous I/O is to make fuller use of the block device's throughput by keeping it busier (ingesting more blocks in one go), instead of waiting until the user process actually needs the data before reading it.

If the database already has enough concurrent connections (or active sessions) to keep the block device busy, there is little point in enabling asynchronous I/O: the device is already saturated, so prefetching adds nothing.

Currently, PostgreSQL's bitmap heap scan is the only plan type that uses asynchronous I/O. Because a bitmap heap scan reads heap blocks in ascending block order, prefetching pays off particularly well on mechanical disks. (A sequential scan would, in principle, also be a good fit for asynchronous I/O.)

How, then, should the asynchronous I/O parameter effective_io_concurrency be set?

For a disk array, set it according to the block device backing the tablespace: for RAID 0 or RAID 10, use the number of disks; for RAID 5 or other parity RAID levels, use the number of actual data disks (for example, a 10-disk RAID 5 would be set to 9).

Setting it to 0 disables asynchronous I/O.

Asynchronous I/O is only available when the operating system provides a working posix_fadvise.
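A minimal sketch of setting it at the session level and at the instance level; the value 8 here (say, an 8-disk stripe) is purely illustrative:

-- session level: affects only the current connection
SET effective_io_concurrency = 8;

-- instance level: written to postgresql.auto.conf, then reload
ALTER SYSTEM SET effective_io_concurrency = 8;
SELECT pg_reload_conf();

-- disable prefetching entirely
ALTER SYSTEM SET effective_io_concurrency = 0;
SELECT pg_reload_conf();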

effective_io_concurrency (integer)  
  
Sets the number of concurrent disk I/O operations that PostgreSQL expects can be executed simultaneously.   
Raising this value will increase the number of I/O operations that any individual PostgreSQL session attempts to initiate in parallel.   
The allowed range is 1 to 1000, or zero to disable issuance of asynchronous I/O requests. Currently, this setting only affects bitmap heap scans.  
  
For magnetic drives, a good starting point for this setting is the number of separate drives comprising a RAID 0 stripe or RAID 1 mirror being used for the database.   
(For RAID 5 the parity drive should not be counted.)   
However, if the database is often busy with multiple queries issued in concurrent sessions, lower values may be sufficient to keep the disk array busy.   
A value higher than needed to keep the disks busy will only result in extra CPU overhead.   
SSDs and other memory-based storage can often process many concurrent requests, so the best value might be in the hundreds.  
  
Asynchronous I/O depends on an effective posix_fadvise function, which some operating systems lack.   
If the function is not present then setting this parameter to anything but zero will result in an error.   
On some operating systems (e.g., Solaris), the function is present but does not actually do anything.  
  
The default is 1 on supported systems, otherwise 0.   
This value can be overridden for tables in a particular tablespace by setting the tablespace parameter of the same name (see ALTER TABLESPACE).  
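To check the current value and its allowed range, and to override it only for storage mounted as a separate tablespace, something along these lines works (the tablespace name tbs_ssd is hypothetical):

-- inspect the parameter and its allowed range
SELECT name, setting, min_val, max_val, source
FROM pg_settings
WHERE name = 'effective_io_concurrency';

-- per-tablespace override, e.g. for an SSD-backed tablespace
ALTER TABLESPACE tbs_ssd SET (effective_io_concurrency = 0);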

So how should it be set in production?

Let's measure it. On SSD, based on the results below, my suggestion is to disable asynchronous I/O. (Although the manual says that memory-based storage and SSDs can handle much larger values, the measured effect here was not good.)

Test case

src/test/regress/expected/select_parallel.out

-- test parallel bitmap heap scan.  
set enable_seqscan to off;  
set enable_indexscan to off;  
set enable_hashjoin to off;  
set enable_mergejoin to off;  
set enable_material to off;  
-- test prefetching, if the platform allows it  
DO $$  
BEGIN  
 SET effective_io_concurrency = 50;  
EXCEPTION WHEN invalid_parameter_value THEN  
END $$;  
set work_mem='64kB';  --set small work mem to force lossy pages  
explain (costs off)  
        select count(*) from tenk1, tenk2 where tenk1.hundred > 1 and tenk2.thousand=0;  
                         QUERY PLAN                           
------------------------------------------------------------  
 Aggregate  
   ->  Nested Loop  
         ->  Seq Scan on tenk2  
               Filter: (thousand = 0)  
         ->  Gather  
               Workers Planned: 4  
               ->  Parallel Bitmap Heap Scan on tenk1  
                     Recheck Cond: (hundred > 1)  
                     ->  Bitmap Index Scan on tenk1_hundred  
                           Index Cond: (hundred > 1)  
(10 rows)  
  
select count(*) from tenk1, tenk2 where tenk1.hundred > 1 and tenk2.thousand=0;  
 count   
-------  
 98000  
(1 row)  
  
create table bmscantest (a int, t text);  
insert into bmscantest select r, 'fooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo' FROM generate_series(1,100000) r;  
create index i_bmtest ON bmscantest(a);  
select count(*) from bmscantest where a>1;  
 count   
-------  
 99999  
(1 row)  
  
reset enable_seqscan;  
reset enable_indexscan;  
reset enable_hashjoin;  
reset enable_mergejoin;  
reset enable_material;  
reset effective_io_concurrency;  
reset work_mem;  
drop table bmscantest;  
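The DO block above swallows invalid_parameter_value so that the regression test still passes on platforms without posix_fadvise. The same pattern can be used on its own to probe whether the current build supports prefetching; a small sketch:

DO $$
BEGIN
  SET effective_io_concurrency = 10;   -- any non-zero value
  RAISE NOTICE 'prefetch supported: non-zero value accepted';
EXCEPTION WHEN invalid_parameter_value THEN
  RAISE NOTICE 'prefetch not supported on this platform';
END $$;
RESET effective_io_concurrency;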

Test
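The session below queries a table called ptest whose DDL is not shown in the article. Judging from the plans (Output: id, info; index idx_ptest1; rows=9999000 for id > 1000), a comparable table can be built roughly as follows; the md5 filler and the exact row count of 10 million are assumptions:

create table ptest (id int, info text);
insert into ptest select g, md5(g::text) from generate_series(1, 10000000) g;
create index idx_ptest1 on ptest (id);
vacuum analyze ptest;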

postgres=# select pg_backend_pid();  
 pg_backend_pid   
----------------  
          42328  
(1 row)  
  
postgres=# set enable_seqscan=off;  
SET  
postgres=# set enable_indexscan=off;  
SET  
postgres=# set work_mem ='64kB';  
SET  
postgres=# set max_parallel_workers_per_gather =0;  
SET  
  
  
postgres=# explain (analyze,verbose,timing,costs,buffers) select count(*) from ptest where id>1000 ;  
                                                                  QUERY PLAN                                                                     
-----------------------------------------------------------------------------------------------------------------------------------------------  
 Aggregate  (cost=338227.11..338227.12 rows=1 width=8) (actual time=2634.805..2634.806 rows=1 loops=1)  
   Output: count(*)  
   Buffers: shared hit=110649  
   ->  Bitmap Heap Scan on public.ptest  (cost=104910.47..313230.01 rows=9998843 width=0) (actual time=418.257..1735.802 rows=9999000 loops=1)  
         Output: id, info  
         Recheck Cond: (ptest.id > 1000)  
         Rows Removed by Index Recheck: 40  
         Heap Blocks: exact=408 lossy=82918  
         Buffers: shared hit=110649  
         ->  Bitmap Index Scan on idx_ptest1  (cost=0.00..102410.76 rows=9998843 width=0) (actual time=418.158..418.158 rows=9999000 loops=1)  
               Index Cond: (ptest.id > 1000)  
               Buffers: shared hit=27323  
 Planning time: 0.123 ms  
 Execution time: 2634.858 ms  
(14 rows)  
  
postgres=# set effective_io_concurrency =0;  
SET  
postgres=# explain (analyze,verbose,timing,costs,buffers) select count(*) from ptest where id>1000 ;  
                                                                  QUERY PLAN                                                                     
-----------------------------------------------------------------------------------------------------------------------------------------------  
 Aggregate  (cost=338227.11..338227.12 rows=1 width=8) (actual time=2619.868..2619.869 rows=1 loops=1)  
   Output: count(*)  
   Buffers: shared hit=110649  
   ->  Bitmap Heap Scan on public.ptest  (cost=104910.47..313230.01 rows=9998843 width=0) (actual time=418.974..1721.578 rows=9999000 loops=1)  
         Output: id, info  
         Recheck Cond: (ptest.id > 1000)  
         Rows Removed by Index Recheck: 40  
         Heap Blocks: exact=408 lossy=82918  
         Buffers: shared hit=110649  
         ->  Bitmap Index Scan on idx_ptest1  (cost=0.00..102410.76 rows=9998843 width=0) (actual time=418.873..418.873 rows=9999000 loops=1)  
               Index Cond: (ptest.id > 1000)  
               Buffers: shared hit=27323  
 Planning time: 0.155 ms  
 Execution time: 2619.930 ms  
(14 rows)  
  
postgres=# set effective_io_concurrency =1000;  
SET  
postgres=# explain (analyze,verbose,timing,costs,buffers) select count(*) from ptest where id>1000 ;  
                                                                  QUERY PLAN                                                                     
-----------------------------------------------------------------------------------------------------------------------------------------------  
 Aggregate  (cost=338227.11..338227.12 rows=1 width=8) (actual time=2661.584..2661.585 rows=1 loops=1)  
   Output: count(*)  
   Buffers: shared hit=110649  
   ->  Bitmap Heap Scan on public.ptest  (cost=104910.47..313230.01 rows=9998843 width=0) (actual time=417.591..1761.522 rows=9999000 loops=1)  
         Output: id, info  
         Recheck Cond: (ptest.id > 1000)  
         Rows Removed by Index Recheck: 40  
         Heap Blocks: exact=408 lossy=82918  
         Buffers: shared hit=110649  
         ->  Bitmap Index Scan on idx_ptest1  (cost=0.00..102410.76 rows=9998843 width=0) (actual time=417.473..417.473 rows=9999000 loops=1)  
               Index Cond: (ptest.id > 1000)  
               Buffers: shared hit=27323  
 Planning time: 0.113 ms  
 Execution time: 2661.632 ms  
(14 rows)  
  
  
postgres=# set max_parallel_workers_per_gather =4;  
SET  
postgres=# set effective_io_concurrency =1000;  
SET  
postgres=# explain (analyze,verbose,timing,costs,buffers) select count(*) from ptest where id>1000 ;  
                                                                            QUERY PLAN                                                                               
-------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 Finalize Aggregate  (cost=226740.55..226740.56 rows=1 width=8) (actual time=1017.452..1017.452 rows=1 loops=1)  
   Output: count(*)  
   Buffers: shared hit=111309  
   ->  Gather  (cost=226740.13..226740.54 rows=4 width=8) (actual time=1017.350..1017.447 rows=5 loops=1)  
         Output: (PARTIAL count(*))  
         Workers Planned: 4  
         Workers Launched: 4  
         Buffers: shared hit=111309  
         ->  Partial Aggregate  (cost=225740.13..225740.14 rows=1 width=8) (actual time=1014.227..1014.227 rows=1 loops=5)  
               Output: PARTIAL count(*)  
               Buffers: shared hit=110649  
               Worker 0: actual time=1013.492..1013.492 rows=1 loops=1  
                 Buffers: shared hit=16355  
               Worker 1: actual time=1013.497..1013.497 rows=1 loops=1  
                 Buffers: shared hit=16269  
               Worker 2: actual time=1013.510..1013.510 rows=1 loops=1  
                 Buffers: shared hit=16333  
               Worker 3: actual time=1013.501..1013.501 rows=1 loops=1  
                 Buffers: shared hit=16268  
               ->  Parallel Bitmap Heap Scan on public.ptest  (cost=104910.47..219490.85 rows=2499711 width=0) (actual time=418.784..840.764 rows=1999800 loops=5)  
                     Recheck Cond: (ptest.id > 1000)  
                     Rows Removed by Index Recheck: 8  
                     Heap Blocks: exact=92 lossy=18009  
                     Buffers: shared hit=110649  
                     Worker 0: actual time=417.217..844.848 rows=1962600 loops=1  
                       Buffers: shared hit=16355  
                     Worker 1: actual time=422.223..844.926 rows=1952280 loops=1  
                       Buffers: shared hit=16269  
                     Worker 2: actual time=417.216..844.028 rows=1959960 loops=1  
                       Buffers: shared hit=16333  
                     Worker 3: actual time=417.216..840.404 rows=1952080 loops=1  
                       Buffers: shared hit=16268  
                     ->  Bitmap Index Scan on idx_ptest1  (cost=0.00..102410.76 rows=9998843 width=0) (actual time=419.867..419.867 rows=9999000 loops=1)  
                           Index Cond: (ptest.id > 1000)  
                           Buffers: shared hit=27323  
 Planning time: 0.121 ms  
 Execution time: 1027.083 ms  
(37 rows)  
postgres=# set effective_io_concurrency =1;  
SET  
postgres=# explain (analyze,verbose,timing,costs,buffers) select count(*) from ptest where id>1000 ;  
                                                                            QUERY PLAN                                                                               
-------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 Finalize Aggregate  (cost=226740.55..226740.56 rows=1 width=8) (actual time=1006.182..1006.182 rows=1 loops=1)  
   Output: count(*)  
   Buffers: shared hit=111309  
   ->  Gather  (cost=226740.13..226740.54 rows=4 width=8) (actual time=1006.073..1006.177 rows=5 loops=1)  
         Output: (PARTIAL count(*))  
         Workers Planned: 4  
         Workers Launched: 4  
         Buffers: shared hit=111309  
         ->  Partial Aggregate  (cost=225740.13..225740.14 rows=1 width=8) (actual time=1003.124..1003.124 rows=1 loops=5)  
               Output: PARTIAL count(*)  
               Buffers: shared hit=110649  
               Worker 0: actual time=1002.447..1002.447 rows=1 loops=1  
                 Buffers: shared hit=16408  
               Worker 1: actual time=1002.441..1002.441 rows=1 loops=1  
                 Buffers: shared hit=16312  
               Worker 2: actual time=1002.446..1002.446 rows=1 loops=1  
                 Buffers: shared hit=16093  
               Worker 3: actual time=1002.436..1002.436 rows=1 loops=1  
                 Buffers: shared hit=16287  
               ->  Parallel Bitmap Heap Scan on public.ptest  (cost=104910.47..219490.85 rows=2499711 width=0) (actual time=413.650..828.524 rows=1999800 loops=5)  
                     Recheck Cond: (ptest.id > 1000)  
                     Rows Removed by Index Recheck: 8  
                     Heap Blocks: exact=85 lossy=18141  
                     Buffers: shared hit=110649  
                     Worker 0: actual time=412.974..833.142 rows=1968960 loops=1  
                       Buffers: shared hit=16408  
                     Worker 1: actual time=412.994..831.530 rows=1957440 loops=1  
                       Buffers: shared hit=16312  
                     Worker 2: actual time=412.980..831.353 rows=1931160 loops=1  
                       Buffers: shared hit=16093  
                     Worker 3: actual time=412.977..828.009 rows=1954440 loops=1  
                       Buffers: shared hit=16287  
                     ->  Bitmap Index Scan on idx_ptest1  (cost=0.00..102410.76 rows=9998843 width=0) (actual time=416.164..416.164 rows=9999000 loops=1)  
                           Index Cond: (ptest.id > 1000)  
                           Buffers: shared hit=27323  
 Planning time: 0.124 ms  
 Execution time: 1015.976 ms  
(37 rows)  
  
postgres=# set effective_io_concurrency =0;  
SET  
postgres=# explain (analyze,verbose,timing,costs,buffers) select count(*) from ptest where id>1000 ;  
                                                                            QUERY PLAN                                                                               
-------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 Finalize Aggregate  (cost=226740.55..226740.56 rows=1 width=8) (actual time=943.288..943.288 rows=1 loops=1)  
   Output: count(*)  
   Buffers: shared hit=111309  
   ->  Gather  (cost=226740.13..226740.54 rows=4 width=8) (actual time=943.170..943.282 rows=5 loops=1)  
         Output: (PARTIAL count(*))  
         Workers Planned: 4  
         Workers Launched: 4  
         Buffers: shared hit=111309  
         ->  Partial Aggregate  (cost=225740.13..225740.14 rows=1 width=8) (actual time=940.060..940.060 rows=1 loops=5)  
               Output: PARTIAL count(*)  
               Buffers: shared hit=110649  
               Worker 0: actual time=939.276..939.276 rows=1 loops=1  
                 Buffers: shared hit=16286  
               Worker 1: actual time=939.387..939.388 rows=1 loops=1  
                 Buffers: shared hit=16303  
               Worker 2: actual time=939.389..939.390 rows=1 loops=1  
                 Buffers: shared hit=16291  
               Worker 3: actual time=939.388..939.389 rows=1 loops=1  
                 Buffers: shared hit=15943  
               ->  Parallel Bitmap Heap Scan on public.ptest  (cost=104910.47..219490.85 rows=2499711 width=0) (actual time=419.160..766.586 rows=1999800 loops=5)  
                     Recheck Cond: (ptest.id > 1000)  
                     Rows Removed by Index Recheck: 8  
                     Heap Blocks: exact=92 lossy=18411  
                     Buffers: shared hit=110649  
                     Worker 0: actual time=418.404..771.100 rows=1954320 loops=1  
                       Buffers: shared hit=16286  
                     Worker 1: actual time=418.487..770.127 rows=1956360 loops=1  
                       Buffers: shared hit=16303  
                     Worker 2: actual time=418.491..770.200 rows=1954920 loops=1  
                       Buffers: shared hit=16291  
                     Worker 3: actual time=418.513..768.309 rows=1913160 loops=1  
                       Buffers: shared hit=15943  
                     ->  Bitmap Index Scan on idx_ptest1  (cost=0.00..102410.76 rows=9998843 width=0) (actual time=421.750..421.750 rows=9999000 loops=1)  
                           Index Cond: (ptest.id > 1000)  
                           Buffers: shared hit=27323  
 Planning time: 0.169 ms  
 Execution time: 952.028 ms  
(37 rows)  

On this SSD-backed instance the serial runs all took about 2.6 seconds whether effective_io_concurrency was left at the default, set to 0, or set to 1000, and the parallel runs were actually fastest with it set to 0 (about 952 ms versus roughly 1016 ms at 1 and 1027 ms at 1000). That is what the suggestion above to disable asynchronous I/O on SSD is based on.

Each run was also traced with perf record -ag -p $pid, but this build was compiled without the flags needed for useful profiles (-g -ggdb -fno-omit-frame-pointer), so the traces did not show anything conclusive. Readers can repeat the test and profile it themselves.

References

《PostgreSQL on Linux 最佳部署手册》

《Linux 性能诊断 perf使用指南》

《PostgreSQL 源码性能诊断(perf profiling)指南》

https://www.postgresql.org/docs/devel/static/runtime-config-resource.html

src/backend/executor/nodeBitmapHeapscan.c
