PostgreSQL中1000分区的继承分区表更新失败的原因

本文涉及的产品
云原生数据库 PolarDB MySQL 版,Serverless 5000PCU 100GB
简介: 之前的一篇博客《PostgreSQL分区表的性能损耗验证》中,遇到1000分区的分区表更新和删除都执行失败(问题3)的问题。经过简单的调查发现原因竟然是OOM导致进程被杀。以下是相关错误消息 1.
之前的一篇博客 《PostgreSQL分区表的性能损耗验证》中,遇到1000分区的分区表更新和删除都执行失败(问题3)的问题。经过简单的调查发现原因竟然是OOM导致进程被杀。以下是相关错误消息

1.错误现象

命令行( SQL执行失败):

点击(此处)折叠或打开

  1. -bash-4.1$ pgbench -n -r -c 10 -j 10 -T 2 -f update.sql db1000
  2. Client 2 aborted in state 1. Probably the backend died while processing.
  3. WARNING: terminating connection because of crash of another server process
  4. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  5. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  6. WARNING: terminating connection because of crash of another server process
  7. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  8. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  9. WARNING: terminating connection because of crash of another server process
  10. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  11. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  12. WARNING: terminating connection because of crash of another server process
  13. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  14. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  15. WARNING: terminating connection because of crash of another server process
  16. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  17. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  18. WARNING: terminating connection because of crash of another server process
  19. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  20. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  21. WARNING: terminating connection because of crash of another server process
  22. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  23. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  24. WARNING: terminating connection because of crash of another server process
  25. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  26. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  27. WARNING: terminating connection because of crash of another server process
  28. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  29. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  30. Client 3 aborted in state 1. Probably the backend died while processing.
  31. Client 0 aborted in state 1. Probably the backend died while processing.
  32. Client 9 aborted in state 1. Probably the backend died while processing.
  33. Client 8 aborted in state 1. Probably the backend died while processing.
  34. Client 7 aborted in state 1. Probably the backend died while processing.
  35. Client 1 aborted in state 1. Probably the backend died while processing.
  36. Client 4 aborted in state 1. Probably the backend died while processing.
  37. Client 6 aborted in state 1. Probably the backend died while processing.
  38. Client 5 aborted in state 1. Probably the backend died while processing.
  39. transaction type: Custom query
  40. scaling factor: 1
  41. query mode: simple
  42. number of clients: 10
  43. number of threads: 10
  44. duration: 2 s
  45. number of transactions actually processed: 0
  46. tps = 0.000000 (including connections establishing)
  47. tps = 0.000000 (excluding connections establishing)
  48. statement latencies in milliseconds:
  49.     0.271800    \setrandom id 1 10000000
  50.     0.000000    update maintb set name = 'aaaaa12345' where id=:id;

pg服务端日志(服务端进程被杀并重启):

点击(此处)折叠或打开

  1. LOG: server process (PID 18565) was terminated by signal 9: Killed
  2. DETAIL: Failed process was running: update maintb set name = 'aaaaa12345' where id=150458;
  3. LOG: terminating any other active server processes
  4. WARNING: terminating connection because of crash of another server process
  5. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  6. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  7. WARNING: terminating connection because of crash of another server process
  8. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  9. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  10. WARNING: terminating connection because of crash of another server process
  11. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  12. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  13. WARNING: terminating connection because of crash of another server process
  14. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  15. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  16. WARNING: terminating connection because of crash of another server process
  17. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  18. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  19. WARNING: terminating connection because of crash of another server process
  20. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  21. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  22. WARNING: terminating connection because of crash of another server process
  23. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  24. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  25. WARNING: terminating connection because of crash of another server process
  26. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  27. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  28. WARNING: terminating connection because of crash of another server process
  29. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  30. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  31. WARNING: terminating connection because of crash of another server process
  32. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  33. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  34. WARNING: terminating connection because of crash of another server process
  35. DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
  36. HINT: In a moment you should be able to reconnect to the database and repeat your command.
  37. LOG: all server processes terminated; reinitializing
  38. LOG: database system was interrupted; last known up at 2015-01-16 14:18:26 CST
  39. LOG: database system was not properly shut down; automatic recovery in progress
  40. LOG: redo starts at 3/4A153AC8
  41. LOG: record with zero length at 3/4AD01498
  42. LOG: redo done at 3/4AD01468
  43. LOG: last completed transaction was at log time 2015-01-16 14:35:11.990522+08
  44. LOG: database system is ready to accept connections
  45. LOG: autovacuum launcher started

/var/log/messages(OOM):

点击(此处)折叠或打开

  1. Jan 16 14:35:40 hanode1 kernel: Out of memory: Kill process 18565 (postgres) score 94 or sacrifice child
  2. Jan 16 14:35:40 hanode1 kernel: Killed process 18565, UID 26, (postgres) total-vm:608864kB, anon-rss:150480kB, file-rss:3180kB

2.PG进程究竟分配了多少内存?

通过debug PG9.4的代码,发现生成分区表的1个update语句的执行计划居然耗费了717M的内存。

sql:

点击(此处)折叠或打开

  1. db1000=# update maintb set name='aaaaa12345' where id =3;
  2. UPDATE 19

src/backend/tcop/postgres.c:

点击(此处)折叠或打开

  1. /*
  2.          * OK to analyze, rewrite, and plan this query.
  3.          *
  4.          * Switch to appropriate context for constructing querytrees (again,
  5.          * these must outlive the execution context).
  6.          */
  7.         oldcontext = MemoryContextSwitchTo(MessageContext);

  8.         querytree_list = pg_analyze_and_rewrite(parsetree, query_string,
  9.                                                 NULL, 0);

  10.         plantree_list = pg_plan_queries(querytree_list, 0, NULL);

  11.         /* Done with the snapshot used for parsing/planning */
  12.         if (snapshot_set)
  13.             PopActiveSnapshot();

在上面的pg_plan_queries()调用前后分别在gdb中执行“call MemoryContextStats(MessageContext)”,从输出结果可以看出 pg_plan_queries分配了717M内存。

点击(此处)折叠或打开

  1. MessageContext: 16384 total in 2 blocks; 9976 free (9 chunks); 6408 used
  2. MessageContext: 721420288 total in 96 blocks; 3715776 free (13 chunks); 717704512 used

1个客户端700多M ,10个客户端 并行执行就是7个G ,我的VM只有2G 内存,难怪被kill掉


3.问题在哪里?

同样的方法,测量了另外两种情况的内存消耗。
100个分区的update
sql:

点击(此处)折叠或打开

  1. db100=# update maintb set name='aaaaa12345' where id =3;
  2. UPDATE 8

生成执行计划消耗的内存:8M

点击(此处)折叠或打开

  1. MessageContext: 24576 total in 2 blocks; 18168 free (9 chunks); 6408 used
  2. MessageContext: 8380416 total in 10 blocks; 5056 free (3 chunks); 8375360 used

从8M到700M,不难看出:分区表update的执行计划消耗内存与分区数目n是n^2的倍数关系。

进一步细化,发现关键点在下面两处内存分配。
src/backend/optimizer/plan/planner.c

点击(此处)折叠或打开

  1. static Plan *
  2. inheritance_planner(PlannerInfo *root)
  3. {
  4. ...
  5.     foreach(lc, root->append_rel_list)//循环1001次
  6.     {
  7.         AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc);
  8.         PlannerInfo subroot;
  9.         Plan     *subplan;
  10.         Index        rti;

  11.         /* append_rel_list contains all append rels; ignore others */
  12.         if (appinfo->parent_relid != parentRTindex)
  13.             continue;

  14.         /*
  15.          * We need a working copy of the PlannerInfo so that we can control
  16.          * propagation of information back to the main copy.
  17.          */
  18.         memcpy(&subroot, root, sizeof(PlannerInfo));

  19.         /*
  20.          * Generate modified query with this rel as target. We first apply
  21.          * adjust_appendrel_attrs, which copies the Query and changes
  22.          * references to the parent RTE to refer to the current child RTE,
  23.          * then fool around with subquery RTEs.
  24.          */
  25.         subroot.parse = (Query *)
  26.             adjust_appendrel_attrs(root,
  27.                                  (Node *) parse,
  28.                                  appinfo);//大约分配了300K

  29.         /*
  30.          * The rowMarks list might contain references to subquery RTEs, so
  31.          * make a copy that we can apply ChangeVarNodes to. (Fortunately, the
  32.          * executor doesn't need to see the modified copies --- we can just
  33.          * pass it the original rowMarks list.)
  34.          */
  35.         subroot.rowMarks = (List *) copyObject(root->rowMarks);

  36.         /*
  37.          * The append_rel_list likewise might contain references to subquery
  38.          * RTEs (if any subqueries were flattenable UNION ALLs). So prepare
  39.          * to apply ChangeVarNodes to that, too.
  40.          */
  41.         subroot.append_rel_list = (List *) copyObject(root->append_rel_list);//大约分配了400K

  42.         /*
  43.          * Add placeholders to the child Query's rangetable list to fill the
  44.          * RT indexes already reserved for subqueries in previous children.
  45.          * These won't be referenced, so there's no need to make them very
  46.          * valid-looking.
  47.          */
  48.         while (list_length(subroot.parse->rtable) list_length(final_rtable))
  49.             subroot.parse->rtable = lappend(subroot.parse->rtable,
  50.                                             makeNode(RangeTblEntry));
  51. ...
  52. }

4. select为什么没有问题?

select和update走的是不同的分支,没有走上面的inheritance_planner()函数,内存消耗只有7M。

1000个分区的select
sql:

点击(此处)折叠或打开

  1. db1000=# select * from maintb where id =3;
  2.  id | name
  3. ----+------------
  4.   3 | aaaaa12345
  5.   3 | aaaaa12345
  6.   3 | aaaaa12345
  7.   3 | aaaaa12345
  8.   3 | aaaaa12345
  9.   3 | aaaaa12345
  10.   3 | aaaaa12345
  11.   3 | aaaaa12345
  12.   3 | aaaaa12345
  13.   3 | aaaaa12345
  14.   3 | aaaaa12345
  15.   3 | aaaaa12345
  16.   3 | aaaaa12345
  17.   3 | aaaaa12345
  18.   3 | aaaaa12345
  19.   3 | aaaaa12345
  20.   3 | aaaaa12345
  21.   3 | aaaaa12345
  22.   3 | aaaaa12345
  23. (19 rows)

生成 执行计划消耗的内存:7M

点击(此处)折叠或打开

  1. MessageContext: 8192 total in 1 blocks; 1864 free (1 chunks); 6328 used
  2. MessageContext: 8388608 total in 11 blocks; 1176776 free (6 chunks); 7211832 used

update的调用栈:
点击( 此处 )折叠或打开
  1. (gdb) bt
  2. #0 inheritance_planner (root=0x1fd6548) at planner.c:787
  3. #1 0x00000000005fe7c0 in subquery_planner (glob=0x1fd64b8, parse=0x1fd59e0, parent_root=value optimized out>,
  4.     hasRecursion=value optimized out>, tuple_fraction=0, subroot=0x7fffbf4c70b8) at planner.c:569
  5. #2 0x00000000005fe935 in standard_planner (parse=0x1fd59e0, cursorOptions=0, boundParams=0x0) at planner.c:210
  6. #3 0x00000000006784fa in pg_plan_query (querytree=value optimized out>, cursorOptions=value optimized out>,
  7.     boundParams=value optimized out>) at postgres.c:750
  8. #4 0x00000000006785e4 in pg_plan_queries (querytrees=value optimized out>, cursorOptions=0, boundParams=0x0)
  9.     at postgres.c:809
  10. #5 0x0000000000678bcc in exec_simple_query (
  11.     query_string=0x1fd4b30 "update maintb set name='aaaaa12345' where id =3;") at postgres.c:974
  12. #6 0x000000000067a179 in PostgresMain (argc=value optimized out>, argv=value optimized out>,
  13.     dbname=0x1f70fb0 "db1000", username=value optimized out>) at postgres.c:4016
  14. #7 0x00000000006283cb in BackendRun (argc=value optimized out>, argv=value optimized out>) at postmaster.c:4123
  15. #8 BackendStartup (argc=value optimized out>, argv=value optimized out>) at postmaster.c:3797
  16. #9 ServerLoop (argc=value optimized out>, argv=value optimized out>) at postmaster.c:1576
  17. #10 PostmasterMain (argc=value optimized out>, argv=value optimized out>) at postmaster.c:1223
  18. #11 0x00000000005c0cd8 in main (argc=3, argv=0x1f6fa10) at main.c:227

select的调用栈:

点击(此处)折叠或打开

  1. (gdb) bt
  2. #0 grouping_planner (root=0x1fd6568, tuple_fraction=0) at planner.c:1080
  3. #1 0x00000000005fe5b4 in subquery_planner (glob=0x1fd64d8, parse=0x1fd5b20, parent_root=value optimized out>,
  4.     hasRecursion=value optimized out>, tuple_fraction=0, subroot=0x7fffbf4c70b8) at planner.c:572
  5. #2 0x00000000005fe935 in standard_planner (parse=0x1fd5b20, cursorOptions=0, boundParams=0x0) at planner.c:210
  6. #3 0x00000000006784fa in pg_plan_query (querytree=value optimized out>, cursorOptions=value optimized out>,
  7.     boundParams=value optimized out>) at postgres.c:750
  8. #4 0x00000000006785e4 in pg_plan_queries (querytrees=value optimized out>, cursorOptions=0, boundParams=0x0)
  9.     at postgres.c:809
  10. #5 0x0000000000678bcc in exec_simple_query (query_string=0x1fd4b30 "select * from maintb where id =3;")
  11.     at postgres.c:974
  12. #6 0x000000000067a179 in PostgresMain (argc=value optimized out>, argv=value optimized out>,
  13.     dbname=0x1f70fb0 "db1000", username=value optimized out>) at postgres.c:4016
  14. #7 0x00000000006283cb in BackendRun (argc=value optimized out>, argv=value optimized out>) at postmaster.c:4123
  15. #8 BackendStartup (argc=value optimized out>, argv=value optimized out>) at postmaster.c:3797
  16. #9 ServerLoop (argc=value optimized out>, argv=value optimized out>) at postmaster.c:1576
  17. #10 PostmasterMain (argc=value optimized out>, argv=value optimized out>) at postmaster.c:1223
  18. #11 0x00000000005c0cd8 in main (argc=3, argv=0x1f6fa10) at main.c:227

5.最后

按目前PG对分区表的处理逻辑,确实处理不了很多分区的情况。(但是update的那2处内存分配应该是可以优化的)。

6.参考

http://blog.chinaunix.net/xmlrpc.php?r=blog/article&uid=20726500&id=4732253
相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
相关文章
|
1月前
|
JSON 关系型数据库 分布式数据库
PolarDB常见问题之PolarDB8.0.1使用冷热混合分区失败如何解决
PolarDB是阿里云推出的下一代关系型数据库,具有高性能、高可用性和弹性伸缩能力,适用于大规模数据处理场景。本汇总囊括了PolarDB使用中用户可能遭遇的一系列常见问题及解答,旨在为数据库管理员和开发者提供全面的问题指导,确保数据库平稳运行和优化使用体验。
|
6月前
|
关系型数据库 数据管理 Go
《PostgreSQL数据分区:原理与实战》
《PostgreSQL数据分区:原理与实战》
85 0
|
6月前
|
SQL 监控 关系型数据库
PostgreSQL普通表转换成分区表
如何使用pg_rewrite扩展将普遍表转换成分区表
276 0
|
2月前
|
监控 负载均衡 关系型数据库
MySQL技能完整学习列表13、MySQL高级特性——1、分区表(Partitioning)——2、复制(Replication)——3、集群(Clustering)
MySQL技能完整学习列表13、MySQL高级特性——1、分区表(Partitioning)——2、复制(Replication)——3、集群(Clustering)
53 0
|
9月前
|
传感器 关系型数据库 MySQL
php语句:MySQL指定分区表跨分区根据时间条件快速查询记录的封装函数
php语句:MySQL指定分区表跨分区根据时间条件快速查询记录的封装函数
98 0
|
6月前
|
存储 关系型数据库 MySQL
MySQL分区表详解
在我们日常处理海量数据的过程中,如何有效管理和优化数据库一直是一个既重要又具有挑战性的问题
55 0
|
8月前
|
存储 关系型数据库 MySQL
【MySQL】MySQL分区表详解
【MySQL】MySQL分区表详解
174 0
|
9月前
|
SQL Cloud Native 算法
PolarDB-X 分区建列类型变更
作为一款分布式数据库,变更表的列类型,无论变更的是否是分区键,都需要保证各个分片以及元数据的一致性,因此对于非分区键的列类型变更也不只是简简单单的下推执行就可以的,后续会有一篇文章做详细的阐述,本文主要阐述的是如何对分区键的列类型做变更。
|
9月前
|
存储 SQL 运维
PolarDB MySQL大表实践-分区表篇
背景:分区表到底是什么?分区作为传统企业级数据库的特性,早已经在很多大数据和数仓场景中得到广泛应用。基于维基百科的解释,分区是将逻辑数据库或其组成元素如表、表空间等划分为不同的独立部分。数据库分区通常是出于可管理性、性能或可用性的原因,或者是为了负载平衡。它在分布式数据库管理系统中很流行,其中每个分区可能分布在多个节点上,节点上的用户在分区上执行本地事务。这提高了具有涉及某些数据视图的常规事务的站
334 0
PolarDB MySQL大表实践-分区表篇
|
9月前
|
SQL 存储 分布式数据库
实践教程之PolarDB-X分区管理
PolarDB-X 为了方便用户体验,提供了免费的实验环境,您可以在实验环境里体验 PolarDB-X 的安装部署和各种内核特性。除了免费的实验,PolarDB-X 也提供免费的视频课程,手把手教你玩转 PolarDB-X 分布式数据库。本期实验将指导您如何进行PolarDB-X分区管理。