并发管理器的工作原理 和 cmclean.sql的潜在危险-阿里云开发者社区

开发者社区> 长烟慢慢> 正文

并发管理器的工作原理 和 cmclean.sql的潜在危险

简介: DBAs usually use the script to clean up stale data from concurrent processing tables (FND_CONCURRENT_%) after incidents like a crash of the database or concurrent processing node.
+关注继续查看

DBAs usually use the script to clean up stale data from concurrent processing tables (FND_CONCURRENT_%) after incidents like a crash of the database or concurrent processing node. This script sets correct completion phase and status codes for terminated concurrent requests and sets correct control codes for terminated concurrent manager processes. Despite the assuring “Non Destructive” claim in the title of the MOS Article there is a possibility to lose concurrent request schedules when cmclean.sql is executed.

First of all it’s important to understand how scheduled concurrent requests are executed and resubmitted. A simplified process of the execution is:

  1. Concurrent manager process (e.g. FNDLIBR in case of Standard Manager) queries the FND_CONCURRENT_REQUESTS table for pending requests.
  2. When a pending request is found, the manager process updates the PHASE_CODE=R (Running) and STATUS_CODE=R (Running).
  3. The next step is to start the executable of the concurrent program. If it’s a PL/SQL procedure – FNDLIBR  connects to the DB and executes the PL/SQL code, if it’s a java program – FNDLIBR starts up a java process to execute the java class, etc.
  4. FNDLIBR catches the exit codes from the executable of the concurrent program and updates the statuses in FND_CONCURRENT_REQUESTS accordingly – PHASE_CODE=C (Completed) and STATUS_CODE = C (Normal), G (Warning) or E (Error).
  5. FNDLIBR checks if the concurrent request has a schedule and needs to be resubmitted. If yes – it resubmits a new concurrent request with the same parameters.

But what happens if the FNDLIBR process crashes, terminates or gets killed while it’s running a concurrent request? Who takes care of the statuses in FND_CONCURRENT_REQUESTS table and how the request is resubmitted if the concurrent manager process is not there anymore?

It appears the Internal Concurrent Manager (ICM) takes care of these tasks. It checks the running requests periodically (every two minutes by default) and if it finds any that are missing the concurrent manager process and the DB session, it updates the statuses for the concurrent request and also resubmits it if it has a schedule. This action is followed by a log entry in the ICM log file:

1
2
3
4
5
6
7
8
9
10
                   Process monitor session started : 17-JUL-2013 04:24:24
 
Found running request 5829148 attached to dead manager process.
Setting request status to completed.
 
Found dead process: spid=(15160), cpid=(2032540), ORA pid=(35), manager=(0/0)
 
Starting STANDARD Concurrent Manager               : 17-JUL-2013 04:24:25
 
                     Process monitor session ended : 17-JUL-2013 04:24:25

Interesting to note, if the Internal Concurrent Manager is terminated at the same time with the manager process and is restarted later by the reviver process or by running “adcmctl.sh start” manually, the ICM performs the same check of running requests as part of the startup sequence, but this time it restarts the request instead of terminating and resubmitting it. The log of the ICM contains the following lines:

1
2
Found running request 5829146 attached to dead manager process.
Attempting to restart request.

The concurrent request is started again with exactly the same request_id as the previous time it was terminated, and the log file of the request will contain information from 2 executions – the 1st which didn’t complete and then the 2nd which probably completed. I think this scenario is very confusing and instead of restarting the request it should better be terminated and a new one should be submitted.

Let’s get back to the problem with cmclean.sql! The worst thing that can be done is running cmclean.sqlafter the crash of the concurrent processing node before starting up the concurrent managers. Why? Because cmclean.sql cleans up data in FND_CONCURRENT_REQUESTS by executing one simple update statement to change the phase and status of any “Running” or “Terminating” request to “Completed/Error”:

1
2
3
UPDATE fnd_concurrent_requests
SET phase_code = 'C', status_code = 'E'
WHERE status_code ='T' OR phase_code = 'R';

Cmclean.sql does not resubmit the request if it has a schedule. Execute it and you risk to lose some scheduled programs without any warning.

Similarly – never run cmclean.sql if you stopped the concurrent managers using “adcmctl.sh abort” or “kill -9” on concurrent manager processes to speed up the shutdown procedure. There’s the same risk to lose some scheduled requests.

Despite the risks, cmclean.sql is still a useful tool in case concurrent managers don’t come up after a failure or there are some stale data that is otherwise not cleaned up. But please, be careful when you run it! Check closely the list of requests reported in the following section of the outputs from cmclean.sql, because these requests have to be resubmitted manually if they had schedules.

1
2
3
4
5
6
7
8
9
-- Updating any Running or Terminating requests to Completed/Error
 
Request ID Phase  Status
---------- ------ ------
6607       R      W
6700       R      W
893534056  R      R
 
3 rows updated.

“Concurrent Manager Recovery” wizard is even worse! (Added on Jul 21, 2013)

After posting this article I started thinking about whether the “Concurrent Manager Recovery” Wizard available from Oracle Applications Manager in e-Business Suite was any better then cmclean.sql or not. As I didn’t have much experience with it I decided to give it a try. This is what I did:

  1. I scheduled 2 concurrent programs (“CP Java Regression Test” and “CP PLSQL Regression Test”) to restart in 1 minute after the previous execution completes. These are simple test concurrent programs which sleep for some time and then complete.
  2. I made sure both programs were running and terminated all concurrent manager process and DB sessions for these concurrent programs.
  3. The termination of the processes and sessions left the rows in FND_CONCURRENT_REQUESTS with PHASE_CODE=R and STATUS_CODE=R
  4. I executed the “Concurrent Manager Recovery” wizard which fixed the status codes of the concurrent manager processes, but didn’t touch the statuses of the concurrent requests – I thought this was a good thing (I expected the ICM to clean up the statuses and resubmit the requests at its startup phase)
  5. I started up the concurrent managers, but ICM didn’t clean up the 2 stale records in FND_CONCURRENT_REQUESTS table. The 2 requests appeared as they would be running, while in fact they didn’t have any OS processes or DB sessions.

I didn’t have much time to look into the details, but it looks like the ICM is only cleaning up requests attached to dead managers (“Active” status in the FND_CONCURRENT_PROCESSES table and no OS processes running). Here, the Wizard updated the statuses of the manager processes as if they completed normally, so the ICM couldn’t identify them as being “dead”.
This actually means that the “Concurrent Manager Recovery” wizard can cause serious issues too – it doesn’t clear up the concurrent_request statuses and it prevents ICM from doing it too, so once we start up the system the terminated requests appear as if they were running. And because of this, the Conflict Resolution Manager might prevent execution of some other programs with the incompatibility rules against the terminated requests. You will need to stop the managers and run cmclean.sql to fix the statuses (and loose the schedules) to get out of this situation.

So what should we do to clean up the concurrent processing tables after crashes or cloning? (Added on Jul 21, 2013)

It appears to me that no reliable way exists to clean up the tables properly. The cmclean.sql can remove some schedules without warning. The “Concurrent Manager Recovery” wizard may leave some requests in the running state even if they were terminated.
I’m going to open a SR for Oracle to request a proper solution, but meanwhile I’d suggest to use the cmclean.sql. However, make sure to check its outputs carefully and reschedule any requests which got cleaned up (as described above).


P.S. The description of the behavior of ICM in this blog post is a result of investigation performed on R12.1.3. I believe it behaves the same way in R12.1 and probably even in R12.0 and 11i, but I didn’t check. MOS Article ID 134007.1 which contains the cmclean.sql script is valid for Applications versions 10.7 to 12.1.3 – be careful when using it independently from the version of your e-Business Suite installation.







版权声明:本文内容由阿里云实名注册用户自发贡献,版权归原作者所有,阿里云开发者社区不拥有其著作权,亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容,填写侵权投诉表单进行举报,一经查实,本社区将立刻删除涉嫌侵权内容。

相关文章
PostgreSQL 10.1 手册_部分 III. 服务器管理_第 29 章 监控磁盘使用
第 29 章 监控磁盘使用 目录 29.1. 判断磁盘用量 29.2. 磁盘满失败 本章讨论如何监控PostgreSQL数据库系统的磁盘使用情况。 本文转自PostgreSQL中文社区,原文链接:第 29 章 监控磁盘使用
685 0
PostgreSQL 10.1 手册_部分 III. 服务器管理_第 28 章 监控数据库活动_28.3. 查看锁
28.3. 查看锁 监控数据库活动的另外一个有用的工具是pg_locks系统表。这样就允许数据库管理员查看在锁管理器里面未解决的锁的信息。例如,这个功能可以被用于: 查看当前所有未解决的锁、在一个特定数据库中的关系上所有的锁、在一个特定关系上所有的锁,或者由一个特定PostgreSQL会话持有的所有的锁。
949 0
读写分离提高 SQL Server 并发性
原文:读写分离提高 SQL Server 并发性 转自:http://www.canway.net/Lists/CanwayOriginalArticels/DispForm.aspx?ID=476 在一些大型的网站或者应用中,单台的SQL Server 服务器可能难以支撑非常大的访问压力。
723 0
使用IntelliJ IDEA开发SpringMVC网站(五)博客文章管理
原文:使用IntelliJ IDEA开发SpringMVC网站(五)博客文章管理 摘要 通过对博客文章的管理,实现外键操作。 目录[-] 八、博客文章管理 1、查看文章 2、添加博客        3、查看博文详情 4、修改博客内容 5、删除博客文章 九、尾声         1、如何部署 2、进一步的学习 3、ENDING 转载请注明出处:Gaussic(一个致力于AI研究却不得不兼顾项目的研究生) 。
1054 0
PostgreSQL 10.1 手册_部分 III. 服务器管理_第 31 章 逻辑复制_31.4. 限制
31.4. 限制 逻辑复制目前有以下限制或缺少的功能。 这些可能会在未来的版本中解决。 不复制数据库模式和DDL命令。初始模式可以使用pg_dump --schema-only 手动复制。后续的模式更改需要手动保持同步。
1027 0
Sql Server用管理器建表后如何查看创建表的语句
右键建的那个表-->编写表脚本为-->CREATE到-->新建编辑器查看窗口,就会出现新建表的SQL语句.
1536 0
PostgreSQL 10.1 手册_部分 III. 服务器管理_第 31 章 逻辑复制_31.5. 架构
31.5. 架构 31.5.1. 初始快照 逻辑复制首先复制发布者数据库上的数据快照。一旦完成, 发布者的变化就会实时发送给订阅者。订阅者按照发布者提交的顺序应用数据, 以确保任何单个订阅中的发布的事务一致性。
1091 0
PostgreSQL 10.1 手册_部分 III. 服务器管理_第 30 章 可靠性和预写式日志
第 30 章 可靠性和预写式日志 目录 30.1. 可靠性 30.2. 预写式日志(WAL) 30.3. 异步提交 30.4. WAL配置 30.5. WAL内部 本章解释预写式日志如何用于获得有效的、可靠的操作。
861 0
+关注
长烟慢慢
系统架构师
806
文章
0
问答
文章排行榜
最热
最新
相关电子书
更多
文娱运维技术
立即下载
《SaaS模式云原生数据仓库应用场景实践》
立即下载
《看见新力量:二》电子书
立即下载