DRM hang causes frequent RAC Instances Reconfiguration (Doc ID 1528362.1)

2013-12-09 1735

Modified:27-Mar-2013

Type:PROBLEM

APPLIES TO:

Oracle Database - Enterprise Edition - Version 11.1.0.7 to 11.2.0.3 [Release 11.1 to 11.2]
Information in this document applies to any platform.

SYMPTOMS

- RAC instances freeze during DRM for 100 seconds or more.

- DB Alert log shows that all RAC instances undergo reconfiguration at the same time, but there are no instance crashes

Node 1 DB Alert Log

Sat Jul 14 14:17:04 2012
Reconfiguration started (old inc 70, new inc 72)
List of instances:
1 2 (myinst: 1)
Global Resource Directory frozen
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Sat Jul 14 14:17:04 2012
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Sat Jul 14 14:17:04 2012
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Sat Jul 14 14:17:13 2012
minact-scn: Master returning as live inst:2 has inc# mismatch instinc:70 cur:72 errcnt:0

Node 2 DB Alert Log

Sat Jul 14 14:17:04 2012
Reconfiguration started (old inc 70, new inc 72)
List of instances:
1 2 (myinst: 2)
Global Resource Directory frozen
Communication channels reestablished
Sat Jul 14 14:17:04 2012
* domain 0 valid = 1 according to instance 1
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Sat Jul 14 14:17:04 2012
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Sat Jul 14 14:17:04 2012
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Sat Jul 14 14:18:03 2012
Submitted all GCS remote-cache requests
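
The symptom above (every instance starts the same reconfiguration at the same moment) can be checked mechanically. The following is a minimal sketch, assuming the alert logs are available as plain text in the format shown above; the function names `reconfig_events` and `shared_events` are hypothetical helpers, not part of any Oracle tool.

```python
import re

# Alert-log timestamp line, e.g. "Sat Jul 14 14:17:04 2012".
STAMP = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}$")
# Start-of-reconfiguration line with the old and new incarnation numbers.
RECONFIG = re.compile(r"Reconfiguration started \(old inc (\d+), new inc (\d+)\)")


def reconfig_events(alert_text):
    """Return (timestamp, old_inc, new_inc) for each reconfiguration start."""
    events = []
    last_stamp = ""  # most recent timestamp line seen so far
    for raw in alert_text.splitlines():
        line = raw.strip()
        if STAMP.match(line):
            last_stamp = line
            continue
        m = RECONFIG.search(line)
        if m:
            events.append((last_stamp, int(m.group(1)), int(m.group(2))))
    return events


def shared_events(*logs):
    """Return the reconfiguration events that appear in every given log."""
    common = set(reconfig_events(logs[0]))
    for log in logs[1:]:
        common &= set(reconfig_events(log))
    return sorted(common)


# Excerpts copied from the two alert logs above.
NODE1 = """\
Sat Jul 14 14:17:04 2012
Reconfiguration started (old inc 70, new inc 72)
List of instances:
1 2 (myinst: 1)
"""
NODE2 = """\
Sat Jul 14 14:17:04 2012
Reconfiguration started (old inc 70, new inc 72)
List of instances:
1 2 (myinst: 2)
"""

print(shared_events(NODE1, NODE2))
```

An event that shows up in every node's log with the same timestamp and incarnation numbers, and with no instance restart around it, matches the pattern described in this note. It is only a hint and should be confirmed against the LMON trace.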

- The LMON trace shows that the DRM quiesce step hangs:

*** 2012-07-14 14:14:51.187
CGS recovery timeout = 85 sec
Begin DRM(231) (swin 1)
* drm quiesce

*** 2012-07-14 14:17:03.752
* Request pseudo reconfig due to drm quiesce hang
2012-07-14 14:17:03.752735 : kjfspseudorcfg: requested with reason 5(DRM Quiesce step stall)

*** 2012-07-14 14:17:03.766
kjxgmrcfg: Reconfiguration started, type 6
CGS/IMR TIMEOUTS:
CSS recovery timeout = 31 sec (Total CSS waittime = 65)
IMR Reconfig timeout = 75 sec
CGS rcfg timeout = 85 sec
kjxgmcs: Setting state to 70 0.
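
To see how long a DRM sat in the quiesce step before LMON requested the pseudo reconfiguration, the trace timestamps can be subtracted. The sketch below assumes the trace layout shown above and measures from the `***` timestamp preceding `Begin DRM` to the one preceding the pseudo reconfig request, so the result is approximate; `quiesce_stalls` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Trace timestamp header, e.g. "*** 2012-07-14 14:14:51.187".
STAMP = re.compile(r"^\*\*\* (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")
BEGIN_DRM = re.compile(r"^Begin DRM\((\d+)\)")
STALL_MARKER = "Request pseudo reconfig due to drm quiesce hang"


def quiesce_stalls(trace_text):
    """Return (drm_id, seconds) for each DRM that ended in a pseudo reconfig request."""
    stalls = []
    current_ts = None   # timestamp of the most recent "***" header
    drm_id = None       # id of the DRM currently in progress
    drm_start = None    # timestamp in effect when that DRM began
    for raw in trace_text.splitlines():
        line = raw.strip()
        m = STAMP.match(line)
        if m:
            current_ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
            continue
        m = BEGIN_DRM.match(line)
        if m:
            drm_id, drm_start = int(m.group(1)), current_ts
            continue
        if STALL_MARKER in line and drm_start is not None and current_ts is not None:
            stalls.append((drm_id, (current_ts - drm_start).total_seconds()))
            drm_id, drm_start = None, None
    return stalls


# Excerpt copied from the LMON trace above.
TRACE = """\
*** 2012-07-14 14:14:51.187
CGS recovery timeout = 85 sec
Begin DRM(231) (swin 1)
* drm quiesce

*** 2012-07-14 14:17:03.752
* Request pseudo reconfig due to drm quiesce hang
"""

for drm, seconds in quiesce_stalls(TRACE):
    print(f"DRM {drm}: about {seconds:.1f} s before pseudo reconfig was requested")
```

For the excerpt above this gives roughly 132.6 seconds for DRM 231, which is consistent with the freeze of 100 seconds or more listed under the symptoms.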

- The AWR top wait events are "gcs resource directory to be unfrozen" and "gc remaster".

CHANGES

Large Buffer Cache

CAUSE

This is caused by the following bug:
Bug 12879027 - LMON trace file shows that pseudo reconfigurations triggered by DRM are hanging. DRM quiesce is timing out.

DRM consists of several steps. During the DRM quiesce step, all ongoing block transfers for the resources being remastered are completed.
In this case, a hang occurred during the DRM quiesce step because an internal function hit a timeout.
This bug condition occurs when the buffer cache is very large.

This hang then triggers a pseudo reconfiguration to prevent the instance from being killed by another instance.
This is why the instances undergo a reconfiguration without restarting.

SOLUTION

Apply the fix for bug 12879027.

This issue is fixed in the following DB PSU patches:

Patch 13923374 - 11.2.0.3.3 DB Patch Set Update (PSU)
Patch 13923804 - 11.2.0.2.7 DB Patch Set Update (PSU)

The fix is also bundled in the corresponding GI PSUs.
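
Whether a given installation already carries the fix can be estimated from its PSU level. The sketch below is only an illustration: it assumes the version is written as a five-part string such as `11.2.0.3.3`, that PSUs are cumulative within a patch set, and it covers only the two PSUs listed above. `has_fix` is a hypothetical helper, and one-off patches are not considered.

```python
# First PSU level per patch set that contains the fix, taken from the list above.
FIXED_IN = {
    (11, 2, 0, 3): 3,  # Patch 13923374 - 11.2.0.3.3 DB PSU
    (11, 2, 0, 2): 7,  # Patch 13923804 - 11.2.0.2.7 DB PSU
}


def has_fix(version):
    """Return True or False for a listed patch set, None if the patch set is not listed."""
    parts = tuple(int(p) for p in version.split("."))
    if len(parts) != 5:
        raise ValueError("expected a five-part version such as 11.2.0.3.3")
    patch_set, psu_level = parts[:4], parts[4]
    first_fixed = FIXED_IN.get(patch_set)
    if first_fixed is None:
        return None  # this note lists no PSU for that patch set
    return psu_level >= first_fixed


for v in ("11.2.0.3.2", "11.2.0.3.3", "11.2.0.2.8", "11.1.0.7.0"):
    print(v, has_fix(v))
```

A result of `None` means this note gives no PSU for that patch set, so the patch availability for Bug 12879027 has to be checked separately.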

REFERENCES

NOTE:756671.1 - Oracle Recommended Patches -- Oracle Database
NOTE:390483.1 - DRM - Dynamic Resource management
BUG:12879027 - LMON PROCESS CAN GET STUCK IN DRM QUIESCE STEP TRIGGERING PSEUDO RECONFIGURATION

Products

Oracle Database Products > Oracle Database > Oracle Database > Oracle Database - Enterprise Edition > Real Application Cluster > RAC Database/Instance Performance Issues

Keywords

BUFFER CACHE; BUG; HANGING; LMON; RAC; RECONFIGURATION
