前段时间有一套RAC系统偶然会出现负载异常升高的情况,后来查出来跟节点内部通信有关系,之后升级RAC系统的网卡驱动,其中一个节点重启,另一个节点通过rmmod,modprobe加载新的网卡驱动,导致混合连接,gc buffer busy频繁 :
SQL> select INST_ID,username,count(*) from gv$session group by inst_id,username order by INST_ID,username;
INST_ID USERNAME COUNT(*)
---------- ------------------------------ ----------
1 SKYCAC 206
1 SYS 7
1 28
2 EXP_MAN 1
2 SKYCAC 206
2 SKYCACB 30
2 SYS 6
2 28
with stats as (
select owner,object_name,statistic_name,value
from v$segment_statistics
where statistic_name = 'gc buffer busy'
order by value desc
)
select * from stats where rownum<=10
/
OWNER OBJECT_NAME STATISTIC_NAME VALUE
------------ -------------------------------- -------------------------------- ----------
SKYCAC IDX_CAC_PLAYER_INFO gc buffer busy 283355
SKYCAC IDX_CAC_PLAYER_INFO_NICK_NAME gc buffer busy 19832
SKYCAC CAC_PLAYER_PRIZE gc buffer busy 17370
SKYCAC CAC_MONITOR_PLAYER gc buffer busy 10278
SKYCAC CAC_GAME_ACCESS_RECORD gc buffer busy 8344
SKYCAC PK_CAC_PLAYER_PRIZE gc buffer busy 8278
SKYCAC CAC_ROOM gc buffer busy 4016
SKYCAC IDX_CAC_GAME_ACS_REC_TIME gc buffer busy 3300
SKYCAC CAC_PLAYER_INFO gc buffer busy 1198
SKYCAC IDX_CAC_PLAYER_INFO_SKY_ID gc buffer busy 592 592
处理掉2 SKYCAC 206这部分连接的话,将恢复正常.
类似案例 :
The new application, recently installed to run against a RAC database (3
nodes, 64 bit linux, Oracle 10.2.0.5) is making heavy use of advanced
queueing. My problem is that the queue tables are incessant source of
contention, suffering from all kinds of buffer busy waits, both local and
global. If I check V$SEGMENT_STATISTIC with the following query,
with stats as (
select owner,object_name,statistic_name,value
from v$segment_statistics
where statistic_name = 'gc buffer busy'
order by value desc
)
select * from stats where rownum<=10
/
The result looks like this:
OWNER OBJECT_NAME STATISTIC_NAME
VALUE
--------------- ------------------------------ --------------------
----------
SYS I_JOB_JOB gc buffer busy
30184683
SYS JOB$ gc buffer busy
10128719
ADBASE PK_PENDING_ALERTS gc buffer busy
7899852
SYS I_JOB_NEXT gc buffer busy
5302448
ADBASE PENDING_ALERTS gc buffer busy
5288135
LOCATIONSERVICE AQ$_MMSRES_MMSAGENT_TABLE_I gc buffer busy
1082715
LOCATIONSERVICE MMSRES_MMSAGENT_TABLE gc buffer busy
1055558
LOCATIONSERVICE SPEECH2TEXT_Q_TABLE gc buffer busy
622833
LOCATIONSERVICE TASKS gc buffer busy
358430
LOCATIONSERVICE DQV2MIN_STARTDATE_IDX gc buffer busy
256124
Now, everything that is not owned by SYS and is not index is a queue
table. The problem is systemic in nature, queue tables are by their very
nature the point of contention. What can be done to alleviate the
contention, short of restricting the queue to a single node only?
Every queue has retention time set to 0. Developers argue that setting
retry_delay to something >0 would be extremely detrimental to performance.