2016年4月13日接到一呼叫平台负责人告警,oracle 9.2.0.8数据库的rman备份出现异常,控制文件无法备份,报错信息如下:
RMAN> run{
2> allocate channel c1 type disk;
3> backup current controlfile format '/tmp/ora_contr.bak';
4> release channel c1;
5> }
using target database controlfile instead of recovery catalog
allocated channel: c1
channel c1: sid=411 devtype=DISK
Starting backup at 13-APR-16
channel c1: starting full datafile backupset
channel c1: specifying datafile(s) in backupset
waiting for snapshot controlfile enqueue
waiting for snapshot controlfile enqueue
waiting for snapshot controlfile enqueue
waiting for snapshot controlfile enqueue
waiting for snapshot controlfile enqueue
cannot make a snapshot controlfile
released channel: c1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on c1 channel at 04/13/2016 10:37:24
ORA-00230: operation disallowed: snapshot controlfile enqueue unavailable
查阅网上介绍rman备份遇到ORA00230有2个原因,一是9.2.0.8的数据库的rman配置了磁带库备份,在备份时磁带库故障响应超时导致,二是数据库在之前的rman备份过程中被异常中断,残留有上次rman的备份进程。
经过核实,当前数据库确实是9.2.0.8,操作系统是AIX6.1,为了排除磁带库故障,特地使用磁盘备份进行测试,测试结果如上rman备份失败报错信息,因此排除磁带库故障的原因;接下来是RMAN备份异常中断导致控制文件无法备份的处理过程:
--登录数据库查询残留的RMAN备份进程
SQL*Plus: Release 9.2.0.8.0 - Production on Wed Apr 13 10:17:00 2016
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
SQL> conn / as sysdba
Connected.
SQL> SELECT s.SID, USERNAME AS "User", PROGRAM, MODULE,
ACTION, LOGON_TIME "Logon"
FROM V$SESSION s, V$ENQUEUE_LOCK l
WHERE l.SID = s.SID
AND l.TYPE = 'CF'
AND l.ID1 = 0
AND l.ID2 = 2; --网上给的方法,没有查出结果,看来网上的方法不总是那么可靠,╮(╯▽╰)╭
no rows selected
--修改查询方法,查出了RMAN残留备份进程
SQL> select s.sid, username, program, module, action, logon_time, l.*
from v$session s, v$enqueue_lock l
where l.sid = s.sid
and l.type = 'CF';
SID USERNAME PROGRAM MODULE ACTION LOGON_TIM ADDR KADDR TY ID1 ID2 LMODE REQUEST CTIME BLOCK
---------- --------------- ------------------------------------- ------------------------------------------------ --------------------------- ------------- ---------------------------- ---- -- ----------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
377 SYS rman@cncora2 (TNS V1-V3) backup full datafile: ORA_SBT_TAPE_1 0000014 STARTED 13-APR-16 07000000C50A59B8 07000000C50A59D8 CF 0 2
4 0 1217 2
--查出sid=377的操作系统进程号
SQL> select 'kill -9 '||spid from v$process where addr = (select paddr from v$session where sid=&sid);
Enter value for sid: 377
old 1: select 'kill -9 '||spid from v$process where addr = (select paddr from v$session where sid=&sid)
new 1: select 'kill -9 '||spid from v$process where addr = (select paddr from v$session where sid=377)
'KILL-9'||SPID
--------------------------------------------
kill -9 2322660
RMAN> run{
2> allocate channel c1 type disk;
3> backup current controlfile format '/tmp/ora_contr.bak';
4> release channel c1;
5> }
using target database controlfile instead of recovery catalog
allocated channel: c1
channel c1: sid=411 devtype=DISK
Starting backup at 13-APR-16
channel c1: starting full datafile backupset
channel c1: specifying datafile(s) in backupset
waiting for snapshot controlfile enqueue
waiting for snapshot controlfile enqueue
waiting for snapshot controlfile enqueue
waiting for snapshot controlfile enqueue
waiting for snapshot controlfile enqueue
cannot make a snapshot controlfile
released channel: c1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on c1 channel at 04/13/2016 10:37:24
ORA-00230: operation disallowed: snapshot controlfile enqueue unavailable
查阅网上介绍rman备份遇到ORA00230有2个原因,一是9.2.0.8的数据库的rman配置了磁带库备份,在备份时磁带库故障响应超时导致,二是数据库在之前的rman备份过程中被异常中断,残留有上次rman的备份进程。
经过核实,当前数据库确实是9.2.0.8,操作系统是AIX6.1,为了排除磁带库故障,特地使用磁盘备份进行测试,测试结果如上rman备份失败报错信息,因此排除磁带库故障的原因;接下来是RMAN备份异常中断导致控制文件无法备份的处理过程:
--登录数据库查询残留的RMAN备份进程
SQL*Plus: Release 9.2.0.8.0 - Production on Wed Apr 13 10:17:00 2016
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
SQL> conn / as sysdba
Connected.
SQL> SELECT s.SID, USERNAME AS "User", PROGRAM, MODULE,
ACTION, LOGON_TIME "Logon"
FROM V$SESSION s, V$ENQUEUE_LOCK l
WHERE l.SID = s.SID
AND l.TYPE = 'CF'
AND l.ID1 = 0
AND l.ID2 = 2; --网上给的方法,没有查出结果,看来网上的方法不总是那么可靠,╮(╯▽╰)╭
no rows selected
--修改查询方法,查出了RMAN残留备份进程
SQL> select s.sid, username, program, module, action, logon_time, l.*
from v$session s, v$enqueue_lock l
where l.sid = s.sid
and l.type = 'CF';
SID USERNAME PROGRAM MODULE ACTION LOGON_TIM ADDR KADDR TY ID1 ID2 LMODE REQUEST CTIME BLOCK
---------- --------------- ------------------------------------- ------------------------------------------------ --------------------------- ------------- ---------------------------- ---- -- ----------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
377 SYS rman@cncora2 (TNS V1-V3) backup full datafile: ORA_SBT_TAPE_1 0000014 STARTED 13-APR-16 07000000C50A59B8 07000000C50A59D8 CF 0 2
4 0 1217 2
--查出sid=377的操作系统进程号
SQL> select 'kill -9 '||spid from v$process where addr = (select paddr from v$session where sid=&sid);
Enter value for sid: 377
old 1: select 'kill -9 '||spid from v$process where addr = (select paddr from v$session where sid=&sid)
new 1: select 'kill -9 '||spid from v$process where addr = (select paddr from v$session where sid=377)
'KILL-9'||SPID
--------------------------------------------
kill -9 2322660
--查看2322660进程是否是数据库核心进程(不会是)
[cncora2]$ps -ef|grep 2322660
oracle 2322660 1 0 10:26:02 - 0:01 oracleora922 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle 2404512 2359338 0 10:53:40 pts/4 0:00 grep 2322660
--杀掉2322660
[cncora2]$kill -9 2322660
--核实2322660是否被杀掉
[cncora2]$ps -ef|grep 2322660
oracle 2322674 2359338 0 10:54:45 pts/4 0:00 grep 2322660
[cncora2]$sqlplus /nolog
SQL*Plus: Release 9.2.0.8.0 - Production on Wed Apr 13 10:54:53 2016
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
SQL> conn / as sysdba
Connected.
SQL> select s.sid,username,program,module,action,logon_time,l.*
2 from v$session s,v$enqueue_lock l
3 where l.sid=s.sid
4 and l.type='CF';
no rows selected
SQL> quit
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
With the Partitioning and Real Application Clusters options
--测试控制文件是否可以备份
[cncora2]$rman target /
Recovery Manager: Release 9.2.0.8.0 - 64bit Production
Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.
connected to target database: ORA92 (DBID=1953009355)
RMAN> run{
2> allocate channel c1 type disk;
3> backup current controlfile format '/tmp/ora_ctl.bak';
4> release channel c1;
5> }
using target database controlfile instead of recovery catalog
allocated channel: c1
channel c1: sid=404 devtype=DISK
Starting backup at 13-APR-16
channel c1: starting full datafile backupset
channel c1: specifying datafile(s) in backupset
including current controlfile in backupset
channel c1: starting piece 1 at 13-APR-16
channel c1: finished piece 1 at 13-APR-16
piece handle=/tmp/ora_ctl.bak comment=NONE
channel c1: backup set complete, elapsed time: 00:00:04
Finished backup at 13-APR-16
Starting Control File and SPFILE Autobackup at 13-APR-16
piece handle=/home/oracle/app/oracle/product/9.2.0.8.0/dbs/c-1953009355-20160413-00 comment=NONE
Finished Control File and SPFILE Autobackup at 13-APR-16
released channel: c1
RMAN> quit
Recovery Manager complete.
[cncora2]$cd /tmp
[cncora2]$ls -l ora*
-rw-r----- 1 oracle dba 9224192 Apr 13 11:00 ora_ctl.bak
-rwxr-xr-x 1 root system 677 Dec 07 2007 orainstRoot.sh
[cncora2]$ps -ef|grep 2322660
oracle 2322660 1 0 10:26:02 - 0:01 oracleora922 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle 2404512 2359338 0 10:53:40 pts/4 0:00 grep 2322660
--杀掉2322660
[cncora2]$kill -9 2322660
--核实2322660是否被杀掉
[cncora2]$ps -ef|grep 2322660
oracle 2322674 2359338 0 10:54:45 pts/4 0:00 grep 2322660
[cncora2]$sqlplus /nolog
SQL*Plus: Release 9.2.0.8.0 - Production on Wed Apr 13 10:54:53 2016
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
SQL> conn / as sysdba
Connected.
SQL> select s.sid,username,program,module,action,logon_time,l.*
2 from v$session s,v$enqueue_lock l
3 where l.sid=s.sid
4 and l.type='CF';
no rows selected
SQL> quit
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
With the Partitioning and Real Application Clusters options
--测试控制文件是否可以备份
[cncora2]$rman target /
Recovery Manager: Release 9.2.0.8.0 - 64bit Production
Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.
connected to target database: ORA92 (DBID=1953009355)
RMAN> run{
2> allocate channel c1 type disk;
3> backup current controlfile format '/tmp/ora_ctl.bak';
4> release channel c1;
5> }
using target database controlfile instead of recovery catalog
allocated channel: c1
channel c1: sid=404 devtype=DISK
Starting backup at 13-APR-16
channel c1: starting full datafile backupset
channel c1: specifying datafile(s) in backupset
including current controlfile in backupset
channel c1: starting piece 1 at 13-APR-16
channel c1: finished piece 1 at 13-APR-16
piece handle=/tmp/ora_ctl.bak comment=NONE
channel c1: backup set complete, elapsed time: 00:00:04
Finished backup at 13-APR-16
Starting Control File and SPFILE Autobackup at 13-APR-16
piece handle=/home/oracle/app/oracle/product/9.2.0.8.0/dbs/c-1953009355-20160413-00 comment=NONE
Finished Control File and SPFILE Autobackup at 13-APR-16
released channel: c1
RMAN> quit
Recovery Manager complete.
[cncora2]$cd /tmp
[cncora2]$ls -l ora*
-rw-r----- 1 oracle dba 9224192 Apr 13 11:00 ora_ctl.bak
-rwxr-xr-x 1 root system 677 Dec 07 2007 orainstRoot.sh
到此,故障处理完成。