备库数据文件异常,物理DG如何恢复?

简介: 备库数据文件异常,物理DG如何恢复?

有的时候由于备库空间不足,在主库添加了数据文件后,导致备库数据文件的缺失,可能很久之后才发现,但是由于归档的缺失等其它原因而导致备库不能正常应用Redo日志。还有其它情况可能导致备库的数据文件不能正常ONLINE,在这种情况下,可以在主库上利用CONVERT命令备份一个数据文件然后拷贝到备库即可。若是备库归档文件比较全,则可以直接在备库创建数据文件后应用Redo日志即可,而不需要从主库拷贝数据文件。

恢复过程中的一些关键性的命令如下所示:

CONVERT DATAFILE '+DATA1/oralhrs/datafile/tbs101.262.923139373' FORMAT '/tmp/tbs101.dbf_bk';--主库备份相关文件
CONVERT DATAFILE '/tmp/tbs101.dbf_bk' FORMAT '+DATA1';--备库修改从主库拷贝过来的文件为ASM格式
ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT='MANUAL' SID='*'; --备库修改文件管理模式为手动
ALTER DATABASE CREATE DATAFILE 64 AS '+DATA1';--备库若数据文件丢失可以先创建一个数据文件
ALTER DATABASE RENAME FILE '+DATA1/oralhrsg/datafile/tbs101.483.923151901' TO '+DATA1/oralhrsg/datafile/tbs101.382.923151215'; --重命名刚新建的数据文件为从主库拷贝过来的数据文件
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;--启用Redo恢复

接下来演示整个恢复过程。

首先查看备库的文件情况,发现64号文件处于OFFLINE状态。

SYS@oraLHRDG2> SELECT A.FILE#,A.NAME,A.RECOVER,A.CHECKPOINT_CHANGE#,STATUS  FROM V$DATAFILE_HEADER A WHERE A.FILE# IN (1,2,64);
     FILE# NAME                                                 REC CHECKPOINT_CHANGE# STATUS
---------- ---------------------------------------------------- --- ------------------ -------
         1 +DATA1/oralhrsg/datafile/system.358.869055401                    1.5760E+10 ONLINE
         2 +DATA1/oralhrsg/datafile/sysaux.354.869047985                    1.5760E+10 ONLINE
        64 +DATA1/oralhrsg/datafile/tbs101.382.875442343                    1764555149 OFFLINE
SYS@oraLHRDG2> RECOVER DATAFILE 64;  
ORA-00283: recovery session canceled due to errors
ORA-01153: an incompatible media recovery is active
SYS@oraLHRDG2> RECOVER MANAGED STANDBY DATABASE CANCEL;
Media recovery complete.
SYS@oraLHRDG2>  RECOVER DATAFILE 64;  
ORA-00283: recovery session canceled due to errors
ORA-01610: recovery using the BACKUP CONTROLFILE option must be done
SYS@oraLHRDG2> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;
Database altered.
SYS@oraLHRDG2> ALTER DATABASE DATAFILE 64 ONLINE;
alter database datafile 64 online
*
ERROR at line 1:
ORA-01113: file 64 needs media recovery
ORA-01110: data file 64: '+DATA1/oralhrsg/datafile/tbs101.382.875442343'

虽然可以开启实时应用进程,但是64号文件依然不能ONLINE,因为现在的系统SCN号和64号文件头的SCN号相差很大了,归档日志必然不存在了,所以使用日志来恢复文件的方法自然不可行了。那么,接下来在主库用CONVERT命令备份64号文件:

[ZFLHRSDB1:oracle]:/oracle>rman target /
Recovery Manager: Release 11.2.0.3.0 - Production on Wed Sep 21 14:49:56 2016
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
connected to target database: ORAIPPS (DBID=1344172889)
RMAN> CONVERT DATAFILE '+DATA1/oralhrs/datafile/tbs101.262.923139373' FORMAT '/tmp/tbs101.dbf_bk';
Starting conversion at target at 2016-09-21 14:51:16
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile conversion
input file name=+DATA1/oralhrs/datafile/tbs101.262.923139373
converted datafile=/tmp/tbs101.dbf_bk
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:03
Finished conversion at target at 2016-09-21 14:51:19

将备份的文件拷贝到备库:

[ZFLHRSDB1:oracle]:/tmp>scp /tmp/tbs101.dbf_bk oracle@192.68.155.16:/tmp/tbs101.dbf_bk
The authenticity of host '192.68.155.16 (192.68.155.16)' can't be established.
RSA key fingerprint is 7b:d6:ba:ca:b3:71:b5:0b:bf:14:f4:e4:18:5f:51:45.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.68.155.16' (RSA) to the list of known hosts.
tbs101.dbf_bk                                                    100%  100MB  50.0MB/s   00:02    

在备库上转换文件为ASM格式:

RMAN> CONVERT DATAFILE '/tmp/tbs101.dbf_bk' FORMAT '+DATA1';
Starting conversion at target at 2016-09-21 14:53:33
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1542 instance=oraLHRDG2 device type=DISK
channel ORA_DISK_1: starting datafile conversion
input file name=/tmp/tbs101.dbf_bk
converted datafile=+DATA1/oralhrsg/datafile/tbs101.382.923151215
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:01
Finished conversion at target at 2016-09-21 14:53:36

备库上进行重命名操作,若是备库上64号文件被删除了,则需要先重建64号文件:

SYS@oraLHRDG2> ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT='MANUAL' SID='*';
System altered.
SYS@oraLHRDG2> ALTER DATABASE CREATE DATAFILE 64 AS '+DATA1';
Database altered.
SYS@oraLHRDG2> SELECT A.FILE#,A.NAME,A.RECOVER,A.CHECKPOINT_CHANGE#,STATUS  FROM V$DATAFILE_HEADER A WHERE A.FILE# IN (1,2,64);
     FILE# NAME                                             REC CHECKPOINT_CHANGE# STATUS
---------- ------------------------------------------------ --- ------------------ -------
         1 +DATA1/oralhrsg/datafile/system.358.869055401                1.5761E+10 ONLINE
         2 +DATA1/oralhrsg/datafile/sysaux.354.869047985                1.5761E+10 ONLINE
        64 +DATA1/oralhrsg/datafile/tbs101.483.923151901                1.5761E+10 OFFLINE
SYS@oraLHRDG2> ALTER DATABASE DATAFILE 64 ONLINE;
ALTER DATABASE DATAFILE 64 ONLINE
*
ERROR at line 1:
ORA-01113: file 64 needs media recovery
ORA-01110: data file 64: '+DATA1/oralhrsg/datafile/tbs101.483.923151901'

可以看到已经有64号文件了,下边进行重命名,修改为从主库拷贝过来的64号文件:

SYS@oraLHRDG2> ALTER DATABASE RENAME FILE '+DATA1/oralhrsg/datafile/tbs101.483.923151901' TO '+DATA1/oralhrsg/datafile/tbs101.382.923151215';
ALTER DATABASE RENAME FILE '+DATA1/oralhrsg/datafile/tbs101.483.923151901' TO '+DATA1/oralhrsg/datafile/tbs101.382.923151215'
*
ERROR at line 1:
ORA-01511: error in renaming log/data files
ORA-01121: cannot rename database file 64 - file is in use or recovery
ORA-01110: data file 64: '+DATA1/oralhrsg/datafile/tbs101.483.923151901'
SYS@oraLHRDG2> ! oerr ora 01121
01121, 00000, "cannot rename database file %s - file is in use or recovery"
// *Cause:  Attempted to use ALTER DATABASE RENAME to rename a
//          datafile that is online in an open instance or is being recovered.
// *Action: Close database in all instances and end all recovery sessions.

该错误提示文件在使用,不能被重命名。由于该库是RAC库,需要先关闭DG,启动到MOUNT状态后再重命名:

[ZFLHRSDB4:root]:/>srvctl stop db -d oralhrsg
[ZFLHRSDB4:root]:/>srvctl start db -d oralhrsg -o mount

在数据库中重命名:

SYS@oraLHRDG2> conn / as sysdba
Connected.
SYS@oraLHRDG2> ALTER DATABASE RENAME FILE '+DATA1/oralhrsg/datafile/tbs101.483.923151901' TO '+DATA1/oralhrsg/datafile/tbs101.382.923151215';
Database altered.
SYS@oraLHRDG2> ALTER DATABASE DATAFILE 64 ONLINE;
Database altered.<<<<<<<<<---------数据文件可以ONLINE了
SYS@oraLHRDG2> COL NAME FOR A50
SYS@oraLHRDG2> COL CHECKPOINT_CHANGE# FOR 9999999999999
SYS@oraLHRDG2> SELECT A.FILE#,A.NAME,A.RECOVER,A.CHECKPOINT_CHANGE#,STATUS  FROM V$DATAFILE_HEADER A WHERE A.FILE# IN (1,2,64);
     FILE# NAME                                               REC CHECKPOINT_CHANGE# STATUS
---------- -------------------------------------------------- --- ------------------ -------
         1 +DATA1/oralhrsg/datafile/system.358.869055401                 15760776695 ONLINE
         2 +DATA1/oralhrsg/datafile/sysaux.354.869047985                 15760776695 ONLINE
        64 +DATA1/oralhrsg/datafile/tbs101.382.923151215                 15760492416 ONLINE
SYS@oraLHRDG2> ALTER DATABASE OPEN READ ONLY;
ALTER DATABASE OPEN READ ONLY
*
ERROR at line 1:
ORA-10458: standby database requires recovery
ORA-01194: file 64 needs more recovery to be consistent
ORA-01110: data file 64: '+DATA1/oralhrsg/datafile/tbs101.382.923151215'
 <<<<<<<<<------------  打开数据库依然报错,尝试手动恢复一下,看看需要哪些日志,因为64号文件已经是最新的了
SYS@oraLHRDG2> RECOVER DATABASE;
ORA-00283: recovery session canceled due to errors
ORA-01610: recovery using the BACKUP CONTROLFILE option must be done
SYS@oraLHRDG2> RECOVER STANDBY DATABASE USING BACKUP CONTROLFILE;
ORA-00279: change 15760492416 generated at 09/21/2016 11:38:54 needed for thread 1
ORA-00289: suggestion : /arch/1_12918_868895513.arc
ORA-00280: change 15760492416 for thread 1 is in sequence #12918
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 64 needs more recovery to be consistent
ORA-01110: data file 64: '+DATA1/oralhrsg/datafile/tbs101.382.923151215'
ORA-01112: media recovery not started
 <<<<<<<<<------------ 缺少12918日志,很欣慰,因为12918已经是最新的日志了,这里解决起来就很简单了,可以从主库拷贝12918日志到备库,但是这样太麻烦,可以开启备库的应用进程让其自动解决备库的GAP问题
SYS@oraLHRDG2> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;
Database altered.

此时查看告警日志,很欣慰看到了12918日志过来了:

Wed Sep 21 15:24:33 2016
alter database recover managed standby database using current logfile disconnect from session
Attempt to start background Managed Standby Recovery process (oraLHRDG2)
Wed Sep 21 15:24:33 2016
MRP0 started with pid=44, OS id=12649040 
MRP0: Background Managed Standby Recovery process started (oraLHRDG2)
 started logmerger process
Wed Sep 21 15:24:39 2016
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 16 slaves
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Wed Sep 21 15:24:40 2016
Media Recovery Log /arch/1_12918_868895513.arc
Media Recovery Log /arch/2_12918_868895513.arc
Completed: alter database recover managed standby database using current logfile disconnect from session
Datafile 64 added to flashback set
Media Recovery Log /arch/2_12919_868895513.arc
Media Recovery Log /arch/1_12919_868895513.arc
Media Recovery Log /arch/2_12920_868895513.arc
Media Recovery Log /arch/1_12920_868895513.arc
Media Recovery Log /arch/2_12921_868895513.arc
Media Recovery Log /arch/1_12921_868895513.arc
Media Recovery Waiting for thread 2 sequence 12922 (in transit)
Recovery of Online Redo Log: Thread 2 Group 12 Seq 12922 Reading mem 0
  Mem# 0: +DATA1/oralhrsg/onlinelog/group_12.353.869055809
Media Recovery Waiting for thread 1 sequence 12922 (in transit)
Recovery of Online Redo Log: Thread 1 Group 8 Seq 12922 Reading mem 0
  Mem# 0: +DATA1/oralhrsg/onlinelog/group_8.344.869055791

最后重启备库的2个节点:

[ZFLHRSDB4:root]:/>srvctl stop db -d oralhrsg
[ZFLHRSDB4:root]:/>srvctl start db -d oralhrsg
[ZFLHRSDB4:root]:/>srvctl status db -d oralhrsg
Instance oraLHRDG1 is running on node zflhrsdb3
Instance oraLHRDG2 is running on node zflhrsdb4

而数据库中64号文件已经正常了:

SYS@oraLHRDG2>  SELECT A.FILE#,A.NAME,A.RECOVER,A.CHECKPOINT_CHANGE#,STATUS  FROM V$DATAFILE_HEADER A WHERE A.FILE# IN (1,2,64);
     FILE# NAME                                               REC CHECKPOINT_CHANGE# STATUS
---------- -------------------------------------------------- --- ------------------ -------
         1 +DATA1/oralhrsg/datafile/system.358.869055401                 15760815694 ONLINE
         2 +DATA1/oralhrsg/datafile/sysaux.354.869047985                 15760815694 ONLINE
        64 +DATA1/oralhrsg/datafile/tbs101.382.923151215                 15760815694 ONLINE
SYS@oraLHRDG2> show parameter standby
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
standby_archive_dest                 string      ?/dbs/arch
standby_file_management              string      MANUAL
SYS@oraLHRDG2> ALTER SYSTEM SET standby_file_management='AUTO' SID='*';====>>>>> 别忘记将该参数修改回来
System altered.

最后不要忘记将STANDBY_FILE_MANAGEMENT参数修改为AUTO。

相关实践学习
日志服务之使用Nginx模式采集日志
本文介绍如何通过日志服务控制台创建Nginx模式的Logtail配置快速采集Nginx日志并进行多维度分析。
目录
相关文章
|
SQL 存储 数据库
在DG环境中,主库丢失归档,对主库进行基于SCN的增量备份来恢复物理DG环境
在DG环境中,主库丢失归档,对主库进行基于SCN的增量备份来恢复物理DG环境
383 0
|
SQL 监控 Oracle
Oracle实例恢复和介质恢复
Oracle恢复基础概述  一、恢复解决方案 错误类型及解决方案 错误分类 恢复解决方案 介质失败 如果是少量的块损坏,使用块介质恢复;如果是大量的块、数据文件、表空间的损坏,可能需要对损坏的数据文件或者表空间执行完全恢复;如果是归档Redo日志文件或者联机Redo日志文件的丢失,那么只需要不完全恢复方式。
1606 0
|
SQL 监控 关系型数据库
【故障处理】DG环境主库丢失归档情况下数据文件的恢复
【故障处理】DG环境主库丢失归档情况下数据文件的恢复 1  BLOG文档结构图     2  前言部分   2.
1328 0