操作系统: RHEL5.8 x64
数据库 : Oracle 10.2.0.5.0
故障情况:
一台单机曙光PC服务器4块300G SAS盘,RAID5坏两块磁盘(服务器面板无故障提示,无人发现),造成RAID5磁盘阵列挂掉,操作系统当机,系统无法启动。经过数据恢复公司将磁盘数据恢复后,重新恢复数据文件,启动数据库时发现如下错误提示:
1. 数据库警告日志
tail -f alert_orcl.log
Database mounted in Exclusive Mode
Completed: ALTER DATABASE MOUNT
Sat Sep 16 02:32:44 CST 2017
ALTER DATABASE OPEN
Sat Sep 16 02:32:44 CST 2017
Beginning crash recovery of 1 threads
parallel recovery started with 2 processes
Sat Sep 16 02:32:44 CST 2017
Started redo scan
Sat Sep 16 02:32:44 CST 2017
Errors in file /u01/app/oracle/admin/orcl/udump/orcl_ora_27990.trc:
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: '/u01/app/oracle/oradata/orcl/redo03.log'
ORA-27046: file size is not a multiple of logical block size
Additional information: 1
Sat Sep 16 02:32:44 CST 2017
Aborting crash recovery due to error 313
Sat Sep 16 02:32:44 CST 2017
Errors in file /u01/app/oracle/admin/orcl/udump/orcl_ora_27990.trc:
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: '/u01/app/oracle/oradata/orcl/redo03.log'
ORA-27046: file size is not a multiple of logical block size
Additional information: 1
ORA-313 signalled during: ALTER DATABASE OPEN...
2. 启动时提示
[oracle@tcdb ~]$ sqlplus / as sysdba;
SQL*Plus: Release 10.2.0.5.0 - Production on 6 02:28:33 2017
Copyright (c) 1982, 2010, Oracle. All Rights Reserved.
Connected to an idle instance.
SQL> startup;
ORACLE instance started.
Total System Global Area 1610612736 bytes
Fixed Size 2096632 bytes
Variable Size 553648648 bytes
Database Buffers 1040187392 bytes
Redo Buffers 14680064 bytes
Database mounted.
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: '/u01/app/oracle/oradata/orcl/redo03.log'
ORA-27046: file size is not a multiple of logical block size
Additional information: 1
SQL>
根据情况是数据库非规档,redo文件损坏,故障时早上7点时间。
SQL> select open_mode from v$database;
OPEN_MODE
----------
MOUNTED
SQL> set pagesize 999;
SQL> set linesize 200;
SQL> alter session set nls_date_format='yyyy-mm-dd hh24:mi:ss';
Session altered.
SQL> select * from v$log;
GROUP# THREAD# SEQUENCE# BYTES MEMBERS ARC STATUS FIRST_CHANGE# FIRST_TIME
---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- -------------------
1 1 3406 52428800 1 NO INACTIVE 187001363 2017-08-20 07:02:30
3 1 3408 52428800 1 NO CURRENT 187084379 2017-08-21 07:03:24
2 1 3407 52428800 1 NO INACTIVE 187044837 2017-08-20 19:40:06
当前日志损坏
SQL>
SQL> create pfile from spfile;
File created.
在pfile中加入隐含参数:
*._allow_resetlogs_corruption=true
*._allow_error_simulation=true
SQL> startup mount pfile='/u01/app/oracle/product/10.2.0/db_1/dbs/initorcl.ora'
ORACLE instance started.
Total System Global Area 1610612736 bytes
Fixed Size 2096632 bytes
Variable Size 553648648 bytes
Database Buffers 1040187392 bytes
Redo Buffers 14680064 bytes
Database mounted.
再次尝试resetlogs打开数据库
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01139: RESETLOGS option only valid after an incomplete database recovery
SQL> recover database until cancel;
ORA-00279: change 187084379 generated at 08/21/2017 07:03:24 needed for thread 1
ORA-00289: suggestion : /u01/app/oracle/flash_recovery_area/ORCL/archivelog/2017_10_18/o1_mf_1_3408_%u_.arc
ORA-00280: change 187084379 for thread 1 is in sequence #3408
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
auto
ORA-00308: cannot open archived log '/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2017_10_18/o1_mf_1_3408_%u_.arc'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
ORA-00308: cannot open archived log '/u01/app/oracle/flash_recovery_area/ORCL/archivelog/2017_10_18/o1_mf_1_3408_%u_.arc'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/app/oracle/oradata/orcl/system01.dbf'
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
alter日志报如下错误:
Wed Oct 18 03:06:38 CST 2017
Errors in file /u01/app/oracle/admin/orcl/udump/orcl_ora_18057.trc:
ORA-00600: internal error code, arguments: [2662], [0], [187086068], [0], [187092671], [8388617], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [0], [187086066], [0], [187092671], [8388617], [], []
Wed Oct 18 03:06:39 CST 2017
Errors in file /u01/app/oracle/admin/orcl/udump/orcl_ora_18057.trc:
ORA-00600: internal error code, arguments: [2662], [0], [187086069], [0], [187092671], [8388617], [], []
ORA-00600: internal error code, arguments: [2662], [0], [187086068], [0], [187092671], [8388617], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [0], [187086066], [0], [187092671], [8388617], [], []
Wed Oct 18 03:06:40 CST 2017
Errors in file /u01/app/oracle/admin/orcl/udump/orcl_ora_18057.trc:
ORA-00600: internal error code, arguments: [2662], [0], [187086069], [0], [187092671], [8388617], [], []
ORA-00600: internal error code, arguments: [2662], [0], [187086068], [0], [187092671], [8388617], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [0], [187086066], [0], [187092671], [8388617], [], []
[2662]需要推进SCN,使SCN到一致的状态。
SQL> conn /as sysdba
Connected to an idle instance.
SQL> startup mount pfile='/u01/app/oracle/product/10.2.0/db_1/dbs/initorcl.ora';
ORACLE instance started.
Total System Global Area 167772160 bytes
Fixed Size 1272600 bytes
Variable Size 96470248 bytes
Database Buffers 67108864 bytes
Redo Buffers 2920448 bytes
Database mounted.
SQL> alter session set events '10015 trace name adjust_scn level 1';
Session altered.
SQL> alter database open;
Database altered.
注:之后要去掉两个参数,create spfile from pfile;
数据库正常启动,当前redo日志损坏,理论上会造成数据丢失,考虑到早7点,应用未有业务数据产生,观察应用以及测试功能正常,故未对数据库进行逻辑重建。