Oracle坏块处理

简介:

oracle 坏块问题处理

1、说明

坏块问题是经常出现在数据库系统中的,如果没有合适的处理方法往往会导致坏块对象不可用或者数据丢失。本文从坏块产生开始说起,着重说明一但坏块产生后不同情况的处理方法。

2、坏块产生的原因:

硬件的I/O错误 
操作系统的I/O错误或缓冲问题 
内存或paging问题 
磁盘修复工具 
一个数据文件的一部分正在被覆盖 
Oracle试图访问一个未被格式化的系统块失败 
数据文件部分溢出 
Oracle或者操作系统的
bug

3、坏块的发现:

3.1、Alter日志报错

Tue Aug 17 10:48:07 2010

Corrupt Block Found

TSN = 7, TSNAME = BTEST

RFN 6, BLK = 839, rdba = 25166663

OBJN = 49205, BJD = 49205, BJECT= BOBJ, SUBOBJECT =

Segment wner= AN, Segment Type = Table Segment

其中RFN表示的是relative_fno,6号文件的839块,段类型为表段,是表bobj出现坏块。

3.2、查询报错

SQL> select count(*) from bobj;

select count(*) from bobj

*

ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 6, block # 839)

ORA-01110: data file 6: 'F:\ORACLE\PRODUCT\10.1.0\ORADATA\DB10\BTEST.DBF'

3.3、分析表报错

SQL> analyze table bobj validate structure cascade;

analyze table bobj validate structure cascade

*

ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 6, block # 839)

ORA-01110: data file 6: 'F:\ORACLE\PRODUCT\10.1.0\ORADATA\DB10\BTEST.DBF'

3.4、Rman备份报错

RMAN> backup tablespace btest;

Starting backup at 17-8月 -10

using channel ORA_DISK_1

channel ORA_DISK_1: starting full datafile backupset

channel ORA_DISK_1: specifying datafile(s) in backupset

input datafile fno=00006 name=F:\ORACLE\PRODUCT\10.1.0\ORADATA\DB10\BTEST.DBF

channel ORA_DISK_1: starting piece 1 at 17-8月 -10

RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03009: failure of backup command on ORA_DISK_1 channel at 08/17/2010 11:03:09

ORA-19566: exceeded limit of 0 corrupt blocks for file F:\ORACLE\PRODUCT\10.1.0\ORADATA\DB10\BTEST.DBF

3.5、Dbv检查报错

F:\oracle\product\10.1.0\oradata\db10>dbv file=BTEST.DBF blocksize=8192

DBVERIFY: Release 10.1.0.2.0 - Production on 星期二 8月 17 10:49:02 2010

Copyright (c) 1982, 2004, Oracle. All rights reserved.

DBVERIFY - Verification starting : FILE = BTEST.DBF

Page 839 is marked corrupt

Corrupt block relative dba: 0x01800347 (file 6, block 839)

Bad check value found during dbv:

Data in bad block:

type: 6 format: 2 rdba: 0x01800347

last change scn: 0x0000.0005246f seq: 0x1 flg: 0x04

spare1: 0x0 spare2: 0x0 spare3: 0x0

consistency value in tail: 0x246f0601

check value in block header: 0x50c2

computed block checksum: 0x2751

DBVERIFY - Verification complete

Total Pages Examined : 5376

Total Pages Processed (Data) : 5165

Total Pages Failing (Data) : 0

Total Pages Processed (Index): 0

Total Pages Failing (Index): 0

Total Pages Processed (Other): 9

Total Pages Processed (Seg) : 0

Total Pages Failing (Seg) : 0

Total Pages Empty : 201

Total Pages Marked Corrupt : 1

Total Pages Influx : 0

3.6、查询坏块发生的数据文件以及对应表空间

Select file_name,tablespace_name,file_id “AFN”,relative_fno “RFN”
From dba_data_files; 
Select file_name,tablespace_name,file_id, relative_fno“RFN”
From dba_temp_files; 

  

3.7、查询存在坏块的对象是什么:

SELECT tablespace_name, segment_type, owner,

segment_name, partition_name FROM

dba_extents WHERE file_id =v_file_id and v_block_id

between block_id AND block_id + blocks – 1

4、出现坏块的常见对象:

Sys用户下的对象 
回滚段 
临时段 
索引或者分区索引

5、问题的处理

5.1、无备份情况下的恢复

5.1.1、Sys用户下的对象,需要谨慎处理。

5.1.2、回滚段坏块采用类似ora-600(4000)的处理方法将回滚段删除即可。会造成事物失败,数据丢失。

5.1.3、索引:重建,期间会锁表对业务有影响,数据不会丢失。

5.1.4、表:

5.1.4.1、EVENT10231

SQL> select count(*)from bobj;

COUNT(*)

----------

376240

SQL> analyze table bobj validate structure cascade;

analyze table bobj validate structure cascade

*

ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 6, block # 495)

ORA-01110: data file 6: 'F:\ORACLE\PRODUCT\10.1.0\ORADATA\DB10\BTEST.DBF'

SQL> alter session set events ='10231 trace name context forever,level 10';

Session altered.

SQL> select count(*) from bobj;

COUNT(*)

----------

376169

忽略了坏块进行读写,所以数据量上有变化,丢失了一部分数据。

5.1.4.2、ROWID RANGE SCAN

FUNCTION ROWID_CREATE RETURNS ROWID

Argument Name Type In/Out Default?

------------------------------ ----------------------- ------ --------

ROWID_TYPE NUMBER IN

OBJECT_NUMBER NUMBER IN

RELATIVE_FNO NUMBER IN

BLOCK_NUMBER NUMBER IN

ROW_NUMBER NUMBER IN

SQL> select * from v$database_block_corruption;

FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO

---------- ---------- ---------- ------------------ ---------

6 839 1 0 CHECKSUM

SQL> SELECT dbms_rowid.rowid_create(1,49205,6,839,0) from DUAL;

DBMS_ROWID.ROWID_C

------------------

AAAMA1AAGAAAANHAAA

SQL> SELECT dbms_rowid.rowid_create(1,49205,6,840,0) from DUAL;

DBMS_ROWID.ROWID_C

------------------

AAAMA1AAGAAAANIAAA

SQL> create table bbobj tablespace btest as select * from bobj where 1=2;

Table created.

SQL> insert into bbobj select /*+ rowid(a) */ * from bobj where rowid<'AAAMA1AAG

AAAANHAAA';

60993 rows created.

SQL> insert into bbobj select /*+ rowid(a) */ * from bobj where rowid>='AAAMA1AAG

AAAANIAAA';

315108 rows created.

SQL> commit;

Commit complete.

SQL> select count(*) from bbobj;

COUNT(*)

----------

376101

从v$database_block_corruption中可以查处当前数据库中的坏块信息

通过调用dbms_rowid.rowid_create确认出坏块对应的rowid,重新创建表结构相同的表,并以rowid为条件跳过坏块将好数据存入到中间表中,损失一部分数据,仍然可以将坏块的影响忽略掉。

5.1.4.3、Dbms_repair

以sys用户执行

PROCEDURE SKIP_CORRUPT_BLOCKS

Argument Name Type In/Out Default?

------------------------------ ----------------------- ------ --------

SCHEMA_NAME VARCHAR2 IN

OBJECT_NAME VARCHAR2 IN

OBJECT_TYPE BINARY_INTEGER IN DEFAULT

FLAGS BINARY_INTEGER IN DEFAULT

SQL> Execute DBMS_REPAIR.SKIP_CORRUPT_BLOCKS('AN','BOBJ')

PL/SQL procedure successfully completed.

SQL> conn an/an

Connected.

SQL> select count(*) from bobj;

COUNT(*)

----------

376169

试用sys用户调用dbms_repair中的skip_corrupt_blocks忽略对象中的坏块,同样会丢失坏块中的数据。包中还包含过程FIX_CORRUPT_BLOCKS,可以fix掉坏块。

5.2、有备份情况下的恢复

5.2.1、Blockrecover

RMAN> blockrecover datafile 5 block 425;

Starting blockrecover at 17-8 -10

using channel ORA_DISK_1

channel ORA_DISK_1: restoring block(s)

channel ORA_DISK_1: specifying block(s) to restore from backup set

restoring blocks of datafile 00005

channel ORA_DISK_1: restored block(s) from backup piece 1

piece handle=F:\ORACLE\PRODUCT\10.1.0\FLASH_RECOVERY_AREA\DB10\BACKUPSET\2010_08_17\O1_MF_NNNDF_TAG20100817T162508_66NKV5B8_.BKP tag=TAG20100817T162508

channel ORA_DISK_1: block restore complete

starting media recovery

media recovery complete

Finished blockrecover at 17-8 -10

SQL> select count(*) from an.atest;

COUNT(*)

----------

376240

5.2.2、Recover datafile

SQL> select count(*) from an.atest;

select count(*) from an.atest

*

ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 5, block # 963)

ORA-01110: data file 5: 'F:\ORACLE\PRODUCT\10.1.0\ORADATA\DB10\ATEST.DBF'

RMAN> sql " alter database datafile 5 offline";

sql statement: alter database datafile 5 offline

RMAN> restore datafile 5;

Starting restore at 17-8 -10

using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backupset restore

channel ORA_DISK_1: specifying datafile(s) to restore from backup set

restoring datafile 00005 to F:\ORACLE\PRODUCT\10.1.0\ORADATA\DB10\ATEST.DBF

channel ORA_DISK_1: restored backup piece 1

piece handle=F:\ORACLE\PRODUCT\10.1.0\FLASH_RECOVERY_AREA\DB10\BACKUPSET\2010_08_17\O1_MF_NNNDF_TAG20100817T162508_66NKV5B8_.BKP tag=TAG20100817T162508

channel ORA_DISK_1: restore complete

Finished restore at 17-8 -10

RMAN> recover datafile 5;

Starting recover at 17-8 -10

using channel ORA_DISK_1

starting media recovery

media recovery complete

Finished recover at 17-8 -10

RMAN> sql " alter database datafile 5 online";

sql statement: alter database datafile 5 online

SQL> select count(*) from an.atest;

COUNT(*)

----------

376240

5.3、Rman备份跳过坏块

SET MAXCORRUPT FOR DATAFILE filename TO n

例:

run {

allocate channel node1 type disk;

SET MAXCORRUPT FOR DATAFILE 8 TO 3

set limit channel node1 kbytes = 1800000;

backup as compressed backupset full database format '$BACKUPDIR/full_%d_%T_%s_%p' plus archivelog format '$BACKUPDIR/arch_%d_%T_%s_%p' delete all input;

backup current controlfile format '$BACKUPDIR/ctl_%d_%T_%s_%p' TAG "control.bak";

release channel node1;

}

 

 










本文转自东方之子736651CTO博客,原文链接:http://blog.51cto.com/ecloud/1436786 ,如需转载请自行联系原作者





相关文章
|
Oracle 关系型数据库 数据库
实战篇:Oracle 数据坏块的 N 种修复方式
实战篇:Oracle 数据坏块的 N 种修复方式
实战篇:Oracle 数据坏块的 N 种修复方式
|
3月前
|
Oracle 关系型数据库 数据库
数据库数据恢复—Oracle数据库文件出现坏块的数据恢复案例
打开oracle数据库报错“system01.dbf需要更多的恢复来保持一致性,数据库无法打开”。 数据库没有备份,无法通过备份去恢复数据库。用户方联系北亚企安数据恢复中心并提供Oracle_Home目录中的所有文件,急需恢复zxfg用户下的数据。 出现“system01.dbf需要更多的恢复来保持一致性”这个报错的原因可能是控制文件损坏、数据文件损坏,数据文件与控制文件的SCN不一致等。数据库恢复工程师对数据库文件进一步检测、分析后,发现sysaux01.dbf文件损坏,有坏块。 修复并启动数据库后仍然有许多查询报错,export和data pump工具使用报错。从数据库层面无法修复数据库。
数据库数据恢复—Oracle数据库文件出现坏块的数据恢复案例
|
存储 网络协议 Oracle
Oracle 12c rman duplicate网卡offload引起文件坏块问题
Oracle 12c rman duplicate网卡offload引起文件坏块问题
264 0
|
Oracle 关系型数据库 数据库
|
Oracle 关系型数据库 数据库
|
Oracle 关系型数据库 数据库管理
|
Oracle 关系型数据库 数据库