原创转载起注明出处
我们可能有时候会遇到如下错误
一、错误展示
SQL> select * from test123;
select * from test123
ORA-01578: ORACLE data block corrupted (file # 1, block # 61658)
ORA-01110: data file 1: '/oracle_ocfs/oradata/pprac/system01.dbf'
二、原因
这样的情况下可能是因为底层的硬件问题引起的,在进行读物理磁盘数据块的时候会进行物理一致性校验,也就是会计算上次写入磁盘的
时候的校验和和这次读取时候的校验和是否相同,不同则出现错误1578,参数db_block_checksum来控制这个功能10G默认是TRUE也就是对
所有文件进行校验如果设置为FALSE就是只对SYSTEM文件进行校验。当然还涉及到逻辑检验,但是这个错误是物理检验出错引起的。
三、校验工具
而我们有那些方法进行数据文件进行坏块校验哪?
1、DBV工具
命令和结果如下
[oracle@b000-vmracdb1 lib]$ dbv file=/oracle_ocfs/oradata/pprac/system01.dbf
DBVERIFY: Release 10.2.0.1.0 - Production on Mon Aug 9 16:43:53 2010
Copyright (c) 1982, 2005, Oracle. All rights reserved.
DBVERIFY - Verification starting : FILE = /oracle_ocfs/oradata/pprac/system01.dbf
Page 61658 is marked corrupt
Corrupt block relative dba: 0x0040f0da (file 1, block 61658)
Bad check value found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x0040f0da
last change scn: 0x0000.00d056c1 seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x56c10601
check value in block header: 0x6988
computed block checksum: 0xde02
DBVERIFY - Verification complete
Total Pages Examined : 64000
Total Pages Processed (Data) : 38268
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 7142
Total Pages Failing (Index): 0
Total Pages Processed (Other): 1901
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 16688
Total Pages Marked Corrupt : 1
Total Pages Influx : 0
Highest block SCN : 13691243 (0.13691243)
这里Total Pages Marked Corrupt : 1说明有了坏块
2、RMAN工具
使用命令
backup check logical validate database;
同时对物理和逻辑进行校验
然后我们可以查看视图v$database_block_corruption,结果如下
SQL> select * from v$database_block_corruption;
FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTION_TYPE
---------- ---------- ---------- ------------------ ---------------
1 61658 1 0 CHECKSUM
3、ANALYZE命令
命令是:ANALYZE TABLE TEST123 VALIDATE STRUCTURE CASCADE ONLINE;
四、模拟错误
为了试验我们使用BBED工具进行模拟,千万不要用这个工具在生产上玩哦。同时要建立一个表如下:
create table test123
(id number,name char(2000));
然后插入数据我的数据如下:
ID NAME
---------- --------------------------------------------------------------------------------
1 test
2 test2
3 test3
4 test4
5 test5
6 test6
7 test7
8 test8
9 test9
10 test10
然后确定文件号和数据块好如下:
SQL> select dbms_rowid.rowid_relative_fno(rowid) file_num,
2 dbms_rowid.rowid_block_number(rowid) block_num,id,name from test123;
FILE_NUM BLOCK_NUM ID NAME
---------- ---------- ---------- --------------------------------------------------------------------------------
1 61658 1 test
1 61658 2 test2
1 61658 3 test3
1 61659 4 test4
1 61659 5 test5
1 61659 6 test6
1 61660 7 test7
1 61660 8 test8
1 61660 9 test9
1 61661 10 test10
这里我准备用1,61658来试验
下面使用BBED来进行模拟错误:
(1)安装BBED目录在$ORACLE_HOME/rdbms/lib下面使用make -f ins_rdbms.mk $ORACLE_HOME/rdbms/lib/bbed来进行编译
(2)编写参数文件bbed.par如下:
[oracle@b000-vmracdb1 lib]$ more bbed.par
blocksize=8192
listfile=list
mode=edit
(3)编写list文件,里面放数据文件列表如下:
[oracle@b000-vmracdb1 lib]$ more list
1 /oracle_ocfs/oradata/pprac/system01.dbf
2 /oracle_ocfs/oradata/pprac/undotbs01.dbf
3 /oracle_ocfs/oradata/pprac/sysaux01.dbf
4 /oracle_ocfs/oradata/pprac/users01.dbf
5 /oracle_ocfs/oradata/pprac/example01.dbf
6 /oracle_ocfs/oradata/pprac/undotbs02.dbf
(4)进入BBED使用命令
bbed parfile=bbed.par 密码就是blockedit
[oracle@b000-vmracdb1 lib]$ /home/oracle/product/database/rdbms/lib/bbed parfile=bbed.par
Password:
BBED: Release 2.0.0.0.0 - Limited Production on Mon Aug 9 16:52:31 2010
Copyright (c) 1982, 2005, Oracle. All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED>
(5)找到块命令SET
BBED> set dba 1,61658
DBA 0x0040f0da (4255962 1,61658)
(6)我们来找到以test为开头的记录
BBED> find /c test
File: /oracle_ocfs/oradata/pprac/system01.dbf (1)
Block: 61658 Offsets: 2170 to 2681 Dba:0x0040f0da
------------------------------------------------------------------------
74657374 33202020 20202020 20202020 20202020 20202020 20202020 20202020
20202020 20202020 20202020 20202020 20202020 20202020 20202020 20202020
20202020 20202020 20202020 20202020 20202020 20202020 20202020 20202020
。。。。。。
这里看到了第一个test偏移量2170,这个也就是说相对整个块的一个位置,我也没有深究。
然后找到他
BBED> dump /v dba 1,61658 offset 2170 count 64
File: /oracle_ocfs/oradata/pprac/system01.dbf (1)
Block: 61658 Offsets: 2170 to 2233 Dba:0x0040f0da
-------------------------------------------------------
74657374 33202020 20202020 20202020 l test3
20202020 20202020 20202020 20202020 l
20202020 20202020 20202020 20202020 l
20202020 20202020 20202020 20202020 l
<16 bytes="" per="" line="">
看到了是test123
然后找下一个
BBED> f
File: /oracle_ocfs/oradata/pprac/system01.dbf (1)
Block: 61658 Offsets: 4179 to 4242 Dba:0x0040f0da
------------------------------------------------------------------------
74657374 32202020 20202020 20202020 20202020 20202020 20202020 20202020
20202020 20202020 20202020 20202020 20202020 20202020 20202020 20202020
<32 bytes="" per="" line="">
偏移量4179
BBED> dump /v dba 1,61658 offset 4179 count 64
File: /oracle_ocfs/oradata/pprac/system01.dbf (1)
Block: 61658 Offsets: 4179 to 4242 Dba:0x0040f0da
-------------------------------------------------------
74657374 32202020 20202020 20202020 l test2
20202020 20202020 20202020 20202020 l
20202020 20202020 20202020 20202020 l
20202020 20202020 20202020 20202020 l
同样也找到了
在往下还有test能找到,但是我这里已经被我修改了,找不到了。
然后我们修改,弄坏数据块
modify 1000 dba 1,61658 这里是修改的偏移量就是最后找到的TEST*的偏移量我改是test
修改完成后这个块就坏了。
但是这里注意由于DB_CACHE_BUFFER的原因会缓存数据块,即使修改了读取的时候从SGA读出来的话也不报错
需要清空BUFFER或者重启下,不要让数据库高速缓存缓存数据块。
然后我们
SQL> select * from test123 where id=1;
ID NAME
---------- --------------------------------------------------------------------------------
1 test
SQL> select * from test123;
select * from test123
ORA-01578: ORACLE data block corrupted (file # 1, block # 61658)
ORA-01110: data file 1: '/oracle_ocfs/oradata/pprac/system01.dbf'
错误就出来了。
五、如何解决
然后我们就是来解决这个问题,说实话如果是表的数据块坏了真的没有太多的办法,只能先解救我们没有
损坏块的数据方法如下:
---有RMAN备份和归档的情况下:
这个比较简单,使用命令
blockrecover datafile 1 block 61658;
就可以了,无需进行不完全恢复。
----没RMAN备份和归档的情况下:
1、设置SKIP
SQL> begin
2 dbms_repair.skip_corrupt_blocks(schema_name => 'SYS',object_name => 'TEST123',object_type => DBMS_REPAIR.TABLE_OBJECT,flags => DBMS_REPAIR.SKIP_FLAG);
3 END;
4 /
也就是跳过损坏的块。
2、查看下是否是SKIP
SQL> select table_name,skip_corrupt from dba_tables where table_name='TEST123';
TABLE_NAME SKIP_CORRUPT
------------------------------ ------------
TEST123 ENABLED
3、再次查看
SQL> select * from test123;
ID NAME
---------- --------------------------------------------------------------------------------
4 test4
5 test5
6 test6
7 test7
8 test8
9 test9
10 test10
木有了坏块的数据已经没有了。如果想恢复失去的数据,找找ORACLE SUPPORT,或者数据挖掘,反正都是很麻烦的事情,而且不一定行。
当然如果是索引的坏就重建吧,这个没事。
如下仅仅记录一下处理坏块需要的技术
未标记损坏块时,未损坏的记录还能通过RIWID读出
backup validate database;
select * from v$database_block_corruption;
SELECT e.owner, e.segment_type, e.segment_name, e.partition_name, c.file#
, greatest(e.block_id, c.block#) corr_start_block#
, least(e.block_id+e.blocks-1, c.block#+c.blocks-1) corr_end_block#
, least(e.block_id+e.blocks-1, c.block#+c.blocks-1)
- greatest(e.block_id, c.block#) + 1 blocks_corrupted
, corruption_type description
FROM dba_extents e, v$database_block_corruption c
WHERE e.file_id = c.file#
AND e.block_id <= c.block# + c.blocks - 1
AND e.block_id + e.blocks - 1 >= c.block#
UNION
SELECT s.owner, s.segment_type, s.segment_name, s.partition_name, c.file#
, header_block corr_start_block#
, header_block corr_end_block#
, 1 blocks_corrupted
, corruption_type||' Segment Header' description
FROM dba_segments s, v$database_block_corruption c
WHERE s.header_file = c.file#
AND s.header_block between c.block# and c.block# + c.blocks - 1
UNION
SELECT null owner, null segment_type, null segment_name, null partition_name, c.file#
, greatest(f.block_id, c.block#) corr_start_block#
, least(f.block_id+f.blocks-1, c.block#+c.blocks-1) corr_end_block#
, least(f.block_id+f.blocks-1, c.block#+c.blocks-1)
- greatest(f.block_id, c.block#) + 1 blocks_corrupted
, 'Free Block' description
FROM dba_free_space f, v$database_block_corruption c
WHERE f.file_id = c.file#
AND f.block_id <= c.block# + c.blocks - 1
AND f.block_id + f.blocks - 1 >= c.block#
order by file#, corr_start_block#;
execute DBMS_REPAIR.SKIP_CORRUPT_BLOCKS('SYS','TESTER',flags=>dbms_repair.skip_flag);
execute DBMS_REPAIR.SKIP_CORRUPT_BLOCKS('SYS','TESTER',flags=>dbms_repair.noskip_flag);
11GR2
backup validate database;
RECOVER CORRUPTION LIST;
DRA 11G
backup validate database;
list failure;
advise failure;
repair failure preview;
repair failure;
可以使用下面语法代替RECOVER CORRUPTION LIST;11R2以前
run {blockrecover corruption list;}
修复单块
blockrecover datafile 1 block 61658;
ROWID_CREATE函数
function ROWID_CREATE(rowid_type IN number,
object_number IN number,
relative_fno IN number,
block_number IN number,
row_number IN number)
return ROWID;
如果是LOB块的错误需要另外的处理的方式
http://blog.itpub.net/7728585/viewspace-1580285/