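A 10.2.0.4 system on Linux began raising the internal error ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], [] after an abnormal recovery. During that recovery the database had been opened with the hidden parameter _allow_resetlogs_corruption, had hit ORA-00600 [40xx]-family internal errors, and had then been switched to a newly created undo tablespace. The [4097] error is treated as non-fatal: SMON records each occurrence in the alert log (alert.log) and counts it against a maximum of 100 non-fatal internal errors. It may nevertheless lead to an instance crash. The relevant log output is as follows: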
Errors in file /s01/10gdb/admin/clinica/bdump/clinica_smon_21463.trc:
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Tue Jan 4 23:13:19 2011
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 1 out of maximum 100 non-fatal internal errors.

clinica_smon_21463.trc:

Dump of buffer cache at level 4 for tsn=1, rdba=8388633
BH (0x91fdf428) file#: 2 rdba: 0x00800019 (2/25) class: 19 ba: 0x91c62000
  set: 3 blksize: 8192 bsi: 0 set-flg: 0 pwbcnt: 0
  dbwrid: 0 obj: -1 objn: 0 tsn: 1 afn: 2
  hash: [fcf7dd68,fcf7dd68] lru: [91fdf5b8,91fdf398]
  ckptq: [NULL] fileq: [NULL] objq: [f5b53d60,f5b53d60]
  use: [fa694970,fa694970] wait: [NULL]
  st: XCURRENT md: SHR tch: 0
  flags: gotten_in_current_mode
  LRBA: [0x0.0.0] HSCN: [0xffff.ffffffff] HSUB: [65535]
  buffer tsn: 1 rdba: 0x00800019 (2/25)
  scn: 0x0000.0352d07c seq: 0x01 flg: 0x00 tail: 0xd07c2601
  frmt: 0x02 chkval: 0x0000 type: 0x26=KTU SMU HEADER BLOCK

/* A block with tsn=1, file#=2 is dumped here.
   Its type is KTU SMU HEADER BLOCK, i.e. the header block of an undo (rollback) segment */

Hex dump of block: st=0, typ_found=1
........................

ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Current SQL statement for this session:
insert into smon_scn_time (thread, time_mp, time_dp, scn, scn_wrp, scn_bas, num_mappings, tim_scn_map) values (0, :1, :2, :3, :4, :5, :6, :7)
----- Call Stack Trace -----
calling                        call  entry                       argument values in hex
location                       type  point                       (? means dubious value)
------------------------------ ----- --------------------------- ----------------------------
ksedst()+31                    call  ksedst1()                   000000000 ? 000000001 ? 7FFFF53BC160 ? 7FFFF53BC1C0 ? 7FFFF53BC100 ? 000000000 ?
ksedmp()+610                   call  ksedst()                    000000000 ? 000000001 ? 7FFFF53BC160 ? 7FFFF53BC1C0 ? 7FFFF53BC100 ? 000000000 ?
ksfdmp()+21                    call  ksedmp()                    000000003 ? 000000001 ? 7FFFF53BC160 ? 7FFFF53BC1C0 ? 7FFFF53BC100 ? 000000000 ?
kgeriv()+176                   call  ksfdmp()                    000000003 ? 000000001 ? 7FFFF53BC160 ? 7FFFF53BC1C0 ? 7FFFF53BC100 ? 000000000 ?
kgesiv()+119                   call  kgeriv()                    0068C97C0 ? 2ABDF1D42BF0 ? 000000000 ? 0F4A33EA0 ? 7FFFF53BC100 ? 000000000 ?
ksesic0()+209                  call  kgesiv()                    0068C97C0 ? 2ABDF1D42BF0 ? 000001001 ? 000000000 ? 7FFFF53BCEE0 ? 000000000 ?
ktugti()+3200                  call  ksesic0()                   000001001 ? 0068C9940 ? 000000000 ? 00000009A ? 000000010 ? 101010101010101 ?
ktsftcmove()+4149              call  ktugti()                    0B73F111C ? 7FFFF53BD278 ? 7FFFF53BD280 ? 000000000 ? 7FFFF53BD27C ? 7FFFF53BD270 ?
ktsf_gsp()+1937                call  ktsftcmove()                00000000A ? 000000000 ? 000000000 ? 000000000 ? 7FFFF53BD27C ? 7FFFF53BD270 ?
kdtgsp()+512                   call  ktsf_gsp()                  000000000 ? 7FFFF53BF460 ? 000000024 ? 000000002 ? 7FFFF53BF460 ? 000000000 ?
kdccak()+111                   call  kdtgsp()                    2ABDF1D6A2D8 ? 7FFF00000000 ? 2ABDF1D68530 ? 000000002 ? 7FFFF53BF460 ? 000000000 ?
kdcgcs()+5419                  call  kdccak()                    2ABDF1D6A2D8 ? 000000001 ? 0F4A3BBA8 ? 000000000 ? 2ABDF1D6A370 ? 000000000 ?
kdcgsp()+1372                  call  kdcgcs()                    2ABDF1D6A2D8 ? 000000001 ? 0F4A3BBA8 ? 000000000 ? 2ABDF1D6A370 ? 000000000 ?
kdtInsRow()+1808               call  kdcgsp()                    2ABDF1D6A2D8 ? 000000001 ? 0F4A3BBA8 ? 000000000 ? 2ABDF1D6A370 ? 000000000 ?
insrow()+342                   call  kdtInsRow()                 2ABDF1D6A2D8 ? 000000001 ? 0F4A3BBA8 ? 000000000 ? 2ABDF1D6A370 ? 000000000 ?
insdrv()+594                   call  insrow()                    2ABDF1D6A2D8 ? 7FFFF53BFCC8 ? 000000000 ? 0F4A33DE0 ? 2ABDF1D6A370 ? 000000000 ?
inscovexe()+404                call  insdrv()                    2ABDF1D6A2D8 ? 7FFFF53BFCC8 ? 000000000 ? 2ABDF1D6D908 ? 2ABDF1D6A370 ? 000000000 ?
insExecStmtExecIniEngine()+85  call  inscovexe()                 0F4A33DE0 ? 0F4A3C230 ? 7FFFF53C0EF0 ? 2ABDF1D69F20 ? 2ABDF1D6A370 ? 000000000 ?
insexe()+386                   call  insExecStmtExecIniEngine()  0F4A33DE0 ? 0F4A3C230 ? 2ABDF1D69F20 ? 2ABDF1D69F20 ? 2ABDF1D6A370 ? 000000000 ?
opiexe()+9182                  call  insexe()                    0F4A333A8 ? 7FFFF53C0EF0 ? 0F4A33DE0 ? 2ABDF1D69F20 ? 2ABDF1D6A370 ? 2ABDF1D69F20 ?
opiall0()+1842                 call  opiexe()                    000000049 ? 000000003 ? 7FFFF53C12F8 ? 000000001 ?
..............

For this ORA-00600 [4097] internal error, Metalink Note [ID 1030620.6] describes a workaround:
An ORA-600 [4097] can be encountered through various activities that use rollback segments.

Solution Description:
=====================

The most likely cause of this is BUG 427389. This BUG is fixed in version 7.3.3.3.

The BUG is caused when Rollback Segments are dropped and recreated after a shutdown abort. It is encountered through a very specific set of circumstances:

When an instance has a rollback segment offline and the instance crashes, or the user does a shutdown abort, the rollback segment wrap number does not get updated. If that segment is then dropped and recreated immediately after the instance is restarted, the wrap number could be lower than existing wrap numbers. This will cause the ORA-600[4097] to occur in subsequent transactions using Rollback.

To avoid encountering this bug, rollback segments should only be dropped and recreated after the instance has been shutdown normal and restarted.

If you have already encountered the bug, use the following workaround:

Select segment_name, segment_id from dba_rollback_segs;
Drop all Rollback Segments except for SYSTEM.
Recreate dummy (small) rollback segments with the same names in their place.
Then, recreate additional rollback segments you want to keep with their permanent storage parameters.
Now drop the dummy ones.

This should ensure that the segment_ids are not reused. If you ever want to add a rollback segment you have to use the workaround steps again. If you do not fill the dummy slots you may see the problem re-appear.

We can try to work around the problem by dropping the potentially damaged rollback segments that already existed before the abnormal recovery. Although 10g uses automatic undo management (AUM), this is still possible:
SQL> alter system set "_smu_debug_mode"=4;

System altered.

/* Set the SMU debug mode to 4 so that rollback segments can be managed manually */

SQL> set heading off
SQL> select 'drop rollback segment "'||segment_name||'";' from dba_rollback_segs where segment_name!='SYSTEM';

drop rollback segment "_SYSSMU1$";
drop rollback segment "_SYSSMU2$";
drop rollback segment "_SYSSMU3$";
drop rollback segment "_SYSSMU4$";
drop rollback segment "_SYSSMU5$";
drop rollback segment "_SYSSMU6$";
drop rollback segment "_SYSSMU7$";
drop rollback segment "_SYSSMU8$";
drop rollback segment "_SYSSMU9$";
drop rollback segment "_SYSSMU10$";
drop rollback segment "_SYSSMU11$";
drop rollback segment "_SYSSMU12$";
drop rollback segment "_SYSSMU13$";
drop rollback segment "_SYSSMU14$";
drop rollback segment "_SYSSMU15$";
drop rollback segment "_SYSSMU16$";
drop rollback segment "_SYSSMU17$";
drop rollback segment "_SYSSMU18$";
drop rollback segment "_SYSSMU19$";
drop rollback segment "_SYSSMU20$";
drop rollback segment "_SYSSMU21$";
drop rollback segment "_SYSSMU22$";
drop rollback segment "_SYSSMU23$";
drop rollback segment "_SYSSMU24$";
drop rollback segment "_SYSSMU25$";
drop rollback segment "_SYSSMU26$";
drop rollback segment "_SYSSMU27$";
drop rollback segment "_SYSSMU28$";
drop rollback segment "_SYSSMU29$";
drop rollback segment "_SYSSMU30$";

30 rows selected.

/* Run the generated drop rollback segment commands one by one.
   Note that rollback segments in the current undo tablespace can only be taken offline, not dropped.
   In practice, all we need to do is drop the problem rollback segments in the previous undo tablespace */

SQL> alter rollback segment "_SYSSMU30$" offline;

Rollback segment altered.

SQL> drop rollback segment "_SYSSMU30$";
drop rollback segment "_SYSSMU30$"
*
ERROR at line 1:
ORA-30025: DROP segment '_SYSSMU30$' (in undo tablespace) not allowed

SQL> alter rollback segment "_SYSSMU30$" online;

Rollback segment altered.
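The generated script above covers every non-SYSTEM segment and relies on ORA-30025 to protect the segments of the active undo tablespace. A narrower variant, given here only as a sketch, lists each segment with its tablespace and generates DROP statements solely for segments outside the active undo tablespace. It assumes that the undo_tablespace parameter names the new undo tablespace and that upper-casing its value makes it match tablespace_name in dba_rollback_segs; neither assumption comes from the original session.

```sql
-- Show which undo tablespace the instance is currently using
show parameter undo_tablespace

-- List every rollback segment with its tablespace and status
select segment_name, tablespace_name, status
  from dba_rollback_segs
 order by tablespace_name, segment_id;

-- Generate DROP statements only for segments that are NOT in the
-- active undo tablespace (assumption: these are the pre-recovery segments)
select 'drop rollback segment "' || segment_name || '";'
  from dba_rollback_segs
 where segment_name != 'SYSTEM'
   and tablespace_name != (select upper(value)
                             from v$parameter
                            where name = 'undo_tablespace');
```

Reviewing the second query's output before running any generated DROP is a cheap way to confirm that only the old undo tablespace is affected.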
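After the problem rollback segments were dropped as described above, the system no longer raised ORA-00600 [4097] and the instance returned to normal. Once the system is healthy again, the hidden UNDO-management debug parameter "_smu_debug_mode" set earlier should be reset.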
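A possible way to perform that reset is sketched below. It assumes that 0 is the default value of "_smu_debug_mode" in this release and that the session is connected as SYS, which the x$ query requires. Verify both against your own instance before relying on it.

```sql
-- Check the current value of the hidden parameter (requires SYS)
select a.ksppinm  as name,
       b.ksppstvl as value
  from x$ksppi a, x$ksppcv b
 where a.indx = b.indx
   and a.ksppinm = '_smu_debug_mode';

-- Put the running instance back to the assumed default of 0
alter system set "_smu_debug_mode"=0;

-- If the earlier ALTER SYSTEM SET also wrote the value to an spfile,
-- remove that entry so it does not return after a restart.
-- This only applies when an spfile is in use and contains the entry.
alter system reset "_smu_debug_mode" scope=spfile sid='*';
```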
Reposted from maclean_007's 51CTO blog. Original link: http://blog.51cto.com/maclean/1277686