[20160830]Purging logs and trace files.txt
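--Our database's dataguard host is very short on disk space. A few days ago some abnormal business operations left the dataguard disk without enough free space.
--Log switch history: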
Date Day Total H0 h1 h2 h3 h4 h5 h6 h7 h8 h9 h10 h11 h12 h13 h14 h15 h16 h17 h18 h19 h20 h21 h22 h23 Avg
------------------- ------ ----- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- -------
2016-08-24 00:00:00 Wed 2 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 .08
2016-08-23 00:00:00 Tue 52 0 0 0 0 0 4 0 0 1 2 1 1 2 0 1 1 3 1 2 16 17 0 0 0 2.17
2016-08-22 00:00:00 Mon 13 0 0 0 0 0 4 0 0 0 0 1 0 1 0 2 0 1 1 0 2 1 0 0 0 .54
2016-08-21 00:00:00 Sun 4 0 0 0 0 0 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 .17
2016-08-20 00:00:00 Sat 10 0 1 0 0 2 4 0 0 0 1 0 0 0 0 0 2 0 0 0 0 0 0 0 0 .42
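--//Whenever the dataguard disk runs short of space, the primary turns out to be switching logs very frequently; note the switches in the 19:00 and 20:00 hours on 8/23. In fact the anomaly had
--//already started on 8-22 or even 8-20. Normally there are 4-6 switches per day (our redo log files are large, 6G each; the 4 switches at 05:00 come from the backup).
--The abnormal business operations have been handed over to development. I started cleaning up trace files and some logs. When checking the directory /u01/app/oracle/admin/dbcndg/adump I found
--that it takes up 5g of space. I went into the directory and ran: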
ls -l
or
ls -l | head
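--and the session even appeared to hang. With no other option, I temporarily shut down the dg and simply ran rm -rf /u01/app/oracle/admin/dbcndg/adump,
--then recreated the /u01/app/oracle/admin/dbcndg/adump directory and started the dg database. Continuing to watch: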
# ls -ltr
..
-rw-r----- 1 oracle oinstall 939 2016-08-24 15:55:25 dbcndg_ora_5520_20160824155525803511143795.aud
-rw-r----- 1 oracle oinstall 939 2016-08-24 15:55:40 dbcndg_ora_5526_20160824155540809233143795.aud
-rw-r----- 1 oracle oinstall 939 2016-08-24 15:55:55 dbcndg_ora_5530_20160824155555805718143795.aud
-rw-r----- 1 oracle oinstall 939 2016-08-24 15:56:10 dbcndg_ora_5533_20160824155610804032143795.aud
-rw-r----- 1 oracle oinstall 939 2016-08-24 15:56:25 dbcndg_ora_5535_20160824155625805177143795.aud
-rw-r----- 1 oracle oinstall 939 2016-08-24 15:56:40 dbcndg_ora_5540_20160824155640806221143795.aud
-rw-r----- 1 oracle oinstall 939 2016-08-24 15:56:55 dbcndg_ora_5552_20160824155655804836143795.aud
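--//One aud file every 15 seconds slowly eats the disk. This is actually quite alarming: 86400/15=5760 files per day, more than 10,000 in 2 days. No wonder ls could not finish.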
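--//The arithmetic above, and a way to count the files without ls -l having to sort and stat every entry, can be sketched in shell.
--//The count_aud helper name is made up for illustration; -maxdepth assumes GNU or BSD find.

```shell
# One audit file every 15 seconds:
interval=15
per_day=$((86400 / interval))
echo "files per day:   $per_day"
echo "files in 2 days: $((per_day * 2))"

# Hypothetical helper: count *.aud files directly inside a directory.
# find streams directory entries instead of sorting them first,
# so it tends to stay responsive where "ls -l" appears to hang.
count_aud() {
    find "$1" -maxdepth 1 -type f -name '*.aud' | wc -l
}

# Example (path taken from this note; adjust to your own adump directory):
# count_aud /u01/app/oracle/admin/dbcndg/adump
```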
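--//Looking at the contents closely, 12c keeps logging in to check the dg database. I suggest lowering the check frequency; personally I think every 5 minutes is about right.
--//btw: my colleague changed it to 1 hour, which seems a bit too long.
--This made me think about how to clean up periodically. I could write a script and schedule it, or use logrotate. Then I remembered that oracle ships with this feature itself, only the default retention is too long.
--This is a dg anyway; keeping this information for 10 days is generally enough.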
$ rlwrap adrci
ADRCI: Release 11.2.0.4.0 - Production on Wed Aug 31 10:16:38 2016
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
ADR base = "/u01/app/oracle"
adrci> show homes
ADR Homes:
diag/rdbms/dummy/book
diag/rdbms/book/book
diag/rdbms/book1/book
adrci> set homes diag/rdbms/book/book
adrci> show control
ADR Home = /u01/app/oracle/diag/rdbms/book/book:
*************************************************************************
ADRID SHORTP_POLICY LONGP_POLICY LAST_MOD_TIME LAST_AUTOPRG_TIME LAST_MANUPRG_TIME ADRDIR_VERSION ADRSCHM_VERSION ADRSCHMV_SUMMARY ADRALERT_VERSION CREATE_TIME
---------- ------------- ------------ --------------------------------- --------------------------------- --------------------------------- -------------- --------------- ---------------- ---------------- ---------------------------------
2363806166 720 8760 2015-11-24 09:10:20.616251 +08:00 2016-08-31 08:41:06.737564 +08:00 2016-05-16 08:44:53.358797 +08:00 1 2 80 1 2015-11-24 09:10:20.616251 +08:00
1 rows fetched
--//The policy values are in hours (720 hours = 30 days, 8760 hours = 365 days).
--//Find the process id.
$ pgrep adrci
53405
--//Trace it with strace:
$ strace -p 53405
--It turns out to open:
open("/u01/app/oracle/diag/rdbms/book/book/metadata/ADR_CONTROL.ams", O_RDONLY|O_SYNC|O_DIRECT) = 3
SCOTT@book> @ &r/10to16 2363806166
10 to 16 HEX REVERSE16
-------------- -----------------------------------
000008ce4d1d6 0xd6d1e48c-00000000
SCOTT@book> @ &r/10to16 720
10 to 16 HEX REVERSE16
-------------- -----------------------------------
00000000002d0 0xd0020000-00000000
SCOTT@book> @ &r/10to16 8760
10 to 16 HEX REVERSE16
-------------- -----------------------------------
0000000002238 0x38220000-00000000
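--//The 10to16 script is my own and is not shown here. Without it, the same conversion can be approximated in shell with printf.
--//This is only a sketch; the to_hex name is hypothetical.

```shell
# Hypothetical stand-in for the 10to16 script: print a decimal value
# as 8 hex digits (4 bytes, most significant byte first).
to_hex() {
    printf '%08x\n' "$1"
}

to_hex 2363806166   # ADRID         -> 8ce4d1d6
to_hex 720          # SHORTP_POLICY -> 000002d0
to_hex 8760         # LONGP_POLICY  -> 00002238
```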
--//Inspect the file /u01/app/oracle/diag/rdbms/book/book/metadata/ADR_CONTROL.ams with bvi.
0000BED0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0000BEE0 00 00 00 00 00 00 00 00 00 6C 00 0B 00 04 8C E4 .........l......
~~~~~
0000BEF0 D1 D6 04 00 00 02 D0 04 00 00 22 38 0D 78 73 0B .........."8.xs.
~~~~~ ~~~~~`~~~~~ ~~~~~~~~~~~
0000BF00 18 02 0B 15 24 BB 3E 78 1C 3C 0D 78 74 08 1F 01 ....$.>x.<.xt...
0000BF10 2A 07 2B F6 55 60 1C 3C 0D 78 74 05 10 01 2D 36 *.+.U`.<.xt...-6
0000BF20 15 62 CE C8 1C 3C FF 04 00 00 00 02 04 00 00 00 .b...<..........
0000BF30 50 FF 0D 78 73 0B 18 02 0B 15 24 BB 3E 78 1C 3C P..xs.....$.>x.<
0000BF40 6C 00 0B 00 04 8C E4 D1 D6 04 00 00 02 D0 04 00 l...............
0000BF50 00 22 38 0D 78 73 0B 18 02 0B 15 24 BB 3E 78 1C . 8.xs.....$.>x.
0000BF60 3C 0D 78 74 05 0F 02 18 1D 33 47 1E F8 1C 3C FF <.xt.....3G...<.
0000BF70 FF 04 00 00 00 02 04 00 00 00 50 FF 0D 78 73 0B ..........P..xs.
0000BF80 18 02 0B 15 24 BB 3E 78 1C 3C 6C 00 0B 00 04 8C ....$.>x.<l.....
0000BF90 E4 D1 D6 04 00 00 02 D0 04 00 00 22 38 0D 78 73 ..........."8.xs
0000BFA0 0B 18 02 0B 15 24 BB 3E 78 1C 3C FF FF FF 04 00 .....$.>x.<.....
0000BFB0 00 00 02 04 00 00 00 50 FF 0D 78 73 0B 18 02 0B .......P..xs....
0000BFC0 15 24 BB 3E 78 1C 3C AC 00 06 01 00 01 00 00 00 .$.>x.<.........
0000BFD0 00 05 00 00 00 00 00 00 00 00 02 FF FF 06 FF FF ................
0000BFE0 00 00 00 00 06 FF FF 00 00 00 00 06 FF FF 00 00 ................
0000BFF0 00 00 06 FF FF 00 00 00 00 06 FF FF 00 00 00 00 ................
0000C000 06 01 05 00 0C 00 00 00 1B 2D 00 00 01 00 00 00 .........-......
0000C010 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
0000C020 00 00 00 00 01 00 02 00 00 00 00 00 00 00 00 00 ................
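--//Looking carefully, the corresponding values can be found (marked with ~), so changing the policy only modifies this file.
--//The 0x04 in front of each ~ value presumably gives the length of the content in bytes.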
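--//To locate such a record without paging through a hex editor, one could dump the file as a hex string and search for the expected bytes.
--//The sketch assumes, as guessed above, that each 4-byte value is stored most significant byte first and preceded by a 0x04 length byte.
--//find_policy_record is a hypothetical helper name.

```shell
# Hypothetical helper: report whether a file contains the record
#   04 <ADRID> 04 <SHORTP_POLICY> 04 <LONGP_POLICY>
# with each value stored as 4 bytes, most significant byte first.
# The 0x04 length prefix is an assumption drawn from the dump above.
find_policy_record() {
    file=$1 adrid=$2 shortp=$3 longp=$4
    pattern=$(printf '04%08x04%08x04%08x' "$adrid" "$shortp" "$longp")
    # od prints every byte as two hex digits; tr joins them into one string.
    # A match starting in the middle of a byte is theoretically possible,
    # so treat a hit as a hint to confirm in the hex editor.
    if od -An -v -tx1 "$file" | tr -d ' \n' | grep -q "$pattern"; then
        echo "found $pattern"
    else
        echo "not found"
        return 1
    fi
}

# Example (path and values from this note):
# find_policy_record /u01/app/oracle/diag/rdbms/book/book/metadata/ADR_CONTROL.ams 2363806166 720 8760
```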
$ cd /u01/app/oracle/admin/book/adump
$ touch -d 2016/08/10 aaa
$ ls -l aaa
-rw-r--r-- 1 oracle oinstall 0 2016-08-10 00:00:00 aaa
$ ls -l *.aud |wc
5299 42392 529192
adrci> set control (SHORTP_POLICY = 1)
adrci> set control (LONGP_POLICY = 2)
adrci> show control
ADR Home = /u01/app/oracle/diag/rdbms/book/book:
*************************************************************************
ADRID SHORTP_POLICY LONGP_POLICY LAST_MOD_TIME LAST_AUTOPRG_TIME LAST_MANUPRG_TIME ADRDIR_VERSION ADRSCHM_VERSION ADRSCHMV_SUMMARY ADRALERT_VERSION CREATE_TIME
---------- ------------- ------------ --------------------------------- --------------------------------- --------------------------------- -------------- --------------- ---------------- ---------------- ---------------------------------
2363806166 1 2 2016-08-31 10:41:12.862865 +08:00 2016-08-31 08:41:06.737564 +08:00 2016-08-31 10:51:14.415887 +08:00 1 2 80 1 2015-11-24 09:10:20.616251 +08:00
1 rows fetched
--//Inspect the file /u01/app/oracle/diag/rdbms/book/book/metadata/ADR_CONTROL.ams with bvi again.
0000BEE0 00 00 00 00 00 00 00 00 00 6C 00 0B 00 04 8C E4 .........l......
0000BEF0 D1 D6 04 00 00 00 01 04 00 00 00 02 0D 78 74 08 .............xt.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0000BF00 1F 03 2A 0D 33 6E 46 68 1C 3C 0D 78 74 08 1F 01 ..*.3nFh.<.xt...
0000BF10 2A 07 2B F6 55 60 1C 3C 0D 78 74 05 10 01 2D 36 *.+.U`.<.xt...-6
0000BF20 15 62 CE C8 1C 3C FF 04 00 00 00 02 04 00 00 00 .b...<..........
--The change has taken effect. Run a purge manually:
adrci> purge
adrci>
$ ls -l *.aud |wc
5299 42392 529192
--Oops, purge does not remove the files under adump; it only removes files under trace.
$ ls -l /u01/app/oracle/diag/rdbms/book/book/trace
total 6608
-rw-r----- 1 oracle oinstall 249061 2016-08-31 10:40:03 alert_book.log
-rw-r----- 1 oracle oinstall 1416938 2016-03-01 09:20:17 alert_book.log_20160301
-rw-r----- 1 oracle oinstall 4717869 2016-08-17 10:29:59 alert_book.log_20160817
-rw-r----- 1 oracle oinstall 315449 2016-08-31 10:44:39 book_mmon_48622.trc
-rw-r----- 1 oracle oinstall 33347 2016-08-31 10:44:39 book_mmon_48622.trm
$ cd /u01/app/oracle/diag/rdbms/book/book/trace
$ touch -d '2016/08/10' aaa
$ ls -l aaa
-rw-r--r-- 1 oracle oinstall 0 2016-08-10 00:00:00 aaa
adrci> purge
$ ls -l aaa
-rw-r--r-- 1 oracle oinstall 0 2016-08-10 00:00:00 aaa
--//Still there.
$ mv aaa book_mmon_48623.trc
adrci> purge
$ ls -l aaa
ls: aaa: No such file or directory
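--//Note that aaa was renamed by the mv, so ls -l aaa fails either way; the check that matters is whether book_mmon_48623.trc is gone.
--It seems the file name has to match a specific format before purge will remove it.
--//The help text also confirms that purge does not clean up adump.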
adrci> help purge
Usage: PURGE [[-i <id1> | <id1> <id2>] |
[-age <mins> [-type ALERT|INCIDENT|TRACE|CDUMP|HM|UTSCDMP]]]:
Purpose: Purge the diagnostic data in the current ADR home. If no
option is specified, the default purging policy will be used.
Options:
[-i id1 | id1 id2]: Users can input a single incident ID, or a
range of incidents to purge.
[-age <mins>]: Users can specify the purging policy either to all
the diagnostic data or the specified type. The data older than <mins>
ago will be purged
[-type ALERT|INCIDENT|TRACE|CDUMP|HM|UTSCDMP]: Users can specify what type of
data to be purged.
Examples:
purge
purge -i 123 456
purge -age 60 -type incident
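--//Since -age takes minutes while the retention policies are in hours, a small wrapper helps avoid unit mistakes.
--//The sketch below builds a non-interactive adrci exec="..." command line for a retention given in days. The purge_cmd name is hypothetical,
--//and the command is only printed, not executed.

```shell
# Hypothetical helper: print an adrci command line that purges one data
# type older than N days from one ADR home. "purge -age" expects minutes.
purge_cmd() {
    home=$1 days=$2 dtype=$3
    minutes=$((days * 24 * 60))
    printf 'adrci exec="set homepath %s; purge -age %s -type %s"\n' \
        "$home" "$minutes" "$dtype"
}

purge_cmd diag/rdbms/book/book 10 TRACE
# prints: adrci exec="set homepath diag/rdbms/book/book; purge -age 14400 -type TRACE"
```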
--So clearing the adump directory needs a different approach. These notes got messier as I wrote; treat them as a casual jotting.
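--//One such approach, as a minimal sketch: delete old *.aud files with find. The clean_adump name is hypothetical, and -maxdepth/-delete assume
--//GNU or BSD find. The 10-day retention is the figure mentioned earlier in this note.

```shell
# Hypothetical helper: delete *.aud files older than N days from an audit
# directory. find rounds a file's age down to whole days, so -mtime +N
# matches files last modified at least N+1 days ago.
clean_adump() {
    dir=$1 days=$2
    find "$dir" -maxdepth 1 -type f -name '*.aud' -mtime +"$days" -delete
}

# Example: keep roughly 10 days of audit files.
# clean_adump /u01/app/oracle/admin/dbcndg/adump 10
```

--//Because find removes entries one at a time, this also works on a directory too large for ls, and it could be scheduled from cron.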
--Wrap-up:
adrci> set control (LONGP_POLICY = 720)
adrci> set control (SHORTP_POLICY = 240)