如果你得到反馈,数据库突然间性能下降了好多,希望你能够尽快的定位出问题来,有一些思路和方法可以参考。分别从数据库层面,系统层面来定位,但是个人感觉而言还是不够快和准。
因为绝大多数的问题都是由于active session导致的,所以我们的注意力集中在ash是比较合理的。 ASH在这个时候就是一个利器,通过它能够得到几乎实时的数据库变化。
相比而言我们通过ashrpt得到ash的报告来诊断问题理论上可行,但是有个缺点就是不够直观。报告里面的描述着实很详细,有时候是有优点有时候可能就是缺点。
公司的同事写了如下的脚本,个人在使用中感觉非常的直观,定位问题真有一目了然的感觉。
脚本内容如下:
sqlplus -s $DB_CONN_STR@$SH_DB_SID set lin 200
set pages 50
col SID for 99999 trunc
col running_sec for a11 head "ELAP_SEC"
col inst_id for 9 trunc head "I"
col serial# for 99999 trunc head SER#
col username for a12 trunc head "USERNAME"
col osuser for a10 trunc head "OSUSER"
col status for a3 trunc head "STAT"
col machine for a10 trunc
col process for a7 trunc head "RPID"
col spid for a6 trunc head "SPID"
col program for a20 trunc
col module for a13 trunc
col temp_mb for 999999 head "TEMP_MB"
col undo_mb for 999999 head "UNDO_MB"
col logon_time for a11
col rm_grp for a6 trunc
col sql_id for a13
col sql for a49 trunc
col tsps for a6 trunc
SELECT /* use_hash(sess,proc,undo,tmp) use_nl(s)*/ distinct
sess.inst_id,
sess.sid,
sess.serial#,
sess.username,
substr(osuser,1,10) osuser,
status,
sess.process,
proc.spid,
sess.machine,
sess.program,
regexp_substr(NUMTODSINTERVAL(nvl((SYSDATE-SQL_EXEC_START)*24*60*60,last_call_et), 'SECOND'),'+\d{2} \d{2}:\d{2}:\d{2}') running_sec,
TEMP_MB, UNDO_MB,
s.sql_id ,
TSPS.NAME TSPS,
decode(sess.action,null,'',sess.action||', ')||replace(s.sql_text,chr(13),' ') sql
FROM
gv\$session sess,
gv\$process proc,
gv\$sql s,
(select ses_addr as saddr,sum(used_ublk/128) UNDO_MB from v\$transaction group by ses_addr) undo,
(select session_addr as saddr, SESSION_NUM serial#, sum((blocks/128)) TEMP_MB from gv\$sort_usage group by session_addr, SESSION_NUM) tmp,
(select inst_id,sid,serial#,event,t.name from gv\$session ls, sys.file$ f, sys.ts$ t where status='ACTIVE' and ls.p1text in ('file number','file#') and ls.p1=f.file# and f.ts#=t.ts#) tsps
WHERE sess.inst_id=proc.inst_id (+)
and sess.saddr=tmp.saddr (+) and sess.serial#=tmp.serial# (+)
AND sess.status='ACTIVE' and sess.username is not null
and sess.sid=tsps.sid (+) and sess.inst_id=tsps.inst_id(+) and sess.serial#=tsps.serial#(+)
AND sess.paddr=proc.addr (+)
and sess.sql_id = s.sql_id (+)
and sess.saddr=undo.saddr (+)
ORDER BY running_sec desc,4,1,2,3
;
EOF
运行脚本的结果如下:
I SID SER# USERNAME OSUSER STA RPID SPID MACHINE PROGRAM ELAP_SEC TEMP_MB UNDO_MB SQL_ID TSPS SQL
-- ------ ------ ------------ ---------- --- ------- ------ ---------- -------------------- ----------- ------- ------- ------------- ------ -------------------------------------------------
1 14889 55175 PRDAPPC cowrk01 ACT 1234 23366 ccbdbpr1 JDBC Thin Client 04 11:44:12 519 648600hq1s1s8 UNDOTB SELECT ban_details.COMPANY_CODE, ban_detai
1 19 16945 PRDAPPC blwrk01 ACT 9442 9442 ccbdbpr1 oracle@ccbdbpr3 (P02 02 05:35:02 b9xg175fbzuk5 INSERT INTO TRUE9_BILL_SUMMARY (CYCLE_SEQ_NO, PAY
1 421 20337 PRDAPPC blwrk01 ACT 9444 9444 ccbdbpr1 oracle@ccbdbpr3 (P02 02 05:35:02 b9xg175fbzuk5 INSERT INTO TRUE9_BILL_SUMMARY (CYCLE_SEQ_NO, PAY
1 684 14023 PRDAPPC blwrk01 ACT 9446 9446 ccbdbpr1 oracle@ccbdbpr3 (P02 02 05:35:02 b9xg175fbzuk5 INSERT INTO TRUE9_BILL_SUMMARY (CYCLE_SEQ_NO, PAY
1 6502 24857 PRDAPPC blwrk01 ACT 9458 9458 ccbdbpr1 oracle@ccbdbpr3 (P03 02 05:35:02 b9xg175fbzuk5 INSERT INTO TRUE9_BILL_SUMMARY (CYCLE_SEQ_NO, PAY
1 8880 35991 PRDAPPC blwrk01 ACT 24531 25882 ccbdbpr1 sqlplus@ccbdbpr1 (TN 02 05:35:02 b9xg175fbzuk5 INSERT INTO TRUE9_BILL_SUMMARY (CYCLE_SEQ_NO, PAY
1 9536 26661 PRDAPPC truwl21 ACT 1234 6158 ccbappr2 JDBC Thin Client 00 00:06:38 DATAS0
1 14566 64567 PRDAPPC truwl25 ACT 1234 23179 ccbappr2 JDBC Thin Client 00 00:06:32 fg5mc598u799u DATAS0 select /*+ leading (bpm_ai bpm_pi) use_nl (bpm
1 2799 36269 PRDAPPC truwld6 ACT 1234 1147 ccbappr13 JDBC Thin Client 00 00:05:40 fg5mc598u799u DATAS0 select /*+ leading (bpm_ai bpm_pi) use_nl (bpm
1 3490 41433 CCBSFMDEV pcowrk01 ACT 4860 8520 ccbdbpr1 sqlplus@ccbdbpr1 (TN 00 00:03:38 gt64t81rs1yp6 DATAL0 insert into ccbsfmdev.ar9_temp_e2e_restore (msisd
1 2807 45859 PRDAPPC truwl45 ACT 1234 26921 ccbappr4 JDBC Thin Client 00 00:01:50 fg5mc598u799u DATAS0 select /*+ leading (bpm_ai bpm_pi) use_nl (bpm
1 22051 15289 PRDAPPC truwl21 ACT 1234 17442 ccbappr2 JDBC Thin Client 00 00:01:40 fg5mc598u799u DATAS0 select /*+ leading (bpm_ai bpm_pi) use_nl (bpm
可以从上面的信息中得出不少的内容。
首先是session对应的sql语句,哪些session在执行哪些语句,执行的时间都一目了然。
占用的表空间情况,从第一条记录可以看到,执行了近12个小时,占用了大量的Undo空间。
第2~4行正在执行的是一个并行查询,并行度为4,并行协调session是(8880,35991) 目前执行时间已经达5个小时,对于并行来说,还是存在问题,需要进一步分析。
因为绝大多数的问题都是由于active session导致的,所以我们的注意力集中在ash是比较合理的。 ASH在这个时候就是一个利器,通过它能够得到几乎实时的数据库变化。
相比而言我们通过ashrpt得到ash的报告来诊断问题理论上可行,但是有个缺点就是不够直观。报告里面的描述着实很详细,有时候是有优点有时候可能就是缺点。
公司的同事写了如下的脚本,个人在使用中感觉非常的直观,定位问题真有一目了然的感觉。
脚本内容如下:
sqlplus -s $DB_CONN_STR@$SH_DB_SID set lin 200
set pages 50
col SID for 99999 trunc
col running_sec for a11 head "ELAP_SEC"
col inst_id for 9 trunc head "I"
col serial# for 99999 trunc head SER#
col username for a12 trunc head "USERNAME"
col osuser for a10 trunc head "OSUSER"
col status for a3 trunc head "STAT"
col machine for a10 trunc
col process for a7 trunc head "RPID"
col spid for a6 trunc head "SPID"
col program for a20 trunc
col module for a13 trunc
col temp_mb for 999999 head "TEMP_MB"
col undo_mb for 999999 head "UNDO_MB"
col logon_time for a11
col rm_grp for a6 trunc
col sql_id for a13
col sql for a49 trunc
col tsps for a6 trunc
SELECT /* use_hash(sess,proc,undo,tmp) use_nl(s)*/ distinct
sess.inst_id,
sess.sid,
sess.serial#,
sess.username,
substr(osuser,1,10) osuser,
status,
sess.process,
proc.spid,
sess.machine,
sess.program,
regexp_substr(NUMTODSINTERVAL(nvl((SYSDATE-SQL_EXEC_START)*24*60*60,last_call_et), 'SECOND'),'+\d{2} \d{2}:\d{2}:\d{2}') running_sec,
TEMP_MB, UNDO_MB,
s.sql_id ,
TSPS.NAME TSPS,
decode(sess.action,null,'',sess.action||', ')||replace(s.sql_text,chr(13),' ') sql
FROM
gv\$session sess,
gv\$process proc,
gv\$sql s,
(select ses_addr as saddr,sum(used_ublk/128) UNDO_MB from v\$transaction group by ses_addr) undo,
(select session_addr as saddr, SESSION_NUM serial#, sum((blocks/128)) TEMP_MB from gv\$sort_usage group by session_addr, SESSION_NUM) tmp,
(select inst_id,sid,serial#,event,t.name from gv\$session ls, sys.file$ f, sys.ts$ t where status='ACTIVE' and ls.p1text in ('file number','file#') and ls.p1=f.file# and f.ts#=t.ts#) tsps
WHERE sess.inst_id=proc.inst_id (+)
and sess.saddr=tmp.saddr (+) and sess.serial#=tmp.serial# (+)
AND sess.status='ACTIVE' and sess.username is not null
and sess.sid=tsps.sid (+) and sess.inst_id=tsps.inst_id(+) and sess.serial#=tsps.serial#(+)
AND sess.paddr=proc.addr (+)
and sess.sql_id = s.sql_id (+)
and sess.saddr=undo.saddr (+)
ORDER BY running_sec desc,4,1,2,3
;
EOF
运行脚本的结果如下:
I SID SER# USERNAME OSUSER STA RPID SPID MACHINE PROGRAM ELAP_SEC TEMP_MB UNDO_MB SQL_ID TSPS SQL
-- ------ ------ ------------ ---------- --- ------- ------ ---------- -------------------- ----------- ------- ------- ------------- ------ -------------------------------------------------
1 14889 55175 PRDAPPC cowrk01 ACT 1234 23366 ccbdbpr1 JDBC Thin Client 04 11:44:12 519 648600hq1s1s8 UNDOTB SELECT ban_details.COMPANY_CODE, ban_detai
1 19 16945 PRDAPPC blwrk01 ACT 9442 9442 ccbdbpr1 oracle@ccbdbpr3 (P02 02 05:35:02 b9xg175fbzuk5 INSERT INTO TRUE9_BILL_SUMMARY (CYCLE_SEQ_NO, PAY
1 421 20337 PRDAPPC blwrk01 ACT 9444 9444 ccbdbpr1 oracle@ccbdbpr3 (P02 02 05:35:02 b9xg175fbzuk5 INSERT INTO TRUE9_BILL_SUMMARY (CYCLE_SEQ_NO, PAY
1 684 14023 PRDAPPC blwrk01 ACT 9446 9446 ccbdbpr1 oracle@ccbdbpr3 (P02 02 05:35:02 b9xg175fbzuk5 INSERT INTO TRUE9_BILL_SUMMARY (CYCLE_SEQ_NO, PAY
1 6502 24857 PRDAPPC blwrk01 ACT 9458 9458 ccbdbpr1 oracle@ccbdbpr3 (P03 02 05:35:02 b9xg175fbzuk5 INSERT INTO TRUE9_BILL_SUMMARY (CYCLE_SEQ_NO, PAY
1 8880 35991 PRDAPPC blwrk01 ACT 24531 25882 ccbdbpr1 sqlplus@ccbdbpr1 (TN 02 05:35:02 b9xg175fbzuk5 INSERT INTO TRUE9_BILL_SUMMARY (CYCLE_SEQ_NO, PAY
1 9536 26661 PRDAPPC truwl21 ACT 1234 6158 ccbappr2 JDBC Thin Client 00 00:06:38 DATAS0
1 14566 64567 PRDAPPC truwl25 ACT 1234 23179 ccbappr2 JDBC Thin Client 00 00:06:32 fg5mc598u799u DATAS0 select /*+ leading (bpm_ai bpm_pi) use_nl (bpm
1 2799 36269 PRDAPPC truwld6 ACT 1234 1147 ccbappr13 JDBC Thin Client 00 00:05:40 fg5mc598u799u DATAS0 select /*+ leading (bpm_ai bpm_pi) use_nl (bpm
1 3490 41433 CCBSFMDEV pcowrk01 ACT 4860 8520 ccbdbpr1 sqlplus@ccbdbpr1 (TN 00 00:03:38 gt64t81rs1yp6 DATAL0 insert into ccbsfmdev.ar9_temp_e2e_restore (msisd
1 2807 45859 PRDAPPC truwl45 ACT 1234 26921 ccbappr4 JDBC Thin Client 00 00:01:50 fg5mc598u799u DATAS0 select /*+ leading (bpm_ai bpm_pi) use_nl (bpm
1 22051 15289 PRDAPPC truwl21 ACT 1234 17442 ccbappr2 JDBC Thin Client 00 00:01:40 fg5mc598u799u DATAS0 select /*+ leading (bpm_ai bpm_pi) use_nl (bpm
首先是session对应的sql语句,哪些session在执行哪些语句,执行的时间都一目了然。
占用的表空间情况,从第一条记录可以看到,执行了近12个小时,占用了大量的Undo空间。
第2~4行正在执行的是一个并行查询,并行度为4,并行协调session是(8880,35991) 目前执行时间已经达5个小时,对于并行来说,还是存在问题,需要进一步分析。