日常巡检某客户生产环境数据库时,发现AWR报告中有资源消耗较高但是Executions=0的现象:
这个现象可以解释为在这一个小时的快照周期内SQL一次都没执行完成。
(一)
对第一条 SQLID=1cc0y3fyndjcv
UPDATE User1.Tab1 A SET (col1, col2, col3, col4, col5, col6, col7) = (SELECT UTL_RAW.CAST_TO_VARCHAR2(colx) col1, col2, col3, col4, col5, col6, col7 FROM User2.Tab2@DBLink1 B WHERE A.col1=B.col1 AND 1=1) WHERE col8 = '01'
可以看到有DBLINK1,经排查是跨实例的更新操作;两张表的数据量相同,进一步分析,B是一张视图
SQL> select count(*) from User1.Tab1;
COUNT(*)
----------
160366
SQL> select count(*) from User2.Tab2@DBLink1;
COUNT(*)
----------
160366
SQL> col object_name for a30
SQL> select object_name,object_type from dba_objects@DBLink1 where object_name = upper('Tab2');
OBJECT_NAME OBJECT_TYPE
------------------------------ -------------------
Tab2 VIEW
(二)
查看SQL的执行计划,全表扫描;在分析col8的选择性,整张表就一个值,没有选择性
SQL> select col1, col2, col3, col4, col5, col6, col7 from User1.Tab1 where col8 = '01' ;
160366 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 156998718
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 160K| 5480K| 473 (1)| 00:00:06 |
|* 1 | TABLE ACCESS FULL| Tab1 | 160K| 5480K| 473 (1)| 00:00:06 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("col8"='01')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
329996 consistent gets
0 physical reads
106944 redo size
4647458 bytes sent via SQL*Net to client
118125 bytes received via SQL*Net from client
10693 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
160366 rows processed
SQL> select count(*),col from User1.Tab1 group by col8;
COUNT(*) MFMANAUNIT
---------- --------------------
160366 01
(三)
从实例级别进行分析,脚本运行在当前数据库服务器host1上;查看服务器进程,ora_j000 是Oracle JOB进程;再看当前正在运行的JOB
SQL> @sql_now
SID STATUS PROCESS SCHEMANAME OSUSER SQL QUERY PROGRAM
---- -------- ---------- ---------------- ---------------- ------------------------------------------------------------------------------------------------------------------------ ------------------------------
226 ACTIVE 21800 DBUSRSYS oracle UPDATE User1.Tab1 A SET (col1, col2, col3, col4, col5, col6, col7) = (SELECT UTL_RAW.CAST_TO_VARCHAR2(colx) col1, col2, col3, col4, col5, col6, col7 FROM User2.Tab2@DBLink1 B WHERE A.col1=B.col1 AND 1=1) WHERE col8 = '01' oracle@host1
[root@host1~]# ps -ef|grep 21800
root 6954 5534 0 14:43 pts/2 00:00:00 grep 21800
oracle 21800 1 0 05:26 ? 00:03:25 ora_j000_orcl
SQL> select * from dba_jobs_running;
SID JOB FAILURES LAST_DATE LAST_SEC THIS_DATE THIS_SEC INSTANCE
---- ---------- ---------- --------- -------- --------- -------- ----------
226 21 0 02-JAN-19 01:00:30 03-JAN-19 05:26:53 0
JOB sid == SQL sid ,且状态都是正在运行,那么确定是JOB中调用的UPDATE操作;JOB运行了很长时间
(四)
查看JOB内容
SQL> select job,substr(what,1,100) from dba_jobs where job = 21;
JOB SUBSTR(WHAT,1,100)
---------- ----------------------------------------------------------------------------------------------------
21 package1.RUN;
SQL> select dbms_metadata.get_ddl('PACKAGE','package1','User1') from dual;
--------------------------------------------------------------------------------
CREATE OR REPLACE PACKAGE "User1"."package1" Is
Procedure writelog
(
vtab In Varchar2,
vflag In Char,
vlog In Varchar2,
vdate In Date,
vtype In Char Default 'S'
…………
# 400多行就不贴了
(五)
到此,基本排查完成,因客户的的应用程序由第三方负责开发运维,所以将以上信息反馈在沟通群中,并得到开发商反馈:这是一个临时的业务操作,已经转交给开发着手修改。