昨天中秋节,本该是团圆的好日子,苦逼的运维我还要值班(哈哈,吐槽一下)本以为会没有啥事,谁知道比较重要的一台Oracle服务器突然报警,CPU 2个core都飙到100%,load average也比较高,如下图:
AWS CloudWatch也可以看出来CPU长期使用率100%
从图可得:系统us比较高,sy基本可以忽略,Memory和IO都已经检查过,不存在瓶颈,根据以往经验,极有可能是Oracle数据库有SQL在长时间运行,并且没有释放,登录到数据库查看,可以看到sid为410,408,404进程执行的都是同一个SQL,
SYS@xxxxxx>SELECT b.sid oracleID, b.username, b.serial#, spid, paddr, b.machine, c.sql_text FROM v$process a, v$session b, v$sqlarea c WHERE a.addr = b.paddr AND b.sql_hash_value = c.hash_value; ORACLEID USERNAME SERIAL# SPID PADDR MACHINE ---------- ------------------------------ ---------- ------------------------ ---------------- ---------------------------------------------------------------- SQL_TEXT -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 410 PRERNAP2 371 16743 00000002DEC84E60 Prernap2-mbr with cte as(select distinct weekdate, week_month, week_year, SHIPTO, BUYERPARTNUMBER, dense_rank() over(partition by week_year,SHIPTO, BUYERPARTNUMBER, week_month order by weekdate) as rank_week FROM (Select weekdate, to_char(weekdate, 'MM') as week_month, to_char(weekdate, 'YYYY') as week_year, SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by weekdate), cte1 as (select foreca stdate,forecast_month, forecast_year, shipto, buyerpartnumber from (select to_date(forecastdate, 'YYYYMMDD') as forecastdate, TO_CHAR(to_date(forecastdate, 'YYYYMMDD'), 'MM' ) as forecast_month, TO_C HAR(to_date(forecastdate, 'YYYYMMDD'), 'YYYY' ) as forecast_year,SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by forecastdate ), cte2 as (select cte.shipto, cte.buyerpartnumber,ct e.week_month, cte.week_year, max(rank_week) as max_week from cte group by cte.shipto, cte.buyerpartnumber,cte.week_month, cte.week_year), cte4 as (select cte2.*, cte1.forecast_month, cte1.forecast_y 408 PRERNAP21163 15129 00000002DEC916A0 Prernap2-mbr with cte as(select distinct weekdate, week_month, week_year, SHIPTO, BUYERPARTNUMBER, dense_rank() over(partition by week_year,SHIPTO, BUYERPARTNUMBER, week_month order by weekdate) as rank_week FROM (Select weekdate, to_char(weekdate, 'MM') as week_month, to_char(weekdate, 'YYYY') as week_year, SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by weekdate), cte1 as (select foreca stdate,forecast_month, forecast_year, shipto, buyerpartnumber from (select to_date(forecastdate, 'YYYYMMDD') as forecastdate, TO_CHAR(to_date(forecastdate, 'YYYYMMDD'), 'MM' ) as forecast_month, TO_C HAR(to_date(forecastdate, 'YYYYMMDD'), 'YYYY' ) as forecast_year,SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by forecastdate ), cte2 as (select cte.shipto, cte.buyerpartnumber,ct e.week_month, cte.week_year, max(rank_week) as max_week from cte group by cte.shipto, cte.buyerpartnumber,cte.week_month, cte.week_year), cte4 as (select cte2.*, cte1.forecast_month, cte1.forecast_y 18 PRERNAP2 311 19710 00000002DEC948B0 Prernap2-mbr with cte as(select distinct weekdate, week_month, week_year, SHIPTO, BUYERPARTNUMBER, dense_rank() over(partition by week_year,SHIPTO, BUYERPARTNUMBER, week_month order by weekdate) as rank_week FROM (Select weekdate, to_char(weekdate, 'MM') as week_month, to_char(weekdate, 'YYYY') as week_year, SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by weekdate), cte1 as (select foreca stdate,forecast_month, forecast_year, shipto, buyerpartnumber from (select to_date(forecastdate, 'YYYYMMDD') as forecastdate, TO_CHAR(to_date(forecastdate, 'YYYYMMDD'), 'MM' ) as forecast_month, TO_C HAR(to_date(forecastdate, 'YYYYMMDD'), 'YYYY' ) as forecast_year,SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by forecastdate ), cte2 as (select cte.shipto, cte.buyerpartnumber,ct e.week_month, cte.week_year, max(rank_week) as max_week from cte group by cte.shipto, cte.buyerpartnumber,cte.week_month, cte.week_year), cte4 as (select cte2.*, cte1.forecast_month, cte1.forecast_y 404 PRERNAP2 665 21911 00000002DEC95960 Prernap2-mbr with cte as( select distinct weekdate, week_month, week_year, SHIPTO, BUYERPARTNUMBER, dense_rank() over(partition by week_year,SHIPTO, BUYERPARTNUMBER, week_month order by weekdate) as rank_week FROM (Select weekdate, to_char(weekdate, 'MM') as week_month, to_char(weekdate, 'YYYY') as week_year, SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by weekdate ), cte1 as ( select for ecastdate,forecast_month, forecast_year, shipto, buyerpartnumber from (select to_date(forecastdate, 'YYYYMMDD') as forecastdate, TO_CHAR(to_date(forecastdate, 'YYYYMMDD'), 'MM' ) as forecast_month, T O_CHAR(to_date(forecastdate, 'YYYYMMDD'), 'YYYY' ) as forecast_year,SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by forecastdate ), cte2 as ( select cte.shipto, cte.buyerpartnumbe r,cte.week_month, cte.week_year, max(rank_week) as max_week , cte1.forecast_month, cte1.forecast_year from cte inner join cte1 on cte1.forecast_year>=cte.week_year and cte.shipto = cte1.shipto and c 22 SYS 447 23888 00000002DEC96A10 ec2-admart-01 SELECT b.sid oracleID, b.username, b.serial#, spid,paddr, b.machine,c.sql_text FROM v$process a, v$session b, v$sqlarea c WHERE a.addr = b.paddrAND b.sq l_hash_value = c.hash_value 387 PRERNAP2 313 24261 00000002DEC97AC0 Prernap2-mbr with cte as( select distinct weekdate, week_month, week_year, SHIPTO, BUYERPARTNUMBER, dense_rank() over(partition by week_year,SHIPTO, BUYERPARTNUMBER, week_month order by weekdate) as rank_week FROM (Select weekdate, to_char(weekdate, 'MM') as week_month, to_char(weekdate, 'YYYY') as week_year, SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by weekdate ), cte1 as ( select for ecastdate,forecast_month, forecast_year, shipto, buyerpartnumber from (select to_date(forecastdate, 'YYYYMMDD') as forecastdate, TO_CHAR(to_date(forecastdate, 'YYYYMMDD'), 'MM' ) as forecast_month, T O_CHAR(to_date(forecastdate, 'YYYYMMDD'), 'YYYY' ) as forecast_year,SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by forecastdate ), cte2 as ( select cte.shipto, cte.buyerpartnumbe r,cte.week_month, cte.week_year, max(rank_week) as max_week , cte1.forecast_month, cte1.forecast_year from cte inner join cte1 on cte1.forecast_year>=cte.week_year and cte.shipto = cte1.shipto and c 6 rows selected. SYS@xxxxxx>select b.sid,b.serial#,b.machine,b.terminal,b.program,b.process,b.status from v$lock a, v$session b where a.SID = b.SID and username is not null and username not in ('SYS','SYSTEM'); SID SERIAL# MACHINE TERMINAL PROGRAM PROCESS STATUS ---------- ---------- ---------------------------------------------------------------- ------------------------------ ------------------------------------------------ ------------------------ -------- 387 313 Prernap2-mbr unknown SQL Developer 6246 ACTIVE 387 313 Prernap2-mbr unknown SQL Developer 6246 ACTIVE 404 665 Prernap2-mbr unknown SQL Developer 4145 ACTIVE 387 313 Prernap2-mbr unknown SQL Developer 6246 ACTIVE 387 313 Prernap2-mbr unknown SQL Developer 6246 ACTIVE 408 1163 Prernap2-mbr unknown SQL Developer 3377 ACTIVE 387 313 Prernap2-mbr unknown SQL Developer 6246 ACTIVE 387 313 Prernap2-mbr unknown SQL Developer 6246 ACTIVE 404 665 Prernap2-mbr unknown SQL Developer 4145 ACTIVE 408 1163 Prernap2-mbr unknown SQL Developer 3377 ACTIVE 410 371 Prernap2-mbr unknown SQL Developer 5691 ACTIVE 18 311 Prernap2-mbr unknown SQL Developer 1497 ACTIVE 20 221 Prernap2-mbr unknown SQL Developer 4689 ACTIVE 20 221 Prernap2-mbr unknown SQL Developer 4689 ACTIVE 387 313 Prernap2-mbr unknown SQL Developer 6246 ACTIVE 15 rows selected. SYS@xxxxxx>select sid, username, blocking_session from v$session where blocking_session is not null; SID USERNAME BLOCKING_SESSION ---------- ------------------------------ ---------------- 18 PRERNAP2 408 387 PRERNAP2 404 410 PRERNAP2 408 SYS@xxxxxx>select sid, serial#, username from v$session where sid='410'; SID SERIAL# USERNAME ---------- ---------- ------------------------------ 410 371 PRERNAP2
解决方法
找到开发人员,询问原因,得到的反馈是在测试几条SQL(我擦,竟然在生产环境测试SQL,哎,一点敬畏之心都没有,可怕!)
kill掉blocked的进程,释放资源,再这么跑下去,系统随时可能崩溃,最后去优化一下的SQL,再去执行
alter system kill session '410,371';
......其他几个进程同理干掉即可