【会话】Oracle kill session系列
1.1 BLOG文档结构图
1.3.3 oracle死连接(DC)和不活动会话(INACTIVE) - 15 -
2.3.1 设置用户profile的idle_time 参数 - 32 -
2.3.2 授予普通用户杀自己session的权限 - 35 -
2.3.3 kill session的时候加immediate和不加immediate的区别 - 38 -
1.2 前言部分
1.2.1 导读和注意事项
各位技术爱好者,看完本文后,你可以掌握如下的技能,也可以学到一些其它你所不知道的知识,~O(∩_∩)O~:
① killed状态的会话如何释放(如何找到后台进程)--重点?
② 授予普通用户可以kill自己用户session的权限
③ kill session和disconnect session的区别
④ v$session.CREATOR_ADDR列的使用
⑤ 对inactive会话的处理(1、sqlnet.ora文件中设置expire_time 参数 2、用户profile的idle_time 参数 3、找到很久没有响应的会话然后kill掉)
⑥ v$session.LAST_CALL_ET的使用
⑦ PMON的清理周期隐含参数"_PKT_PMON_INTERVAL"介绍
Tips:
① 本文在ITpub(http://blog.itpub.net/26736162)和博客园(http://www.cnblogs.com/lhrbest)有同步更新
② 文章中用到的所有代码,相关软件,相关资料,MOS资料请前往小麦苗的云盘下载(http://blog.itpub.net/26736162/viewspace-1624453/)
③ 若文章代码格式有错乱,推荐使用搜狗、360或QQ浏览器,也可以下载pdf格式的文档来查看,pdf文档下载地址:http://blog.itpub.net/26736162/viewspace-1624453/
④ 本篇BLOG中命令的输出部分需要特别关注的地方我都用灰色背景和粉红色字体来表示,比如下边的例子中,thread 1的最大归档日志号为33,thread 2的最大归档日志号为43是需要特别关注的地方;而命令一般使用黄色背景和红色字体标注;对代码或代码输出部分的注释一般采用蓝色字体表示。
List of Archived Logs in backup set 11
Thrd Seq Low SCN Low Time Next SCN Next Time
---- ------- ---------- ------------------- ---------- ---------
1 32 1621589 2015-05-29 11:09:52 1625242 2015-05-29 11:15:48
1 33 1625242 2015-05-29 11:15:48 1625293 2015-05-29 11:15:58
2 42 1613951 2015-05-29 10:41:18 1625245 2015-05-29 11:15:49
2 43 1625245 2015-05-29 11:15:49 1625253 2015-05-29 11:15:53
[ZHLHRDB1:root]:/>lsvg -o
T_XDESK_APP1_vg
rootvg
[ZHLHRDB1:root]:/>
00:27:22 SQL> alter tablespace idxtbs read write;
====》2097152*512/1024/1024/1024=1G
本文如有错误或不完善的地方请大家多多指正,ITPUB留言或QQ皆可,您的批评指正是我写作的最大动力。
1.2.2 相关参考文章链接
disconnect session和kill session的区别:http://blog.itpub.net/26736162/viewspace-1979223/
1.2.3 本文简介
今天同事杀会话的时候采用了alter system kill session ‘xxx,xxx’的方式,结果杀完后,v$session中还可以查到,就求助我,因为我之前杀会话都是带的immediate的,杀完后会立刻释放,v$session中也查询不到,同事现在的情况就只能杀后台进程了,但paddr列关联不到后台进程,查了下MOS还是给出了一些办法,整理了一下,分享给大家。
1.3 相关知识点扫盲(摘自网络+个人总结)
1.3.1 得到当前会话的几个SQL
SELECT USERENV('SID') FROM DUAL;
SELECT SID FROM V$MYSTAT WHERE ROWNUM =1;
SELECT b.SID,
b.SERIAL#
FROM v$session b
WHERE b.SID = USERENV('SID');
SELECT a.SID,
b.SERIAL#
FROM v$mystat a,
v$session b
WHERE a.SID = b.SID
AND rownum = 1;
SELECT a.SID,
b.SERIAL# ,
c.SPID,
b.status
FROM v$mystat a,
v$session b ,
v$process c
WHERE a.SID = b.SID
and b.PADDR=c.ADDR
AND rownum = 1;
SELECT b.SID,
b.SERIAL# ,
c.SPID,
b.status
FROM v$session b ,
v$process c
WHERE b.PADDR=c.ADDR
AND b.sid=???;
1.3.2 Session 状态说明
Oracle session 有如下几种状态:
ACTIVE - Session currently executing SQL
INACTIVE
KILLED - Session marked to be killed
CACHED - Session temporarily cached for use by Oracle*XA
SNIPED - Session inactive, waiting on the client
有关状态的说明:
(1)active 处于此状态的会话,表示正在执行,处于活动状态。
官方文档说明:
Any session that is connected to the database and is waiting for an event that does not belong to the Idle wait class is considered as an active session.
(2)killed处于此状态的会话,被标注为删除,表示出现了错误,正在回滚。
当然,也是占用系统资源的。还有一点就是,killed的状态一般会持续较长时间,而且用windows下的工具pl/sql developer来kill掉,是不管用的,要用命令:alter system kill session 'sid,serial#' ;
(3)inactive 处于此状态的会话表示不是正在执行的
该状态处于等待操作(即等待需要执行的SQL语句),通常当DML语句已经完成。 但连接没有释放,这个可能是程序中没有释放,如果是使用中间件来连接的话,也可能是中间件的配置或者是bug 导致。inactive对数据库本身没有什么影响,但是如果程序没有及时commit,那么就会造成占用过多会话。容易是DB 的session 达到极限值。
一般不处理inactive 状态的session, 如果达到了session 的最大值, 就增加processes 和 sessions 参数。 对于Inactive 状态的session,可以设置过期时间:
(1)sqlnet.ora文件中设置expire_time 参数
(2)设置用户profile的idle_time 参数
(3)找到很久没有响应的会话然后kill掉
当设置了resource_limit=true 。通过idle_time限制session idle 时间。session idle超过设置时间,状态为sniped (v$session).,然而OS下的process并不会释放,当session(user process) 再次与server process 通讯,将关闭相应的server process.
sqlnet.expire_time 的原理不一样,Oracle Server 发送包探测dead connection ,如果连接关闭,或者不再用,则关闭相应的server process.
以上两者组合使用,减少server process,防止process超过init$ORACLE_SID极限值。
本文会对这3种方式分别说明的。
1.3.2.1 清理inactive会话的3种方式
一、 设置sqlnet.expire_time可以在sqlnet.ora文件里面加上sqlnet.expire_time这个参数来解决,设置一个分钟数,这是ORACLE建议的DCD解决方法。
在 sqlnet.ora文件中设置expire_time 参数
官网有关这个参数的说明:
http://download.oracle.com/docs/cd/B19306_01/network.102/b14213/sqlnet.htm
Oracle? Database Net Services Reference 11g Release 2 (11.2)E10835-10
SQLNET.EXPIRE_TIME
Purpose
Use parameter SQLNET.EXPIRE_TIME to specify a the time interval, in minutes, to send a probe to verify that client/server connections are active. Setting a value greater than 0 ensures that connections are not left open indefinitely, due to an abnormal client termination. If the probe finds a terminated connection, or a connection that is no longer in use, it returns an error, causing the server process to exit. This parameter is primarily intended for the database server, which typically handles multiple connections at any one time.
sqlnet.expire_time 的原理:Oracle Server 发送包探测dead connection ,如果连接关闭,或者不再用,则关闭相应的server process.
Limitations on using this terminated connection detection feature are:
(1)It is not allowed on bequeathed connections.
(2)Though very small, a probe packet generates additional traffic that may downgrade network performance.
(3)Depending on which operating system is in use, the server may need to perform additional processing to distinguish the connection probing event from other events that occur. This can also result in degraded network performance.
Default :0
Minimum Value :0
Recommended Value :10
Example
SQLNET.EXPIRE_TIME=10
二、 设置用户profile的idle_time 参数
Oracle 用户 profile 属性
http://blog.csdn.net/tianlesoftware/archive/2011/03/10/6238279.aspx
注意,要启用idle_time 要先启用RESOURCE_LIMIT参数。 该参数默认是False。 官网说明如下:
Property |
Description |
Parameter type |
Boolean |
Default value |
false |
Modifiable |
ALTER SYSTEM |
Range of values |
true | false |
RESOURCE_LIMIT determines whether resource limits are enforced in database profiles.
Values:
TRUE:Enables the enforcement of resource limits
FALSE:Disables the enforcement of resource limits
三、 找到很久没有响应的会话然后kill 掉
关于v$session中LAST_CALL_ET列的理解:http://blog.itpub.net/26736162/viewspace-1762403/
根据v$session中LAST_CALL_ET列的意义我们可以写出如下的SQL脚本,没一个小时清理10个小时没有响应的会话:
set sqlblanklines on
CREATE OR REPLACE PROCEDURE P_kill_session_LHR AS
-----------------------------------------------------------------------------------
-- Created on 2013-06-25 12:05:07 by lhr
--Changed on 2015-08-05 12:05:07 by lhr
-- function: 杀掉10个小时之前的会话 ,告警日志中会记录被杀掉的会话信息
-----------------------------------------------------------------------------------
BEGIN
-- IF to_char(SYSDATE, 'HH24') >= '20' OR
-- TO_CHAR(SYSDATE, 'HH24') <= '08' THEN
FOR cur IN (SELECT A.USERNAME,
A.LOGON_TIME,
A.STATUS,
A.SID,
A.SERIAL#,
A.MACHINE,
A.OSUSER,
'ALTER SYSTEM DISCONNECT SESSION ''' || a.SID || ',' ||
a.serial# || ''' IMMEDIATE' kill_session
FROM v$session A
WHERE A.STATUS IN ('INACTIVE')
AND A.USERNAME IS NOT NULL
AND A.LAST_CALL_ET >= 60 * 60 * 10) LOOP
BEGIN
EXECUTE IMMEDIATE cur.kill_session;
EXCEPTION
WHEN OTHERS THEN
NULL;
END;
END LOOP;
-- END IF;
EXCEPTION
WHEN OTHERS THEN
NULL;
END P_kill_session_LHR;
/
BEGIN
--DBMS_SCHEDULER.drop_job('JOB_P_kill_session_LHR');
DBMS_SCHEDULER.CREATE_JOB(JOB_NAME => 'JOB_P_kill_session_LHR',
JOB_TYPE => 'STORED_PROCEDURE',
JOB_ACTION => 'P_kill_session_LHR',
repeat_interval => 'FREQ=MINUTELY;INTERVAL=60',
ENABLED => TRUE,
START_DATE => SYSDATE,
COMMENTS => '删除--每60分钟检查一次');
END;
/
SELECT d.job_name,d.job_action,d.run_count FROM dba_scheduler_jobs d WHERE d.job_name='JOB_P_KILL_SESSION_LHR';
SQL> col JOB_ACTION format a20
SQL> SELECT d.job_name,d.job_action,d.run_count FROM dba_scheduler_jobs d WHERE d.job_name='JOB_P_KILL_SESSION_LHR';
JOB_NAME JOB_ACTION RUN_COUNT
------------------------------ -------------------- ----------
JOB_P_KILL_SESSION_LHR P_kill_session_LHR 4
运行日志:
SELECT * FROM dba_scheduler_job_run_details d WHERE d.job_name='JOB_P_KILL_SESSION_LHR';
1.3.3 oracle死连接(DC)和不活动会话(INACTIVE)
This note explains the difference between a dead connection and an INACTIVE session in v$session. It also discusses the mechanisms provided to automate the cleanup of each.
这里解释死连接和不活动会话的区别,也会讨论自动清除的机制。
Difference between INACTIVE sessions and Dead Connections
---------------------------------------------------------
Dead connections and INACTIVE sessions are different issues. Oracle provides separate mechanisms to automate the cleanup of each.
死连接和不活动会话的不同问题,oracle提供了分离的机制去自动清理。
(1) Dead connections:死连接
These are previously valid connections with the database but the connection between the client and server processes has terminated abnormally.
有些之前合法的连接,但是由于客户端和服务器进程的异常中断。
Examples of a dead connection:
- A user reboots/turns-off their machine without logging off or disconnecting from the database.
- A network problem prevents communication between the client and the server.
1.用户没有注销就关闭机器。
2.网络在客户端和服务器端终端连接。
In these cases, the shadow process running on the server and the session in the database may not terminate. To automate the cleanup of these sessions, you can use the Dead Connection Detection (DCD) feature of Net8.
这种情况下,后台进程跑在服务器端,而会话在数据库端不会中断。
When DCD is enabled, Net8 (server-side) sends a packet to the client. If the client is active, the packet is discarded. If the client has terminated, the server will receive an error and Net8 (server-side) will end that session.
当死连接启动,NET8(服务端)会发送一个包到客户端。如果客户端是活动的,这个包就被丢掉。如果客户端已经被中断,服务器端将接收一个错误,将会中断该会话。SQLNET.EXPIRE_TIME.
Refer to Note:151972.1: Dead Connection Detection (DCD) Explained, for details regarding DCD.
(2) INACTIVE Sessions:
These are sessions that remain connected to the database with a status in v$session of INACTIVE.
会话与服务器端保持连接,但是状态为inactive.
Example of an INACTIVE session:
- A user starts a program/session, then leaves it running and idle for an extended period of time.
用户开始一个会话,运行一段时间后,保持相当一段时间的空闲。
To automate cleanup of INACTIVE sessions you can create a profile with an appropriate IDLE_TIME setting and assign that profile to the users.
自动清理不活动的会话,你可以创一个profile.然后设置恰当的IDLE_TIME,分配给指定用户。
Note:159978.1: How To Automate Disconnection of Idle Sessions, outlines the steps to setup IDLE_TIME for this.
1.3.4 kill session相关内容
kill session 是DBA经常碰到的事情之一。如果kill 掉了不该kill 的session,则具有破坏性,因此尽可能的避免这样的错误发生。同时也应当注意,如果kill 的session属于Oracle 后台进程,则容易导致数据库实例宕机。
一、获得需要kill session的信息
SET LINESIZE 180
COLUMN spid FORMAT A10
COLUMN username FORMAT A10
COLUMN program FORMAT A40
SELECT s.inst_id,
s.sid,
s.serial#,
p.spid,
s.username,
s.program,
s.paddr,
s.STATUS
FROM gv$session s
JOIN gv$process p
ON p.addr = s.paddr
AND p.inst_id = s.inst_id
WHERE s.type != 'BACKGROUND';
INST_ID SID SERIAL# SPID USERNAME PROGRAM PADDR STATUS
---------- ---------- ---------- ---------- ---------- ---------------------------------------- ---------------- --------
1 125 5 14029 SYS sqlplus@orcltest (TNS V1-V3) 0000000077C98660 INACTIVE
1 9 15 14274 SYS sqlplus@orcltest (TNS V1-V3) 0000000077C99710 INACTIVE
1 17 5 14078 LHR sqlplus.exe 0000000077CA5F50 INACTIVE
1 144 31 14645 SYS sqlplus@orcltest (TNS V1-V3) 0000000077CA7000 ACTIVE
1 20 7 14647 SYS plsqldev32.exe 0000000077CA80B0 INACTIVE
1 145 23 14651 SYS plsqldev32.exe 0000000077CA9160 INACTIVE
二、使用ALTER SYSTEM KILL SESSION 命令实现
语法:
SQL> ALTER SYSTEM KILL SESSION 'sid,serial#';
SQL> ALTER SYSTEM KILL SESSION 'sid,serial#' IMMEDIATE;
对于RAC环境下的kill session ,需要搞清楚需要kill 的session 位于哪个节点,可以查询GV$SESSION视图获得。11g杀掉集群环境下的某个会话:
alter system kill session'1228,42549,@实例号';
例如:alter system kill session '1228, 42549, @2';
10g下应登录到某个特定的实例才可以。
Kill session 命令实际不会kill session,比如等待远程数据库的反应或者回滚事务,那么session 就不会立即kill其必须等待当前的操作结束才能执行,在这种情况下,session 就会被标记为killed 状态。
我们可以在kill 命令中添加immediate,语法如下:
SQL> ALTER SYSTEM KILL SESSION 'sid,serial#' IMMEDIATE;
这个命令不会影响性能,但它会立即返回到当前的session,处理kill操作,而不是等待其他的信息完成。 如果session 一直处于killed 状态,那么可以考虑在操作系统级别kill掉相关的进程。不过在操作之前,要先确认session 是否在执行rollback 操作。 可以使用如下SQL 来确认。
SET LINESIZE 200
COLUMN username FORMAT A15
SELECT s.username,
s.sid,
s.serial#,
t.used_ublk,
t.used_urec,
rs.segment_name,
r.rssize,
r.status
FROM v$transaction t,
v$session s,
v$rollstat r,
dba_rollback_segs rs
WHERE s.saddr = t.ses_addr
AND t.xidusn = r.usn
AND rs.segment_id = t.xidusn
ORDER BY t.used_ublk DESC;
如果有我们的session,那么就要等rollback 先完成,然后才能在操作系统级别kill session。
kill session 的时候仅仅是将会话杀掉。在有些时候,由于较大的事务或需要运行较长的SQL语句将导致需要kill的session并不能立即杀掉。对于这种情况将收到 "marked for kill"提示(如下),一旦会话当前事务或操作完成,该会话被立即杀掉。
alter system kill session '4730,39171'
*
ERROR at line 1:
ORA-00031: session marked for kill
在下面的操作中将杀掉会话146,144
sys@AUSTIN> alter system kill session '146,23';
System altered.
sys@AUSTIN> alter system kill session '144,42';
System altered.
sys@AUSTIN> select inst_id,saddr,sid,serial#,paddr,username,status,program from gv$session where username is not null;
INST_ID SADDR SID SERIAL# PADDR USERNAME STATUS PROGRAM
---------- -------- ---------- ---------- -------- ---------- -------- ---------------------------------------------
1 4C70BF04 144 42 4C6545A0 SCOTT KILLED sqlplus@oracle10g (TNS V1-V3)
1 4C70E6B4 146 23 4C6545A0 TEST KILLED sqlplus@oracle10g (TNS V1-V3)
1 4C71FC84 160 17 4C624174 SYS ACTIVE sqlplus@oracle10g (TNS V1-V3)
SQL> select sid,serial#,server,status from v$session where sid=22;
SID SERIAL# SERVER STATUS
---------- ---------- --------- --------
22 7 PSEUDO KILLED
注意:在查询中可以看到被杀掉的会话的PADDR地址发生了变化,参照查询结果中的红色字体。如果多个session被kill 掉,则多个session的PADDR被改为相同的进程地址,被杀掉的会话的server列变为PSEUDO。
1.3.4.1 oracle killed会话不释放的问题
一般情况下,在杀一个会话的时候,直接执行alter system kill session ‘sid,serial#’;
Administrator's Guide说,当session是active的时候,alter system kill session 只是将session的状态标识为killed,server变为pseudo状态,并不会释放session持有的资源,所以我们在执行完alter system kill session 后,看会话还是一直存在。
这种情况下可以使用 immediate选项,强制立即Kill会话,如下:
SQL> alter system kill session '3964,51752' immediate;
SQL Language Reference(http://docs.oracle.com/cd/B28359_01/server.111/b28286/toc.htm#BEGIN)
里对Immediate的解释是:IMMEDIATE Specify IMMEDIATE to instruct Oracle
Database to roll back ongoing transactions, release all session locks, recover the entire session state,and return control to you immediately.
1.3.4.2 killed状态的会话如何找到spid列
一般情况下我们查询会话的后台进程是通过如下的SQL,即通过v$session的paddr列关联v$process的addr列,但是killed状态的v$session的paddr列都变成了一样的,所以已经没有办法通过如下的SQL去查询了。
SELECT b.SID,
b.SERIAL# ,
c.SPID,
b.status
FROM v$session b ,
v$process c
WHERE b.PADDR=c.ADDR
AND b.sid=???;
如果会话已经在v$sesion里是killed状态,那么我们通过上面的SQL已经查不出spid,可以用下面的SQL查出SPID:
-----方法1
select spid, program from v$process
where program!= 'PSEUDO'
and addr not in (select paddr from v$session)
and addr not in (select paddr from v$bgprocess)
and addr not in (select paddr from v$shared_server);
select INST_ID, spid, program,'kill -9 '|| spid kill9
from gv$process a
where program != 'PSEUDO'
and (INST_ID, addr) not in (select INST_ID, paddr from gv$session)
and (INST_ID, addr) not in (select INST_ID, paddr from gv$bgprocess)
and (INST_ID, addr) not in (select INST_ID, paddr from gv$shared_server)
and a.PNAME is null;
-----方法2 我自己常用的方法
set line 9999
col sessionid format a20
col sessionid_killed format a20
col kill_session format a60
SELECT a.INST_ID,
a.SID || ',' || a.SERIAL# || ',' ||
(select spid
from gv$process b
where b.INST_ID = a.INST_ID
and A.creator_addr = b.ADDR --and decode(a.status,'KILLED',A.creator_addr,A.PADDR) = b.ADDR
) sessionid,
a.PADDR,
a.STATUS,
a.PROGRAM,
'alter system disconnect session ''' || sid || ',' || serial# || ''' immediate;' kill_session
FROM gv$session a
WHERE a.USERNAME = 'SYS'
and a.STATUS = 'KILLED';
As a result of the bug, 2 additional columns have been added to V$SESSION from 11g on:
V$SESSION
CREATOR_ADDR - state object address of creating process
CREATOR_SERIAL# - serial number of creating process
CREATOR_ADDR is the column that can be joined with the ADDR column in V$PROCESS to uniquely identify the killed process corresponding to the former session.
Following the previous example, this would identify the killed session
-----方法3
SELECT a.SID || ',' || a.SERIAL# || ',' ||
(select spid
from gv$process b
where b.INST_ID = a.INST_ID
and A.pid = b.pid) sessionid,
'alter system kill session ''' || sid || ',' || serial# || ''';' kill_session
FROM gV$DETACHED_SESSION a;
-----方法4
SELECT INST_ID, spid, program, 'kill -9 ' || spid kill9
FROM gv$process a
WHERE (a.INST_ID, a.addr) in (select INST_ID, p.addr
from gv$process p
where pid <> 1
minus
select INST_ID, s.paddr
from gv$session s)
and a.PNAME is null;
-----方法4
SELECT s.SID, s.username,s.status,
x.ADDR,x.KSLLAPSC,x.KSLLAPSN,x.KSLLASPO,x.KSLLID1R,x.KSLLRTYP,
decode(bitand(x.ksuprflg,2),0,null,1)
FROM x$ksupr x,v$session s
WHERE s.paddr(+)=x.addr
and bitand(ksspaflg,1)!=0
;
1.3.4.3 为何killed状态的进程一直在v$session中能查询到?
[ID 100859.1]这篇文章中提到pmon如何清理killed的会话:
PMON will not delete the session object itself until the client connected to
that session notices that it has been killed. Therefore, the sequence of
events is:
1) alter system kill session is issued - the STATUS of the session object in
V$SESSION becomes KILLED, its server becomes PSEUDO.
2) PMON cleans up the *resources* allocated to the session
(i.e., rolls back its transaction, releases its locks, etc).
3) the entry in V$SESSION remains there until the client of that session (the
client is the process associated with the OSUSER,MACHINE,PROCESS columns in
the V$SESSION view) tries to do another request.
4) the client attempts another SQL statement and gets back ORA-28.
5) PMON can now remove the entry from V$SESSION.
This behavior is necessary because the client still has pointers to the
session object even though the session has been killed. Therefore, the
object cannot be deleted until the client is no longer pointing at it.
alter system kill session后,会话状态变为killed,pmon回收资源后,会话的信息仍然保留在v$session中,直到客户端再次请求,数据库返回ORA-28错误:"your session has been killed"。
此时pmon才从v$session中移除这些会话的信息。而上述中,客户端机器重启,无法再向数据库发送请求,因此从v$session中一直可以查到,只有通过操作系统强制杀掉进程,才能触发pmon从v$session中清除。
然后Oracle就等待PMON去清除这些Session.所以通常等待一个被标记为Killed的Session退出需要花费很长的时间.如果此时被Kill的process,重新尝试执行任务,那么马上会收到进程中断的提示,process退出,此时Oracle会立即启动PMON来清除该session.这被作为一次异常中断处理.
1.3.4.4 为何kill session后,paddr变成同一个值了?
[ID 387077.1]这篇文章中提到:
Bug 5453737 WHEN A SESSION IS KILLED, PADDR CHANGES IN V$SESSION BUT ADDR NOT IN V$PROCESS closed as not a bug with the following explanation:
When a session is killed, the session state object(and all the child state objects under the session state object) move out from under the original parent process state object, and are placed under the pseudo process state object (which is expected, given the parent/child process mechanism on Unix). PMON will clean up all the state objects found under the pseudo process state object. That explains why PADDR changes in V$SESSION when a session is killed. New PADDR you are seeing in v$SESSION is the address of the pseudo process state object. This shows up in system state under PSEUDO PROCESS for group DEFAULT: V$PROCESS still maintains the record of the original parent process. This is expected.
使用以下sql查找killed状态的spid:
select spid, program from v$process
where program!= 'PSEUDO'
and addr not in (select paddr from v$session)
and addr not in (select paddr from v$bgprocess)
and addr not in (select paddr from v$shared_server);
通过底层表x$ksupr关联v$session也可以,11.1.0.6版本及更高在v$session增加了2个字段CREATOR_ADDR和CREATOR_SERIAL#用于标记这种情况。然后就可以通过操作系统命令kill -9来杀掉这些进程了。
1.3.4.5 ALTER SYSTEM KILL SESSION 权限不足
需要赋权限,dba角色不用,grant alter system to lhr;
1.1.1.1 授予普通用户杀自己session的权限
若没有dba角色或者没有alter system权限的话,业务用户就不能自己杀自己会话了,这样一来,只能dba来介入了,其实这个功能我们可以通过如下的SQL脚本来完成。
------- user replace XXXXXX
------- 普通用户可以kill自己的session 授予应用用户杀自己session的权限
create or replace view vw_myownersession_lhr
as
select * from v$session where username = USER;
create or replace public synonym syn_myownersession_lhr for sys.vw_myownersession_lhr;
create or replace procedure pro_kill_myown_session_lhr( p_sid in varchar2,p_serial# in varchar2)
is
cursor_name pls_integer default dbms_sql.open_cursor;
ignore pls_integer;
BEGIN
select count(*) into ignore
from v$session
where username = USER
and sid = p_sid
and serial# = p_serial# ;
if ( ignore = 1 )
then
dbms_sql.parse(cursor_name,'alter system disconnect session '''||p_sid||','||p_serial#||''' immediate',dbms_sql.native);
ignore := dbms_sql.execute(cursor_name);
else
raise_application_error( -20001,'You do not own session ''' ||p_sid || ',' || p_serial# ||'''' );
end if;
END pro_kill_myown_session_lhr;
/
create or replace public synonym pro_kill_session_lhr for sys.pro_kill_myown_session_lhr;
grant select on syn_myownersession_lhr to XXXXXX;
grant execute on pro_kill_session_lhr to XXXXXX;
SELECT USERENV('SID') FROM DUAL;
select sid,serial#,paddr,status from syn_myownersession_lhr where sid=19;
exec pro_kill_session_lhr(19,15);
1.1.1.2 ALTER SYSTEM DISCONNECT SESSION 说明
Alter system disconnect session 是一个可选的kill session 的方法。 与kill session 命令不同,disconnect session 命令会kill 掉 dedicated server process, 该命令等同于在操作系统级别kill 掉server process。
具体语法如下:
SQL> ALTER SYSTEM DISCONNECT SESSION'sid,serial#' POST_TRANSACTION;
SQL> ALTER SYSTEM DISCONNECT SESSION'sid,serial#' IMMEDIATE;
POST_TRANSACTION 选项会等待事务完成之后在断开连接。
IMMEDIATE 选项会立即断开连接,然后事务会进行recover操作。
这2个选项也可以一起使用,但是必须指定其中一个,否则就会报错:
SQL> alter system disconnect session'30,7';
alter system disconnect session '30,7'
*
ERROR at line 1:
ORA-02000: missing POST_TRANSACTION orIMMEDIATE keyword
SQL>
SQL> alter system disconnect session'15,12' post_transaction immediate;
System altered.
使用alter system disconnectsession 命令就不需要切换到系统来kill session,也从而减少了kill 错进程的几率。
1.1.1.3 PMON 清理间隔
PMON进程负责处理异常结束进程相关资源的释放。PMON周期性地被唤醒,可以对"_PKT_PMON_INTERVAL"这个隐藏参数来进行修改,默认为50秒。也可以通过查找出进程的PID,然后在oradebug中,执行命令oradebug wakeup orapid(oracle进程的PID,不是OS的PID)来手动唤醒PMON进程。可以用alter session set events '100246 trace name conext forever,level 4'来查看PMON的相关操作。
SYS@LHRDGZK1> SELECT a.INDX,
2 a.KSPPINM NAME,
3 a.KSPPDESC,
4 b.KSPPSTVL
5 FROM x$ksppi a,
6 x$ksppcv b
7 WHERE a.INDX = b.INDX
8 and lower(a.KSPPINM) like lower('%?meter%');
Enter value for parameter: _PKT_PMON_INTERVAL
old 8: and lower(a.KSPPINM) like lower('%?meter%')
new 8: and lower(a.KSPPINM) like lower('%_PKT_PMON_INTERVAL%')
INDX NAME KSPPDESC KSPPSTVL
----- ---------------------- ------------------------------------ --------------------
61 _pkt_pmon_interval PMON process clean-up interval (cs) 50
alter system set "_PKT_PMON_INTERVAL"=5;
1.1.1 MOS上的一些资料
于是metalink和google到以下一些资料:
Removing Sessions in Killed Status on Unix [ID 274216.1]
ALTER SYSTEM KILL Session Marked for Killed Forever [ID 1020720.102]
KILLING INACTIVE SESSIONS DOES NOT REMOVE SESSION ROW FROM V$SESSION [ID 1041427.6]
ALTER SYSTEM KILL SESSION does not Release Locks Killing a Thread on Windows NT [ID 100859.1]
http://blog.csdn.net/tianlesoftware/article/details/7417058
http://www.eygle.com/faq/Kill_Session.htm
---比较有用的
How To Find The Process Identifier (pid, spid) After The Corresponding Session Is Killed? [ID 387077.1]
HOW TO HAVE ORACLE CLEAN-UP OLD USER INFO AFTER KILLING SESSION UNDER MTS Note:1023442.6
---------------------------------------------------------------------------------------------------------------------
第二章 实验部分
二.1 实验环境介绍
项目 |
primary db |
db 类型 |
单实例 |
db version |
11.2.0.3.0 |
db 存储 |
ASM |
主机IP地址/hosts配置 |
192.168.59.129 |
OS版本及kernel版本 |
rhel 6.5 |
二.2 实验内容
实验序号 |
实验内容 |
1 |
设置用户profile的idle_time 参数 |
2 |
kill session的时候加immediate和不加的区别 |
3 |
授予普通用户kill自己用户的权限 |
4 |
KILLED状态的会话如何找到相关的SPID |
二.3 实验过程
二.3.1 设置用户profile的idle_time 参数
可以参考:
http://blog.csdn.net/leshami/article/details/9184917
http://blog.csdn.net/tianlesoftware/article/details/6238279
设置resource_limit 为true,该参数默认为false。
[oracle@orcltest ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Sun Jun 26 17:55:53 2016
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> show parameter RESOURCE_LIMIT
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
resource_limit boolean FALSE
SQL> alter system set resource_limit=true;
System altered.
为业务用户创建profile文件。然后把该配置文件赋给业务用户。
SQL> create profile pro_lhr limit idle_time 1;
Profile created.
SQL> alter user lhr profile pro_lhr;
User altered.
SQL>
SQL> set line 9999 pagesize 9999
SQL> select * from dba_profiles where profile='PRO_LHR';
PROFILE RESOURCE_NAME RESOURCE LIMIT
------------------------------ -------------------------------- -------- ----------------------------------------
PRO_LHR COMPOSITE_LIMIT KERNEL DEFAULT
PRO_LHR SESSIONS_PER_USER KERNEL DEFAULT
PRO_LHR CPU_PER_SESSION KERNEL DEFAULT
PRO_LHR CPU_PER_CALL KERNEL DEFAULT
PRO_LHR LOGICAL_READS_PER_SESSION KERNEL DEFAULT
PRO_LHR LOGICAL_READS_PER_CALL KERNEL DEFAULT
PRO_LHR IDLE_TIME KERNEL 1
PRO_LHR CONNECT_TIME KERNEL DEFAULT
PRO_LHR PRIVATE_SGA KERNEL DEFAULT
PRO_LHR FAILED_LOGIN_ATTEMPTS PASSWORD DEFAULT
PRO_LHR PASSWORD_LIFE_TIME PASSWORD DEFAULT
PRO_LHR PASSWORD_REUSE_TIME PASSWORD DEFAULT
PRO_LHR PASSWORD_REUSE_MAX PASSWORD DEFAULT
PRO_LHR PASSWORD_VERIFY_FUNCTION PASSWORD DEFAULT
PRO_LHR PASSWORD_LOCK_TIME PASSWORD DEFAULT
PRO_LHR PASSWORD_GRACE_TIME PASSWORD DEFAULT
16 rows selected.
SQL>
启动一个会话,等待一分钟
D:\Users\xiaomaimiao>sqlplus lhr/lhr@192.168.59.129/oratest
SQL*Plus: Release 11.2.0.1.0 Production on Sun Jun 26 18:03:09 2016
Copyright (c) 1982, 2010, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
LHR@192.168.59.129/oratest> set time on
18:03:12 LHR@192.168.59.129/oratest> SELECT a.SID,
18:03:37 2 b.SERIAL# ,
18:03:37 3 c.SPID,
18:03:37 4 b.status
18:03:37 5 FROM v$mystat a,
18:03:37 6 v$session b ,
18:03:37 7 v$process c
18:03:37 8 WHERE a.SID = b.SID
18:03:37 9 and b.PADDR=c.ADDR
18:03:37 10 AND rownum = 1;
SID SERIAL# SPID STATUS
---------- ---------- ------------------------ --------
19 9 14689 ACTIVE
1分钟后在其它会话窗口查询:
SQL> SELECT b.SID,
2 b.SERIAL# ,
3 c.SPID,
4 b.status
5 FROM v$session b ,
6 v$process c
7 WHERE b.PADDR=c.ADDR
8 AND b.sid=19;
SID SERIAL# SPID STATUS
---------- ---------- ------------------------ --------
19 9 14689 INACTIVE
SQL> /
SID SERIAL# SPID STATUS
---------- ---------- ------------------------ --------
19 9 14689 SNIPED
SQL>
SQL> alter system kill session '19,9' immediate;
System altered.
SQL> SELECT b.SID,
2 b.SERIAL# ,
3 c.SPID,
4 b.status
5 FROM v$session b ,
6 v$process c
7 WHERE b.PADDR=c.ADDR
8 AND b.sid=19;
no rows selected
SQL>
二.3.2 授予普通用户杀自己session的权限
创建普通用户lhrtest,授予resource和connect权限。
SQL> create user lhrtest identified by lhrtest;
User created.
SQL> grant resource ,connect to lhrtest;
Grant succeeded.
使用sys用户来创建需要的脚本:
SQL> create or replace view vw_myownersession_lhr
2 as
3 select * from v$session where username = USER;
View created.
SQL> create or replace public synonym syn_myownersession_lhr for sys.vw_myownersession_lhr;
Synonym created.
SQL> create or replace procedure pro_kill_myown_session_lhr( p_sid in varchar2,p_serial# in varchar2)
2 is
3 cursor_name pls_integer default dbms_sql.open_cursor;
4 ignore pls_integer;
5 BEGIN
6 select count(*) into ignore
7 from v$session
8 where username = USER
9 and sid = p_sid
10 and serial# = p_serial# ;
11
12 if ( ignore = 1 )
13 then
14 dbms_sql.parse(cursor_name,'alter system disconnect session '''||p_sid||','||p_serial#||''' immediate',dbms_sql.native);
15 ignore := dbms_sql.execute(cursor_name);
16 else
17 raise_application_error( -20001,'You do not own session ''' ||p_sid || ',' || p_serial# ||'''' );
18 end if;
19 END pro_kill_myown_session_lhr;
20 /
Procedure created.
SQL> create or replace public synonym pro_kill_session_lhr for sys.pro_kill_myown_session_lhr;
Synonym created.
SQL> grant select on syn_myownersession_lhr to lhrtest;
Grant succeeded.
SQL> grant execute on pro_kill_session_lhr to lhrtest;
Grant succeeded.
SQL>
用windows客户端登录一个会话:
D:\Users\xiaomaimiao>sqlplus lhrtest/lhrtest@192.168.59.129/oratest
SQL*Plus: Release 11.2.0.1.0 Production on Sun Jun 26 19:19:42 2016
Copyright (c) 1982, 2010, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
LHRTEST@192.168.59.129/oratest> SELECT USERENV('SID') FROM DUAL;
USERENV('SID')
--------------
19
LHRTEST@192.168.59.129/oratest> select sid,serial#,paddr,status from syn_myownersession_lhr where sid=19;
SID SERIAL# PADDR STATUS
---------- ---------- ---------------- --------
19 15 0000000077C9B870 ACTIVE
不要关闭19,15窗口,然后重新开一个会话窗口:
SQL> conn lhrtest/lhrtest
Connected.
SQL> show user
USER is "LHRTEST"
SQL> select sid,serial#,paddr,status from syn_myownersession_lhr where sid=19;
SID SERIAL# PADDR STATUS
---------- ---------- ---------------- --------
19 15 0000000077C9B870 INACTIVE
SQL> alter system kill session '19,15' immediate;
alter system kill session '19,15' immediate
*
ERROR at line 1:
ORA-01031: insufficient privileges
SQL> exec pro_kill_session_lhr(19,15);
PL/SQL procedure successfully completed.
SQL> select sid,serial#,paddr,status from syn_myownersession_lhr where sid=19;
no rows selected
SQL>
可以看到普通用户也可以杀掉自己用户的会话了。
二.3.3 kill session的时候加immediate和不加immediate的区别
kill session 的测试:
SQL> set line 9999
SQL> col sessionid format a20
SQL> col sessionid_killed format a20
SQL> col kill_session format a60
SQL>
SQL> SELECT a.INST_ID,
2 a.SID || ',' || a.SERIAL# || ',' ||
3 (select spid
4 from gv$process b
5 where b.INST_ID = a.INST_ID
6 and A.creator_addr = b.ADDR --and decode(a.status,'KILLED',A.creator_addr,A.PADDR) = b.ADDR
7 ) sessionid,
8 a.PADDR,
9 a.STATUS,
10 a.PROGRAM,
11 a.server,
12 'alter system disconnect session ''' || sid || ',' || serial# || ''' immediate;' kill_session
13 FROM gv$session a
14 WHERE a.type != 'BACKGROUND';
INST_ID SESSIONID PADDR STATUS PROGRAM SERVER KILL_SESSION
---------- -------------------- ---------------- -------- ---------------------------------------- --------- ------------------------------------------------------------
1 9,169,14901 0000000077C9B870 INACTIVE sqlplus.exe DEDICATED alter system disconnect session '9,169' immediate;
1 20,9,14891 0000000077CA5F50 INACTIVE plsqldev32.exe DEDICATED alter system disconnect session '20,9' immediate;
1 23,35,14885 0000000077C99710 INACTIVE sqlplus.exe DEDICATED alter system disconnect session '23,35' immediate;
1 125,7,14873 0000000077C98660 ACTIVE sqlplus@orcltest (TNS V1-V3) DEDICATED alter system disconnect session '125,7' immediate;
1 145,23,14651 0000000077CA9160 INACTIVE plsqldev32.exe DEDICATED alter system disconnect session '145,23' immediate;
SQL> alter system kill session '9,169';
System altered.
SQL> alter system kill session '20,9';
System altered.
SQL> set line 9999
SQL> col sessionid format a20
SQL> col sessionid_killed format a20
SQL> col kill_session format a60
SQL>
SQL> SELECT a.INST_ID,
2 a.SID || ',' || a.SERIAL# || ',' ||
3 (select spid
4 from gv$process b
5 where b.INST_ID = a.INST_ID
6 and A.creator_addr = b.ADDR --and decode(a.status,'KILLED',A.creator_addr,A.PADDR) = b.ADDR
7 ) sessionid,
8 a.PADDR,
9 a.STATUS,
10 a.PROGRAM,
11 a.server,
12 'alter system disconnect session ''' || sid || ',' || serial# || ''' immediate;' kill_session
13 FROM gv$session a
14 WHERE a.type != 'BACKGROUND';
INST_ID SESSIONID PADDR STATUS PROGRAM SERVER KILL_SESSION
---------- -------------------- ---------------- -------- ---------------------------------------- --------- ------------------------------------------------------------
1 9,169,14901 0000000077D2DCF8 KILLED sqlplus.exe PSEUDO alter system disconnect session '9,169' immediate;
1 20,9,14891 0000000077D2DCF8 KILLED plsqldev32.exe PSEUDO alter system disconnect session '20,9' immediate;
1 23,35,14885 0000000077C99710 INACTIVE sqlplus.exe DEDICATED alter system disconnect session '23,35' immediate;
1 125,7,14873 0000000077C98660 ACTIVE sqlplus@orcltest (TNS V1-V3) DEDICATED alter system disconnect session '125,7' immediate;
1 145,23,14651 0000000077CA9160 INACTIVE plsqldev32.exe DEDICATED alter system disconnect session '145,23' immediate;
SQL>
可以看到,杀掉的2个会话在v$session中都可以查到,只是①其status变为了KILLED,②server列变为了PSEUDO,③paddr列都变为了一样的了,因此这个时候如若还使用paddr列来关联process列必然查询不到spid号了,这个时候可以使用前边我们介绍的几种SQL来查询会话的spid,然后用kill -9杀掉会话。
我们用kill -9杀掉会话,等待后台PMON来自动清理进程,若是PMON很慢,我们可以手动来用oradebug wakeup 2来唤醒PMON进程,还可以设置PMON的清理间隔alter system set "_PKT_PMON_INTERVAL"=5;:
[oracle@orcltest ~]$ ps -ef|grep 14901
oracle 14901 1 0 19:51 ? 00:00:00 oracleoratest (LOCAL=NO)
oracle 14959 14625 0 20:17 pts/8 00:00:00 grep 14901
[oracle@orcltest ~]$ kill -9 14901
[oracle@orcltest ~]$ ps -ef|grep 14901
oracle 14961 14625 0 20:17 pts/8 00:00:00 grep 14901
[oracle@orcltest ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Sun Jun 26 20:17:54 2016
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> set line 9999
SQL> col sessionid format a20
SQL> col sessionid_killed format a20
SQL> col kill_session format a60
SQL>
SQL> SELECT a.INST_ID,
2 a.SID || ',' || a.SERIAL# || ',' ||
3 (select spid
4 from gv$process b
5 where b.INST_ID = a.INST_ID
6 and A.creator_addr = b.ADDR --and decode(a.status,'KILLED',A.creator_addr,A.PADDR) = b.ADDR
7 ) sessionid,
8 a.PADDR,
9 a.STATUS,
10 a.PROGRAM,
11 a.server,
12 'alter system disconnect session ''' || sid || ',' || serial# || ''' immediate;' kill_session
13 FROM gv$session a
14 WHERE a.type != 'BACKGROUND';
INST_ID SESSIONID PADDR STATUS PROGRAM SERVER KILL_SESSION
---------- -------------------- ---------------- -------- ------------------------------------------------ ----------- ---------------------------------------------------------
1 9,169,14901 0000000077D2DCF8 KILLED sqlplus.exe PSEUDO alter system disconnect session '9,169' immediate;
1 20,9,14891 0000000077D2DCF8 KILLED plsqldev32.exe PSEUDO alter system disconnect session '20,9' immediate;
1 23,35,14885 0000000077C99710 INACTIVE sqlplus.exe DEDICATED alter system disconnect session '23,35' immediate;
1 125,9,14964 0000000077C98660 ACTIVE sqlplus@orcltest (TNS V1-V3) DEDICATED alter system disconnect session '125,9' immediate;
SQL> select INST_ID, spid, program,A.PNAME,A.PID
2 from gv$process a
3 where a.PNAME='PMON';
INST_ID SPID PROGRAM PNAME PID
---------- ---------- ---------------------------------------- ----- ----------
1 13955 oracle@orcltest (PMON) PMON 2
SQL> oradebug wakeup 2
Statement processed.
SQL>
SQL> SET LINESIZE 180
SQL> COLUMN spid FORMAT A10
SQL> COLUMN username FORMAT A10
SQL> COLUMN program FORMAT A40
SQL> SELECT s.inst_id,
2 s.sid,
3 s.serial#,
4 p.spid,
5 s.username,
6 s.program,
7 s.paddr,
8 s.STATUS,
9 s.server
10 FROM gv$session s
11 left outer JOIN gv$process p
12 ON p.addr = s.paddr
13 AND p.inst_id = s.inst_id
14 WHERE s.type != 'BACKGROUND';
INST_ID SID SERIAL# SPID USERNAME PROGRAM PADDR STATUS SERVER
---------- ---------- ---------- ---------- ---------- ---------------------------------------- ---------------- -------- ---------
1 125 9 14964 SYS sqlplus@orcltest (TNS V1-V3) 0000000077C98660 ACTIVE DEDICATED
1 23 35 14885 SYS sqlplus.exe 0000000077C99710 INACTIVE DEDICATED
1 20 11 14966 SYS plsqldev32.exe 0000000077C9B870 INACTIVE DEDICATED
1 9 177 14968 SYS plsqldev32.exe 0000000077CA5F50 INACTIVE DEDICATED
SQL>
kill session immediate的测试:
SQL> set line 9999
SQL> col sessionid format a20
SQL> col sessionid_killed format a20
SQL> col kill_session format a60
SQL>
SQL> SELECT a.INST_ID,
2 a.SID || ',' || a.SERIAL# || ',' ||
3 (select spid
4 from gv$process b
5 where b.INST_ID = a.INST_ID
6 and A.creator_addr = b.ADDR --and decode(a.status,'KILLED',A.creator_addr,A.PADDR) = b.ADDR
7 ) sessionid,
8 a.PADDR,
9 a.STATUS,
10 a.PROGRAM,
11 a.server,
12 'alter system disconnect session ''' || sid || ',' || serial# || ''' immediate;' kill_session
13 FROM gv$session a
14 WHERE a.type != 'BACKGROUND';
INST_ID SESSIONID PADDR STATUS PROGRAM SERVER KILL_SESSION
---------- -------------------- ---------------- -------- ---------------------------------------- --------- ------------------------------------------------------------
1 9,169,14901 0000000077D2DCF8 KILLED sqlplus.exe PSEUDO alter system disconnect session '9,169' immediate;
1 20,9,14891 0000000077D2DCF8 KILLED plsqldev32.exe PSEUDO alter system disconnect session '20,9' immediate;
1 23,35,14885 0000000077C99710 INACTIVE sqlplus.exe DEDICATED alter system disconnect session '23,35' immediate;
1 125,7,14873 0000000077C98660 ACTIVE sqlplus@orcltest (TNS V1-V3) DEDICATED alter system disconnect session '125,7' immediate;
1 145,23,14651 0000000077CA9160 INACTIVE plsqldev32.exe DEDICATED alter system disconnect session '145,23' immediate;
SQL> alter system kill session '145,23' immediate;
System altered.
SQL> set line 9999
SQL> col sessionid format a20
SQL> col sessionid_killed format a20
SQL> col kill_session format a60
SQL>
SQL> SELECT a.INST_ID,
2 a.SID || ',' || a.SERIAL# || ',' ||
3 (select spid
4 from gv$process b
5 where b.INST_ID = a.INST_ID
6 and A.creator_addr = b.ADDR --and decode(a.status,'KILLED',A.creator_addr,A.PADDR) = b.ADDR
7 ) sessionid,
8 a.PADDR,
9 a.STATUS,
10 a.PROGRAM,
11 a.server,
12 'alter system disconnect session ''' || sid || ',' || serial# || ''' immediate;' kill_session
13 FROM gv$session a
14 WHERE a.type != 'BACKGROUND';
INST_ID SESSIONID PADDR STATUS PROGRAM SERVER KILL_SESSION
---------- -------------------- ---------------- -------- ---------------------------------------- --------- ------------------------------------------------------------
1 9,169,14901 0000000077D2DCF8 KILLED sqlplus.exe PSEUDO alter system disconnect session '9,169' immediate;
1 20,9,14891 0000000077D2DCF8 KILLED plsqldev32.exe PSEUDO alter system disconnect session '20,9' immediate;
1 23,35,14885 0000000077C99710 INACTIVE sqlplus.exe DEDICATED alter system disconnect session '23,35' immediate;
1 125,7,14873 0000000077C98660 ACTIVE sqlplus@orcltest (TNS V1-V3) DEDICATED alter system disconnect session '125,7' immediate;
SQL>
可以看到若加上immediate的话会话没有事务的情况下会立即释放,且v$session视图也被清理了。
---------------------------------------------------------------------------------------------------------------------
原文地址:【Kill】两条Linux命令彻底杀死Oracle 作者:secooler
$ ps -ef |grep $ORACLE_SID|grep -v grep|awk '{print $2}' | xargs kill -9
$ ipcs -m | grep oracle | awk '{print $2}' | xargs ipcrm shm
这两条命令的来由:
今天处理了三起Oracle数据库无故无法登录的问题,无论你采取什么手段都无法登陆到Oracle的SQL*PLus中,更谈不上故障排查了。
这种情况下有两种选择,第一种选择是经过一个较长的时间来排查故障原因,另外一种方法是不惜一切代价启动数据库以便尽快恢复生产。
如果是测试实验环境可以使用第一种方法来完成,对于排查问题的经验积累是有意的。
如果是生产环境,减少停机时间是最最至上的原则,所以我们只有不惜一切代价让数据库尽快恢复使用。
对于尽快恢复生产的方法主要也有两种。
第一种不惜代价的方法是重启服务器主机,彻底释放一切资源,重新来过,这种方法是有效的,不过停机时间还是有点长,而且技术含量比较低,所以不是很推荐使用这种超级“简单粗暴”的方法。
第二种方法就是我将要给大家展示的方法。
1.第一步:使用Linux的kill命令杀死所有与oracle有关的进程。
1)查询到与ORACLE_SID相关的oracle进程
$ ps -ef |grep $ORACLE_SID
oracle 7776 1 0 22:51 ? 00:00:00 ora_pmon_ora10g
oracle 7778 1 0 22:51 ? 00:00:00 ora_psp0_ora10g
oracle 7780 1 0 22:51 ? 00:00:00 ora_mman_ora10g
oracle 7782 1 0 22:51 ? 00:00:00 ora_dbw0_ora10g
oracle 7784 1 0 22:51 ? 00:00:00 ora_dbw1_ora10g
oracle 7786 1 0 22:51 ? 00:00:00 ora_lgwr_ora10g
oracle 7788 1 0 22:51 ? 00:00:00 ora_ckpt_ora10g
oracle 7790 1 0 22:51 ? 00:00:00 ora_smon_ora10g
oracle 7792 1 0 22:51 ? 00:00:00 ora_reco_ora10g
oracle 7794 1 0 22:51 ? 00:00:00 ora_cjq0_ora10g
oracle 7796 1 0 22:51 ? 00:00:00 ora_mmon_ora10g
oracle 7798 1 0 22:51 ? 00:00:00 ora_mmnl_ora10g
oracle 7832 1 0 22:51 ? 00:00:00 ora_arc0_ora10g
oracle 7834 1 0 22:51 ? 00:00:00 ora_arc1_ora10g
oracle 7836 1 0 22:51 ? 00:00:00 ora_qmnc_ora10g
oracle 7842 1 0 22:51 ? 00:00:00 ora_q000_ora10g
oracle 7847 1 0 22:52 ? 00:00:00 ora_q001_ora10g
oracle 7951 7592 0 23:11 pts/2 00:00:00 grep ora10g
2)去除掉包含grep命令本身的记录
$ ps -ef |grep $ORACLE_SID |grep -v grep
oracle 7776 1 0 22:51 ? 00:00:00 ora_pmon_ora10g
oracle 7778 1 0 22:51 ? 00:00:00 ora_psp0_ora10g
oracle 7780 1 0 22:51 ? 00:00:00 ora_mman_ora10g
oracle 7782 1 0 22:51 ? 00:00:00 ora_dbw0_ora10g
oracle 7784 1 0 22:51 ? 00:00:00 ora_dbw1_ora10g
oracle 7786 1 0 22:51 ? 00:00:00 ora_lgwr_ora10g
oracle 7788 1 0 22:51 ? 00:00:00 ora_ckpt_ora10g
oracle 7790 1 0 22:51 ? 00:00:00 ora_smon_ora10g
oracle 7792 1 0 22:51 ? 00:00:00 ora_reco_ora10g
oracle 7794 1 0 22:51 ? 00:00:00 ora_cjq0_ora10g
oracle 7796 1 0 22:51 ? 00:00:00 ora_mmon_ora10g
oracle 7798 1 0 22:51 ? 00:00:00 ora_mmnl_ora10g
oracle 7832 1 0 22:51 ? 00:00:00 ora_arc0_ora10g
oracle 7834 1 0 22:51 ? 00:00:00 ora_arc1_ora10g
oracle 7836 1 0 22:51 ? 00:00:00 ora_qmnc_ora10g
oracle 7842 1 0 22:51 ? 00:00:00 ora_q000_ora10g
oracle 7847 1 0 22:52 ? 00:00:00 ora_q001_ora10g
3)使用awk命令得到我们关心的进程号
$ ps -ef |grep $ORACLE_SID |grep -v grep|awk '{print $2}'
7776
7778
7780
7782
7784
7786
7788
7790
7792
7794
7796
7798
7832
7834
7836
7842
7847
4)万事俱备,我们最后使用kill命令将oracle的进程杀死,因此得到了下面完整的命令
$ ps -ef |grep $ORACLE_SID |grep -v grep|awk '{print $2}' | xargs kill -9
2.第二步:使用Linux的ipcs和ipcsrm命令释放oracle占用的共享内存。
1)使用ipcs命令查看系统中共享内存使用情况
$ ipcs -m
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 32768 root 644 72 2 dest
0x00000000 65537 root 644 16384 2 dest
0x00000000 98306 root 644 280 2 dest
0x00000000 131075 root 644 790528 2 dest
0x00000000 163844 root 644 790528 2 dest
0x00000000 196613 root 644 790528 2 dest
0x00000000 327689 oracle 644 790528 2 dest
0x00000000 360458 oracle 644 790528 2 dest
0x00000000 393227 oracle 644 790528 2 dest
0xecc5fba0 786447 oracle 640 5370806272 30
2)使用grep命令过滤后得到与oracle相关的内容
$ ipcs -m | grep oracle
0x00000000 327689 oracle 644 790528 2 dest
0x00000000 360458 oracle 644 790528 2 dest
0x00000000 393227 oracle 644 790528 2 dest
0xecc5fba0 786447 oracle 640 5370806272 31
2)使用awk命令获得上面我们关心的shmid字段内容
$ $ ipcs -m | grep oracle | awk '{print $2}'
327689
360458
393227
786447
3)最后使用ipcsrm命令释放共享内存
$ ipcs -m | grep oracle | awk '{print $2}' | xargs ipcrm shm
resource(s) deleted
再次查看一下,此时共享内存已经被释放。
$ ipcs -m | grep oracle
0x00000000 327689 oracle 644 790528 2 dest
0x00000000 360458 oracle 644 790528 2 dest
0x00000000 393227 oracle 644 790528 2 dest
0x00000000 786447 oracle 640 5370806272 31 dest
3.此时,我们便可以登录到数据库,最后启动数据库恢复生产。
4.小结
这种方法相对也是比较“粗暴”的。不过在危难之时还是可以派上用场。
重点强调:
手工杀掉oracle进程和手工释放共享内存是非常危险的,不到万不得已,千万不要使用;
这里我给出的两条极具杀伤力的命令,请不要轻易尝试。
Good luck.
-- The End --
disconnect session和kill session的区别
最常用的杀死oracle进程的方法是alter system kill session ‘sid,serial#’。
但是今天遇到一些意外,数据库有个从6月12号运行至今的sql,经客户确认需要将其干掉,登陆数据库执行alter system kill。
于是有了以下情形
SQL> alter system kill session '137,7818';
alter system kill session '137,7818'
*
ERROR at line 1:
ORA-00031: session marked for kill
SQL> select status,event from v$session where sid = 137;
STATUS EVENT
-------- ----------------------------------------------------------------
KILLED SQL*Net more data from dblink
SQL> select object_id,locked_mode,session_id from v$locked_object;
OBJECT_ID LOCKED_MODE SESSION_ID
---------- ----------- ----------
165 3 137
104489 3 137
212 3 137
SQL> select TYPE,LMODE,REQUEST,BLOCK from v$lock where sid=137;
TY LMODE REQUEST BLOCK
-- ---------- ---------- ----------
JQ 6 0 0
JI 6 0 0
TM 3 0 0
TM 3 0 0
TM 3 0 0
TX 6 0 0
SQL> select t.status, s.status from v$transaction t, v$session s where s.taddr = t.addr and s.sid=137;
STATUS STATUS
---------------- --------
ACTIVE KILLED
该session已经被标志为killed,但是其对应的transaction依旧为active,且对应的lock没有被释放;
又因为该instance由其他OS用户启动,当前登录的用户没有权限执行kill -9
ora_10@justin_$ ps -ef | grep 15616
ora_xxx 15616 1 0 Jul 06 ? 0:22 ora_j001_GLIMSP
ora_10 20035 17648 0 08:23:18 pts/7 0:00 grep 15616
ora_10@justin_$ kill -9 15616
kill: 15616: permission denied
不是太清楚到底发生了什么事情,但此时可使用disconnect session,请参考以下解释
The KILL SESSION command doesn’t actually kill the session. It merely asks the session to kill itself. In some situations, like waiting for a reply from a remote database or rolling back transactions, the session will not kill itself immediately and will wait for the current operation to complete. In these cases the session will have a status of “marked for kill”. It will then be killed as soon as possible.
The ALTER SYSTEM DISCONNECT SESSION syntax as an alternative method for killing Oracle sessions. Unlike the KILL SESSION command which asks the session to kill itself, the DISCONNECT SESSION command kills the dedicated server process (or virtual circuit when using Shared Sever), which is equivalent to killing the server process from the operating system. The basic syntax is similar to the KILL SESSION command with the addition of the POST_TRANSACTION clause. The SID and SERIAL# values of the relevant session can be substituted into one of the following statements.
The POST_TRANSACTION clause waits for ongoing transactions to complete before disconnecting the session, while the IMMEDIATE clause disconnects the session and ongoing transactions are recovered immediately.
http://fatihacar.com/blog/show-and-kill-transaction-lock-in-oracle/
SQL> alter system disconnect session '137,7818' immediate;
System altered.
SQL> select serial#,status,event from v$session where sid=137;
SERIAL# STATUS
---------- --------
EVENT
----------------------------------------------------------------
7822 ACTIVE
jobq slave wait
SQL> alter system disconnect session '137,7822' immediate;
System altered.
SQL> select serial#,status,event from v$session where sid=137;
no rows selected
SQL> select object_id,locked_mode,session_id from v$locked_object;
OBJECT_ID LOCKED_MODE SESSION_ID
---------- ----------- ----------
165 3 132
104489 3 132
212 3 132
SQL> select serial#,event,status,sql_id from v$session where sid=132;
SERIAL# EVENT
---------- ----------------------------------------------------------------
STATUS SQL_ID
-------- -------------
24231 jobq slave wait
ACTIVE
SQL> alter system disconnect session '132,24231' immediate;
System altered.
SQL> select object_id,locked_mode,session_id from v$locked_object;
no rows selected
--此时session被彻底清除,对应的lock也已释放
About Me
...............................................................................................................................
● 本文作者:小麦苗,只专注于数据库的技术,更注重技术的运用
● 本文在itpub(http://blog.itpub.net/26736162)、博客园(http://www.cnblogs.com/lhrbest)和个人微信公众号(xiaomaimiaolhr)上有同步更新
● 本文itpub地址:http://blog.itpub.net/26736162/viewspace-2121019/
● 本文博客园地址:http://www.cnblogs.com/lhrbest
● 本文pdf版及小麦苗云盘地址:http://blog.itpub.net/26736162/viewspace-1624453/
● 数据库笔试面试题库及解答:http://blog.itpub.net/26736162/viewspace-2134706/
● QQ群:230161599 微信群:私聊
● 联系我请加QQ好友(646634621),注明添加缘由
● 于 2016-06-15 10:00~ 2016-06-26 19:00 在魔都完成
● 文章内容来源于小麦苗的学习笔记,部分整理自网络,若有侵权或不当之处还请谅解
● 版权所有,欢迎分享本文,转载请保留出处
...............................................................................................................................
拿起手机使用微信客户端扫描下边的左边图片来关注小麦苗的微信公众号:xiaomaimiaolhr,扫描右边的二维码加入小麦苗的QQ群,学习最实用的数据库技术。