各位大牛,本人遇到一个pg锁的问题。
场景是这样的:有六个进程,一共开了六个连接,对pg同一张表做操作。但发现从昨天开始进程们都不在运作了。pg的日志暂时看不到,以下是查看pg_locks和pg_stat_activity的信息。
pg_locks表:只列出两个进程的锁情况,六个都是这样的。已经知道其中AccessShareLock是跟对应表中5个index和表本身相关的锁,ExclusiveLock不太明白是怎么产生的。
pg_stat_activity表:
请问各位:
1.只通过以上两张图和现象,能断定是否是死锁了?
2.如果死锁,如何能够进一步分析?目前现场保留的,没法等太久
3.对于ExclusiveLock我不是很理解,能否解释下
4.因为通篇也就涉及到这两种所,那么ExclusiveLock和AccessShareLock相互作用真的会死锁吗?能否进一步解释下或者列举某种场景。
any answer for any question will be appreciate.
HI,要搞清楚锁,首先要搞清楚,PG是有很多可以加锁的对象的,每种对象下面,再去看它的锁冲突。
可以加锁的对象:
LOCKTAG_RELATION, /* whole relation */
/* ID info for a relation is DB OID + REL OID; DB OID = 0 if shared */
LOCKTAG_RELATION_EXTEND, /* the right to extend a relation */
/* same ID info as RELATION */
LOCKTAG_PAGE, /* one page of a relation */
/* ID info for a page is RELATION info + BlockNumber */
LOCKTAG_TUPLE, /* one physical tuple */
/* ID info for a tuple is PAGE info + OffsetNumber */
LOCKTAG_TRANSACTION, /* transaction (for waiting for xact done) */
/* ID info for a transaction is its TransactionId */
LOCKTAG_VIRTUALTRANSACTION, /* virtual transaction (ditto) */
/* ID info for a virtual transaction is its VirtualTransactionId */
LOCKTAG_SPECULATIVE_TOKEN, /* speculative insertion Xid and token */
/* ID info for a transaction is its TransactionId */
LOCKTAG_OBJECT, /* non-relation database object */
/* ID info for an object is DB OID + CLASS OID + OBJECT OID + SUBID */
/*
* Note: object ID has same representation as in pg_depend and
* pg_description, but notice that we are constraining SUBID to 16 bits.
* Also, we use DB OID = 0 for shared objects such as tablespaces.
*/
LOCKTAG_USERLOCK, /* reserved for old contrib/userlock code */
LOCKTAG_ADVISORY /* advisory user locks */
锁模式
/* NoLock is not a lock mode, but a flag value meaning "don't get a lock" */
#define NoLock 0
#define AccessShareLock 1 /* SELECT */
#define RowShareLock 2 /* SELECT FOR UPDATE/FOR SHARE */
#define RowExclusiveLock 3 /* INSERT, UPDATE, DELETE */
#define ShareUpdateExclusiveLock 4 /* VACUUM (non-FULL),ANALYZE, CREATE
* INDEX CONCURRENTLY */
#define ShareLock 5 /* CREATE INDEX (WITHOUT CONCURRENTLY) */
#define ShareRowExclusiveLock 6 /* like EXCLUSIVE MODE, but allows ROW
* SHARE */
#define ExclusiveLock 7 /* blocks ROW SHARE/SELECT...FOR
* UPDATE */
#define AccessExclusiveLock 8 /* ALTER TABLE, DROP TABLE, VACUUM
* FULL, and unqualified LOCK TABLE */
然后我们说一下死锁,死锁相互等待造成的,你给的图没有等待,所以一定不是死锁。
最后提供一个查询锁和等待的方法给你,会比较好看。
用一个函数来将锁转换为数字,
postgres=# create or replace function f_lock_level(i_mode text) returns int as
$$
declare
begin
case i_mode
when 'INVALID' then return 0;
when 'AccessShareLock' then return 1;
when 'RowShareLock' then return 2;
when 'RowExclusiveLock' then return 3;
when 'ShareUpdateExclusiveLock' then return 4;
when 'ShareLock' then return 5;
when 'ShareRowExclusiveLock' then return 6;
when 'ExclusiveLock' then return 7;
when 'AccessExclusiveLock' then return 8;
else return 0;
end case;
end;
$$
language plpgsql strict;
修改查询语句,按锁级别排序:
with t_wait as
(select a.mode,a.locktype,a.database,a.relation,a.page,a.tuple,a.classid,a.objid,a.objsubid,
a.pid,a.virtualtransaction,a.virtualxid,a,transactionid,b.query,b.xact_start,b.query_start,
b.usename,b.datname from pg_locks a,pg_stat_activity b where a.pid=b.pid and not a.granted),
t_run as
(select a.mode,a.locktype,a.database,a.relation,a.page,a.tuple,a.classid,a.objid,a.objsubid,
a.pid,a.virtualtransaction,a.virtualxid,a,transactionid,b.query,b.xact_start,b.query_start,
b.usename,b.datname from pg_locks a,pg_stat_activity b where a.pid=b.pid and a.granted)
select r.locktype,r.mode r_mode,r.usename r_user,r.datname r_db,r.relation::regclass,r.pid r_pid,
r.page r_page,r.tuple r_tuple,r.xact_start r_xact_start,r.query_start r_query_start,
now()-r.query_start r_locktime,r.query r_query,w.mode w_mode,w.pid w_pid,w.page w_page,
w.tuple w_tuple,w.xact_start w_xact_start,w.query_start w_query_start,
now()-w.query_start w_locktime,w.query w_query
from t_wait w,t_run r where
r.locktype is not distinct from w.locktype and
r.database is not distinct from w.database and
r.relation is not distinct from w.relation and
r.page is not distinct from w.page and
r.tuple is not distinct from w.tuple and
r.classid is not distinct from w.classid and
r.objid is not distinct from w.objid and
r.objsubid is not distinct from w.objsubid and
r.transactionid is not distinct from w.transactionid and
r.pid <> w.pid
order by f_lock_level(w.mode)+f_lock_level(r.mode) desc,r.xact_start;
现在可以排在前面的就是锁级别高的等待,优先干掉这个。
-[ RECORD 1 ]-+---------------------------------------------------------------------
locktype | relation -- 冲突类型
r_mode | ShareUpdateExclusiveLock -- 持锁模式
r_user | postgres -- 持锁用户
r_db | postgres -- 持锁数据库
relation | tbl -- 持锁对象
r_pid | 25656 -- 持锁进程
r_xact_start | 2015-05-10 14:11:16.08318+08 -- 持锁事务开始时间
r_query_start | 2015-05-10 14:11:16.08318+08 -- 持锁SQL开始时间
r_locktime | 00:01:49.460779 -- 持锁时长
r_query | vacuum freeze tbl; -- 持锁SQL,注意不一定是这个SQL带来的锁,也有可能是这个事务在之前执行的SQL加的锁
w_mode | AccessExclusiveLock -- 等待锁模式
w_pid | 26731 -- 等待锁进程
w_xact_start | 2015-05-10 14:11:17.987362+08 -- 等待锁事务开始时间
w_query_start | 2015-05-10 14:11:17.987362+08 -- 等待锁SQL开始时间
w_locktime | 00:01:47.556597 -- 等待锁时长
w_query | truncate tbl; -- 等待锁SQL
-[ RECORD 2 ]-+---------------------------------------------------------------------
locktype | relation
r_mode | ShareUpdateExclusiveLock
r_user | postgres
r_db | postgres
relation | tbl
r_pid | 25656
r_xact_start | 2015-05-10 14:11:16.08318+08
r_query_start | 2015-05-10 14:11:16.08318+08
r_locktime | 00:01:49.460779
r_query | vacuum freeze tbl;
w_mode | RowExclusiveLock
w_pid | 25582
w_xact_start | 2015-05-10 14:11:22.845+08
w_query_start | 2015-05-10 14:11:22.845+08
w_locktime | 00:01:42.698959
w_query | insert into tbl(crt_time) select now() from generate_series(1,1000); -- 这个SQL其实等待的是truncate tbl的锁;
......
版权声明:本文内容由阿里云实名注册用户自发贡献,版权归原作者所有,阿里云开发者社区不拥有其著作权,亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容,填写侵权投诉表单进行举报,一经查实,本社区将立刻删除涉嫌侵权内容。