问题追溯流程
死锁问题 → 共享间隙锁引起的死锁 → 如何产生共享间隙锁 → 何时产生的隐式锁转换
问题现象
在一个事务内只会锁一行的数据,没有锁多行数据才会出现的顺序问题,但是会偶尔报个Deadlock
事务内sql执行顺序如下:
前提
数据库隔离级别 为 RC
建表语句:
CREATE TABLE `user_money_account_test` ( `id` bigint(20) NOT NULL COMMENT '主键', `userId` bigint(20) NOT NULL COMMENT '用户id', `accountTypeId` int(11) NOT NULL COMMENT '账户类型id', `currencyId` int(11) NOT NULL COMMENT '货币id', `moneyBalance` decimal(30,4) NOT NULL COMMENT '货币余额', `createdTime` bigint(20) NOT NULL, `updatedTime` bigint(20) NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `userId_accountTypeId` (`userId`,`accountTypeId`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 dbpartition by hash(`userId`) tbpartition by hash(`userId`) tbpartitions 4;
问题初探之一 —— 死锁日志
------------------------ LATEST DETECTED DEADLOCK ------------------------ 2023-11-03 15:54:57 0x7f556ddca700 *** (1) TRANSACTION: TRANSACTION 7103074779, ACTIVE 0 sec starting index read mysql tables in use 1, locked 1 LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s) MySQL thread id 1528, OS thread handle 140005316216576, query id 109102299 100.104.192.64 athyxrnr statistics /*DRDS /10.11.43.140/117db74e43c34001-4/ */SELECT `user_money_account`.`id`, `user_money_account`.`userId`, `user_money_account`.`accountTypeId`, `user_money_account`.`currencyId`, `user_money_account`.`moneyBalance`, `user_money_account`.`createdTime`, `user_money_account`.`updatedTime` FROM `user_money_account_eGz3_008` AS `user_money_account` WHERE ((`user_money_account`.`userId` = 474232840) AND (`user_money_account`.`accountTypeId` = 202)) FOR UPDATE *** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 526 page no 8711 n bits 704 index userId_accountTypeId of table `bolt_money_account_e5l6_0001`.`user_money_account_egz3_008` trx id 7103074779 lock_mode X locks rec but not gap waiting Record lock, heap no 318 PHYSICAL RECORD: n_fields 3; compact format; info bits 0 0: len 8; hex 800000001c443808; asc D8 ;; 1: len 4; hex 800000ca; asc ;; 2: len 8; hex 8139b4e738c01045; asc 9 8 E;; *** (2) TRANSACTION: TRANSACTION 7103074777, ACTIVE 0 sec starting index read mysql tables in use 1, locked 1 3 lock struct(s), heap size 1136, 2 row lock(s) MySQL thread id 6870, OS thread handle 140004892124928, query id 109102300 100.104.192.122 athyxrnr statistics /*DRDS /10.11.107.138/117db74e42312000-4/ */SELECT `user_money_account`.`id`, `user_money_account`.`userId`, `user_money_account`.`accountTypeId`, `user_money_account`.`currencyId`, `user_money_account`.`moneyBalance`, `user_money_account`.`createdTime`, `user_money_account`.`updatedTime` FROM `user_money_account_eGz3_008` AS `user_money_account` WHERE ((`user_money_account`.`userId` = 474232840) AND (`user_money_account`.`accountTypeId` = 202)) FOR UPDATE *** (2) HOLDS THE LOCK(S): RECORD LOCKS space id 526 page no 8711 n bits 704 index userId_accountTypeId of table `bolt_money_account_e5l6_0001`.`user_money_account_egz3_008` trx id 7103074777 lock mode S Record lock, heap no 318 PHYSICAL RECORD: n_fields 3; compact format; info bits 0 0: len 8; hex 800000001c443808; asc D8 ;; 1: len 4; hex 800000ca; asc ;; 2: len 8; hex 8139b4e738c01045; asc 9 8 E;; *** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 526 page no 8711 n bits 704 index userId_accountTypeId of table `bolt_money_account_e5l6_0001`.`user_money_account_egz3_008` trx id 7103074777 lock_mode X locks rec but not gap waiting Record lock, heap no 318 PHYSICAL RECORD: n_fields 3; compact format; info bits 0 0: len 8; hex 800000001c443808; asc D8 ;; 1: len 4; hex 800000ca; asc ;; 2: len 8; hex 8139b4e738c01045; asc 9 8 E;
问题一:
答:Tx1获取到了某行数据的间隙共享锁,之后Tx1和Tx2都去获取该行的排他行锁导致的死锁
问题探究之二:没有显式地lock in share mode 为什么会有共享间隙锁?
问题2.1 同时进行的只有两个事务吗?
问题2.1:答:第三个事务可能是产生共享间隙锁的关键
事务内sql语句还原:
时间点1时的锁状态:
select * from information_schema.innodb_trx;
select * from information_schema.innodb_locks;
select * from information_schema.innodb_lock_waits;
时间点2时的锁状态:
select * from information_schema.innodb_trx;
select * from information_schema.innodb_locks;
select * from information_schema.innodb_lock_waits;
问题二:答:同时进行的两个事务同时insert带来的共享间隙锁
问题追问之三 —— 为什么同时进行的两个事务同时insert会产生共享间隙锁
问题三 答 ——
INSERT操作在插入或更新记录时,检查到Duplicate key(或者有一个被标记删除的duplicate key),对于普通的INSERT/UPDATE,会加LOCK_S锁,而对于类似REPLACE INTO或者INSERT ..ON DUPLICATE这样的SQL加的是X锁。而针对不同的索引类型也有所不同:
- 对于聚集索引(参阅函数
row_ins_duplicate_error_in_clust
),隔离级别小于等于RC时,加的是LOCK_REC_NOT_GAP
类似的S或者X记录锁。否则加LOCK_ORDINARY
类型的记录锁(NEXT-KEY LOCK)- 对于二级唯一索引,若检查到重复键,当前版本总是加LOCK_ORDINARY类型的记录锁(函数
row_ins_scan_sec_index_for_duplicate
)。实际上按照RC的设计理念,不应该加GAP锁(bug#68021),官方也事实上尝试修复过一次,即对于RC隔离级别加上LOCK_REC_NOT_GAP,但却引入了另外一个问题,导致二级索引的唯一约束失效(bug#73170
),由于这个严重bug,官方很快又把这个fix给revert掉了。
问题扩展之四 —— sql执行时机发生变化时的效果
会产生什么结果呢??
一句话结论
Tx1和Tx2两个事务并行insert相同唯一索引的数据,导致先执行insert的Tx1获得了排他锁,Tx2等待获得共享锁;
在Tx1释放排他锁的时候,Tx2拿到了共享间隙锁,但此时另一个事务Tx3请求该行的排他锁,被阻塞;
之后Tx2也去请求该行的排他锁,至此,形成了Tx2和Tx3组成的环形等待,形成死锁
解决办法
将长事务拆分成多个小事务,不要在一个事务内对同一行数据既insert,又select ... for update
其他扩展问题
隔离级别为RR会有什么不同吗?
文章内容收录到个人网站,方便阅读:hardyfish.top/