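5. Lock checks
5.1 The lock_sec_rec_read_check_and_lock function
This function is used mainly to take explicit locks during the data lookup phase on a secondary index. For update/delete, the rows to modify must be found first, and before a record is locked InnoDB has to check whether it already carries an implicit lock. Secondary index records do not contain a trx id, so the first check compares the page's max trx id with the id of the smallest active read-write transaction. If the page's max trx id is greater than or equal to that id, an implicit lock may exist, and a precise check through the primary key (a lookup back into the clustered index) is required. That precise check is expensive, which is why it is gated by the following test: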
lock_sec_rec_read_check_and_lock:
if ((page_get_max_trx_id(block->frame) >= trx_rw_min_trx_id()
     || recv_recovery_is_on())
    && !page_rec_is_supremum(rec)) {
        /* lock_rec_convert_impl_to_expl is entered only when the conditions above hold */
        lock_rec_convert_impl_to_expl(block, rec, index, offsets);
}
The call chain is:
->lock_rec_convert_impl_to_expl
 ->lock_sec_rec_some_has_impl
  ->row_vers_impl_x_locked  (the clustered index lookup happens here; the record is located from the secondary index record via btr_cur_search_to_nth_level)
   ->row_vers_impl_x_locked_low  (finally enters row_vers_impl_x_locked_low for the core check)
The stack is:
#0 row_vers_impl_x_locked_low (clust_rec=0x7fff39a21226 "\200", clust_index=0x7ffeb5092680, rec=0x7fff39a2ac30 "\200", index=0x7ffeb5093610, offsets=0x7fffe8461730, mtr=0x7fffe8460e90) at /home/mysql/soft/percona-server-5.7.29-32/storage/innobase/row/row0vers.cc:101
#1 0x0000000001b2c84e in row_vers_impl_x_locked (rec=0x7fff39a2ac30 "\200", index=0x7ffeb5093610, offsets=0x7fffe8461730) at /home/mysql/soft/percona-server-5.7.29-32/storage/innobase/row/row0vers.cc:390
#2 0x00000000019e8448 in lock_sec_rec_some_has_impl (rec=0x7fff39a2ac30 "\200", index=0x7ffeb5093610, offsets=0x7fffe8461730) at /home/mysql/soft/percona-server-5.7.29-32/storage/innobase/lock/lock0lock.cc:1276
#3 0x00000000019f339a in lock_rec_convert_impl_to_expl (block=0x7fff38d94ca0, rec=0x7fff39a2ac30 "\200", index=0x7ffeb5093610, offsets=0x7fffe8461730) at /home/mysql/soft/percona-server-5.7.29-32/storage/innobase/lock/lock0lock.cc:6124
#4 0x00000000019f3dd2 in lock_sec_rec_read_check_and_lock (flags=0, block=0x7fff38d94ca0, rec=0x7fff39a2ac30 "\200", index=0x7ffeb5093610, offsets=0x7fffe8461730, mode=LOCK_X, gap_mode=1024, thr=0x7ffeb4c89358) at /home/mysql/soft/percona-server-5.7.29-32/storage/innobase/lock/lock0lock.cc:6357
#5 0x0000000001af7271 in sel_set_rec_lock (pcur=0x7ffeb4c887d8, rec=0x7fff39a2ac30 "\200", index=0x7ffeb5093610, offsets=0x7fffe8461730, mode=3, type=1024, thr=0x7ffeb4c89358, mtr=0x7fffe8461a50) at /home/mysql/soft/percona-server-5.7.29-32/storage/innobase/row/row0sel.cc:1278
#6 0x0000000001b00049 in row_search_mvcc (buf=0x7ffeb4977070 "\370\211\037", mode=PAGE_CUR_GE, prebuilt=0x7ffeb4c885b0, match_mode=1, direction=1) at /home/mysql/soft/percona-server-5.7.29-32/storage/innobase/row/row0sel.cc:5710
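The gate in lock_sec_rec_read_check_and_lock can be restated as a small predicate. This is only a sketch of the condition quoted above: SecPageState and needs_impl_lock_check are hypothetical names invented for illustration, not InnoDB types or functions.

```cpp
#include <cstdint>

using trx_id_t = std::uint64_t;

// Hypothetical stand-in for what the check reads from the page and cursor.
struct SecPageState {
    trx_id_t page_max_trx_id;  // max trx id stored on the secondary index page
    bool     rec_is_supremum;  // true when the record is the supremum pseudo-record
};

// Returns true when the expensive path (lock_rec_convert_impl_to_expl and the
// clustered index lookup behind it) would be entered.
bool needs_impl_lock_check(const SecPageState& page,
                           trx_id_t min_active_rw_trx_id,
                           bool recovery_is_on) {
    if (page.rec_is_supremum) {
        return false;  // the supremum record never carries an implicit lock
    }
    // An implicit lock is possible only if some still-active read-write
    // transaction could have modified this page, or during crash recovery.
    return page.page_max_trx_id >= min_active_rw_trx_id || recovery_is_on;
}
```

For example, a page whose max trx id is 100 is skipped when the smallest active read-write transaction is 200, while a page stamped with 200 or higher goes through the full check.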
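Note, however, that max trx id is maintained only on secondary index pages, and it is refreshed every time a row on the page is modified. One consequence: when a statement deletes several records from the same secondary index in a row (delete from testimp4 where b=7700;), every record except the first enters row_vers_impl_x_locked_low, because delete finds a row, locks it, and modifies it one row at a time, and each modification updates the page's max trx id.
update behaves differently. If the update changes the value of the secondary index itself (for example: update testimp4 set b=1500 where b=1800;), it normally enters the Searching rows for update state first, storing the rows to change in a temporary file, and then applies the changes in a batch in the updating state. The problem does not arise, because the check is made in the data lookup phase, not the data modification phase. A statement such as update testimp2 set c='a' where b=1800 does not trigger it either: the records of index b never change, so the max trx id of index b's pages is never modified. update therefore avoids frequent entry into row_vers_impl_x_locked_low, whereas delete cannot.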
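The difference between the two access patterns can be seen in a toy model. The sketch below is not InnoDB code: ToyPage, gate, and modify are hypothetical names, a single page is assumed, and the only state tracked is the page's max trx id. It counts how often the lookup-phase gate would send a record into the expensive check.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using trx_id_t = std::uint64_t;

// Toy model of one secondary index page; only its max trx id is tracked.
struct ToyPage {
    trx_id_t max_trx_id;
};

// Lookup-phase gate: true means the expensive check would be entered.
bool gate(const ToyPage& page, trx_id_t min_active_rw_trx_id) {
    return page.max_trx_id >= min_active_rw_trx_id;
}

// Modifying a record on the page stamps the page with the modifier's trx id.
void modify(ToyPage& page, trx_id_t trx_id) {
    page.max_trx_id = std::max(page.max_trx_id, trx_id);
}

// delete pattern: find a row, lock it, modify it, then move to the next row.
int expensive_checks_row_by_row(ToyPage page, trx_id_t trx_id,
                                trx_id_t min_active_rw_trx_id, int rows) {
    assert(min_active_rw_trx_id <= trx_id);  // our own transaction is active
    int expensive = 0;
    for (int i = 0; i < rows; ++i) {
        if (gate(page, min_active_rw_trx_id)) ++expensive;
        modify(page, trx_id);
    }
    return expensive;
}

// update pattern when the indexed column changes: find all rows first,
// then modify them in a batch.
int expensive_checks_two_phase(ToyPage page, trx_id_t trx_id,
                               trx_id_t min_active_rw_trx_id, int rows) {
    assert(min_active_rw_trx_id <= trx_id);
    int expensive = 0;
    for (int i = 0; i < rows; ++i) {
        if (gate(page, min_active_rw_trx_id)) ++expensive;
    }
    for (int i = 0; i < rows; ++i) {
        modify(page, trx_id);
    }
    return expensive;
}
```

With a page last stamped by trx 50, a transaction with id 100 that is also the smallest active read-write transaction, and 5 matching rows, the row-by-row pattern passes the gate for 4 of the 5 rows, while the two-phase pattern never does.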
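The logic in row_vers_impl_x_locked_low for deciding whether a secondary index record carries an implicit lock is fairly complex and covers many cases, so it is not described here. Returning to the problem we saw at the beginning: execution had already entered row_vers_impl_x_locked_low, so either the delete statement modified multiple rows (although the code line numbers suggest this is not the case), or an earlier statement in the same transaction had already modified the records this statement touches, which makes the precise check necessary.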
5.2 lock_sec_rec_modify_check_and_lock
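Used mainly to take implicit locks during the data modification phase. This is the locking that occurs when a secondary index is maintained passively because the row changed (the update modified a column contained in this secondary index, or the primary key stored at its tail). Note that for a select ... for update whose where condition is on the primary key, no check is made for implicit locks on the secondary index; if a conflict occurs, the statement blocks on the primary key.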
5.3 lock_clust_rec_read_check_and_lock
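Takes explicit locks during the data lookup phase, either when rows are found directly through the primary key or when a secondary index access goes back to the clustered index. Before locking, InnoDB must check whether an implicit lock exists. Because a clustered index record contains the trx id pseudo-column, the check is simply whether the transaction identified by the record's trx id is still active. This is cheap, so it runs for every row that is locked; in other words, lock_rec_convert_impl_to_expl is called every time, as follows:
lock_clust_rec_read_check_and_lock
 ->lock_rec_convert_impl_to_expl
  ->lock_clust_rec_some_has_impl  (the primary key check is very simple)
The stack is: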
#0 lock_clust_rec_some_has_impl (rec=0x7fff05ad40db "\200", index=0x7ffe8802ce70, offsets=0x7fffe8461660) at /home/mysql/soft/percona-server-5.7.29-32/storage/innobase/include/lock0priv.ic:69
#1 0x00000000019f3333 in lock_rec_convert_impl_to_expl (block=0x7fff050a0950, rec=0x7fff05ad40db "\200", index=0x7ffe8802ce70, offsets=0x7fffe8461660) at /home/mysql/soft/percona-server-5.7.29-32/storage/innobase/lock/lock0lock.cc:6118
#2 0x00000000019f418d in lock_clust_rec_read_check_and_lock (flags=0, block=0x7fff050a0950, rec=0x7fff05ad40db "\200", index=0x7ffe8802ce70, offsets=0x7fffe8461660, mode=LOCK_X, gap_mode=1024, thr=0x7ffeb49903c8) at /home/mysql/soft/percona-server-5.7.29-32/storage/innobase/lock/lock0lock.cc:6430
#3 0x0000000001af7193 in sel_set_rec_lock (pcur=0x7ffeb498fe38, rec=0x7fff05ad40db "\200", index=0x7ffe8802ce70, offsets=0x7fffe8461660, mode=3, type=1024, thr=0x7ffeb49903c8, mtr=0x7fffe8461980) at /home/mysql/soft/percona-server-5.7.29-32/storage/innobase/row/row0sel.cc:1263
#4 0x0000000001b00049 in row_search_mvcc (buf=0x7ffeb498f380 "\371\005", mode=PAGE_CUR_GE, prebuilt=0x7ffeb498fc10, match_mode=1, direction=0) at /home/mysql/soft/percona-server-5.7.29-32/storage/innobase/row/row0sel.cc:5710
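To show why the clustered index check is cheap, here is a minimal sketch of the idea: read the trx id stored in the record and ask whether that transaction is still active. ClustRec and the std::set of active ids are assumptions made for illustration; the real function works on the physical record and InnoDB's own transaction structures.

```cpp
#include <cstdint>
#include <set>

using trx_id_t = std::uint64_t;

// Hypothetical view of a clustered index record: only the trx id
// pseudo-column matters for this check.
struct ClustRec {
    trx_id_t db_trx_id;  // id of the transaction that last modified the row
};

// The record carries an implicit lock exactly when its last modifier is
// still an active read-write transaction. No index lookup is needed.
bool clust_rec_has_implicit_lock(const ClustRec& rec,
                                 const std::set<trx_id_t>& active_rw_trx_ids) {
    return active_rw_trx_ids.count(rec.db_trx_id) > 0;
}
```

Contrast this with the secondary index case, where the same answer requires locating the clustered index record first.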
5.4 lock_clust_rec_modify_check_and_lock
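Takes implicit locks during the data modification phase on the primary key. So far I have observed it when the primary key value is updated directly or on delete, but in those cases the primary key record has in fact already been explicitly locked during the data lookup phase.
6. update is not exactly equivalent to delete && insert
The cases break down as follows:
- Primary key update, entry point row_upd_clust_step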
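row_upd_changes_ord_field_binary  checks whether the clustered index key value is updated
  if it is updated
   -> row_upd_clust_rec_by_insert  delete and insert on the primary key (sets the del flag)
  if it is not updated
   -> row_upd_clust_rec
     -> btr_cur_optimistic_update  only the optimistic update is considered here
       -> row_upd_changes_field_size_or_external  checks whether the update changes the size of a field or touches an externally stored field
         if no
          -> btr_cur_update_in_place  in-place update
         if yes
          -> page_cur_delete_rec  the primary key record must be deleted (a real delete, not setting the del flag)
          -> btr_cur_insert_if_possible  insert
- Secondary index update, entry point row_upd_sec_step: always delete and insert (sets the del flag)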
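The decision tree above can be condensed into a small function. This is a sketch of the branching only, under the same restriction as the text (optimistic path only); UpdatePath, ClustUpdate, and the two functions are hypothetical names, not InnoDB identifiers.

```cpp
// How an update is physically carried out, per the decision tree above.
enum class UpdatePath {
    DeleteMarkAndInsert,      // set the del flag on the old record, insert a new one
    InPlace,                  // overwrite the record where it is
    PhysicalDeleteAndInsert   // really remove the record from the page, then insert
};

// Hypothetical summary of what the update vector does to the clustered record.
struct ClustUpdate {
    bool changes_ordering_field;          // the primary key value itself changes
    bool changes_field_size_or_external;  // a field size changes or an external field is touched
};

// Clustered index: three possible outcomes.
UpdatePath clust_update_path(const ClustUpdate& u) {
    if (u.changes_ordering_field) {
        return UpdatePath::DeleteMarkAndInsert;
    }
    if (u.changes_field_size_or_external) {
        return UpdatePath::PhysicalDeleteAndInsert;
    }
    return UpdatePath::InPlace;
}

// Secondary index: always delete-mark plus insert.
UpdatePath sec_update_path() {
    return UpdatePath::DeleteMarkAndInsert;
}
```

Only the first and last outcomes look like "delete && insert" from the outside, and even those two differ in whether the old record is delete-marked or physically removed.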
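7. The unit of History list length
History list length is in fact a counter of update undo logs (not insert undo logs), and a transaction has only one undo log. Its source is trx_sys->rseg_history_len, which is updated when a transaction commits, regardless of how large the transaction is. Because many internal transactions exist, this value can be far larger than the number of observable transactions. The stack is: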
#0 trx_purge_add_update_undo_to_history (trx=0x7fffeac7df50, undo_ptr=0x7fffeac7e370, undo_page=0x7fff2837c000 "\373\252\223T", update_rseg_history_len=true, n_added_logs=1, mtr=0x7fffe8399830) at /home/mysql/soft/percona-server-5.7.29-32/storage/innobase/trx/trx0purge.cc:354
#1 0x0000000001b9c064 in trx_undo_update_cleanup (trx=0x7fffeac7df50, undo_ptr=0x7fffeac7e370, undo_page=0x7fff2837c000 "\373\252\223T", update_rseg_history_len=true, n_added_logs=1, mtr=0x7fffe8399830) at /home/mysql/soft/percona-server-5.7.29-32/storage/innobase/trx/trx0undo.cc:1970
#2 0x0000000001b8b639 in trx_write_serialisation_history (trx=0x7fffeac7df50, mtr=0x7fffe8399830) at /home/mysql/soft/percona-server-5.7.29-32/storage/innobase/trx/trx0trx.cc:1684
#3 0x0000000001b8c9b0 in trx_commit_low (trx=0x7fffeac7df50, mtr=0x7fffe8399830) at /home/mysql/soft/percona-server-5.7.29-32/storage/innobase/trx/trx0trx.cc:2184
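A toy counter makes the unit concrete. The sketch assumes, as the text does, that a transaction contributes at most one update undo log; ToyTrx and history_len_after are hypothetical names, and purge (which would decrease the counter) is deliberately left out.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical summary of a committed transaction.
struct ToyTrx {
    int rows_inserted;            // covered by the insert undo log, not counted
    int rows_updated_or_deleted;  // covered by the update undo log
};

// Returns the counter value after all transactions commit, starting from 0.
// Each transaction adds 1 if it has an update undo log, no matter how many
// rows it changed; insert-only transactions add nothing.
std::uint64_t history_len_after(const std::vector<ToyTrx>& committed) {
    std::uint64_t rseg_history_len = 0;
    for (const ToyTrx& trx : committed) {
        if (trx.rows_updated_or_deleted > 0) {
            ++rseg_history_len;
        }
    }
    return rseg_history_len;
}
```

In this model a transaction that updates one million rows and a transaction that deletes a single row each add exactly 1, while a bulk insert adds 0.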
That is a fairly miscellaneous pile of notes; they are recorded here for future reference.
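Appendix 1: Function interfaces
1. read view
- MVCC::view_open: creates a read view
- ReadView::prepare: prepares the values in the read view
- ReadView::complete: writes the values into the read view
- MVCC::view_close: releases the read view
2. Visibility checks
- lock_clust_rec_cons_read_sees: visibility check for the primary key
- lock_sec_rec_cons_read_sees: visibility check for a secondary index
Appendix 2: Function details
1. read view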
/** The read should not see any transaction with trx id >= this
value. In other words, this is the "high water mark". */
trx_id_t        m_low_limit_id;

/** The read should see all trx ids which are strictly
smaller (<) than this value. In other words, this is the
"low water mark". */
trx_id_t        m_up_limit_id;

/** trx id of creating transaction, set to TRX_ID_MAX for free
views. */
trx_id_t        m_creator_trx_id;

/** Set of RW transactions that was active when this snapshot
was taken */
ids_t           m_ids;

/** The view does not need to see the undo logs for transactions
whose transaction number is strictly smaller (<) than this value:
they can be removed in purge if not needed by other views */
trx_id_t        m_low_limit_no;
void
ReadView::prepare(trx_id_t id)
{
        ut_ad(!m_cloned);
        ut_ad(mutex_own(&trx_sys->mutex));

        m_creator_trx_id = id;

        m_low_limit_no = m_low_limit_id = trx_sys->max_trx_id;

        if (!trx_sys->rw_trx_ids.empty()) {
                copy_trx_ids(trx_sys->rw_trx_ids);
        } else {
                m_ids.clear();
        }

        if (UT_LIST_GET_LEN(trx_sys->serialisation_list) > 0) {
                const trx_t*    trx;

                trx = UT_LIST_GET_FIRST(trx_sys->serialisation_list);

                if (trx->no < m_low_limit_no) {
                        m_low_limit_no = trx->no;
                }
        }
}

void
ReadView::complete()
{
        ut_ad(!m_cloned);

        /* The first active transaction has the smallest id. */
        m_up_limit_id = !m_ids.empty() ? m_ids.front() : m_low_limit_id;

        ut_ad(m_up_limit_id <= m_low_limit_id);

        m_closed = false;
}
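The fields and the prepare/complete steps above are enough to build a small model of how a read view decides visibility. The sketch below is a simplified stand-in, not the real class: ToyReadView and open_view are hypothetical names, and the visibility rule is written out from the field comments (below the low water mark or own changes: visible; at or above the high water mark: invisible; in between: visible only if the id was not active when the snapshot was taken).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using trx_id_t = std::uint64_t;

struct ToyReadView {
    trx_id_t low_limit_id;      // high water mark: ids >= this are never visible
    trx_id_t up_limit_id;       // low water mark: ids < this are always visible
    trx_id_t creator_trx_id;    // the view's own transaction
    std::vector<trx_id_t> ids;  // active read-write ids at snapshot time, sorted

    bool changes_visible(trx_id_t id) const {
        if (id < up_limit_id || id == creator_trx_id) return true;
        if (id >= low_limit_id) return false;
        // Between the two marks: visible only if already committed, that is,
        // not among the transactions active when the snapshot was taken.
        return !std::binary_search(ids.begin(), ids.end(), id);
    }
};

// Mirrors prepare() + complete(): next_trx_id plays the role of
// trx_sys->max_trx_id, active_rw_ids the role of trx_sys->rw_trx_ids.
ToyReadView open_view(trx_id_t creator, trx_id_t next_trx_id,
                      std::vector<trx_id_t> active_rw_ids) {
    std::sort(active_rw_ids.begin(), active_rw_ids.end());
    ToyReadView view;
    view.creator_trx_id = creator;
    view.low_limit_id = next_trx_id;
    view.ids = std::move(active_rw_ids);
    view.up_limit_id = view.ids.empty() ? view.low_limit_id : view.ids.front();
    assert(view.up_limit_id <= view.low_limit_id);
    return view;
}
```

With creator 12, next id 20, and active ids {10, 12, 15}, the view sees changes by 5 (below the low water mark), 12 (its own), and 13 (committed before the snapshot), but not those by 10, 15, or 20.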
2. Visibility checks
Visibility check when a secondary index access goes back to the clustered index:
Row_sel_get_clust_rec_for_mysql::operator()
 ->lock_clust_rec_cons_read_sees  (after the lookup, visibility is decided from the primary key record)
 ->row_sel_build_prev_vers_for_mysql  (builds the previous version)
  ->row_vers_build_for_consistent_read  (loops, building versions until the condition is met or the previous version is NULL)

        if (view->changes_visible(trx_id, index->table->name)) {

                /* The view already sees this version: we can copy
                it to in_heap and return */

                buf = static_cast<byte*>(
                        mem_heap_alloc(
                                in_heap, rec_offs_size(*offsets)));

                *old_vers = rec_copy(buf, prev_version, *offsets);
                rec_offs_make_valid(*old_vers, index, *offsets);

                if (vrow && *vrow) {
                        *vrow = dtuple_copy(*vrow, in_heap);
                        dtuple_dup_v_fld(*vrow, in_heap);
                }
                break;
        }
Finally, the required columns are taken from that previous version of the primary key record and returned to the MySQL layer.
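The loop in row_vers_build_for_consistent_read amounts to walking the version chain from newest to oldest until a version is visible. The sketch below models the chain as a vector and uses a deliberately reduced view (only the low water mark), so RowVersion, ToyView, and first_visible_version are all hypothetical names introduced for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

using trx_id_t = std::uint64_t;

// One version of a row; in InnoDB older versions are rebuilt from undo logs.
struct RowVersion {
    trx_id_t    trx_id;  // transaction that produced this version
    std::string value;   // column data of this version
};

// Reduced view: a change is visible only when its trx id is below the
// low water mark. The full rule also consults the active-id list.
struct ToyView {
    trx_id_t up_limit_id;
    bool changes_visible(trx_id_t id) const { return id < up_limit_id; }
};

// chain[0] is the current record, chain[i + 1] the version before chain[i].
// Returns the index of the first visible version, or -1 when the chain ends
// first (the row did not exist for this view).
int first_visible_version(const std::vector<RowVersion>& chain,
                          const ToyView& view) {
    for (std::size_t i = 0; i < chain.size(); ++i) {
        if (view.changes_visible(chain[i].trx_id)) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

For a chain written by transactions 30, 20, and 10, a view with low water mark 25 stops at the version written by 20; a view with low water mark 5 reaches the end of the chain and gets no row.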
Using index may also require the lookup back to the clustered index
row_search_mvcc:
        if (!srv_read_only_mode
            && !lock_sec_rec_cons_read_sees(    /* the secondary index record is judged not visible */
                    rec, index, trx->read_view)) {
                /* We should look at the clustered index.
                However, as this is a non-locking read,
                we can skip the clustered index lookup if
                the condition does not match the secondary
                index entry. */
                switch (row_search_idx_cond_check(
                                buf, prebuilt, rec, offsets)) {
                case ICP_NO_MATCH:
                        goto next_rec;
                case ICP_OUT_OF_RANGE:
                        err = DB_RECORD_NOT_FOUND;
                        goto idx_cond_failed;
                case ICP_MATCH:
                        goto requires_clust_rec;        /* this branch enters the clustered index lookup and visibility check */
                }
        }

lock_sec_rec_cons_read_sees:
        trx_id_t max_trx_id = page_get_max_trx_id(page_align(rec));    /* get the page's max trx id */

        ut_ad(max_trx_id > 0);

        return(view->sees(max_trx_id));
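The branch above can be summarized as a function from the page's max trx id and the index condition result to the next step. This is a sketch under an explicit assumption: sees() is modeled as true only when the id is strictly below the view's low water mark. ToyView, IcpResult, SecReadPath, and sec_read_path are hypothetical names.

```cpp
#include <cstdint>

using trx_id_t = std::uint64_t;

// Assumption: the page is "seen" only if every transaction that modified it
// is below the view's low water mark.
struct ToyView {
    trx_id_t up_limit_id;
    bool sees(trx_id_t id) const { return id < up_limit_id; }
};

enum class IcpResult { NoMatch, OutOfRange, Match };

enum class SecReadPath {
    UseSecondaryRecord,  // page is old enough: the index record can be trusted
    NextRecord,          // condition does not match: skip without a lookup
    RecordNotFound,      // condition is out of range: stop the scan
    ClusteredLookup      // must visit the clustered index to decide visibility
};

SecReadPath sec_read_path(trx_id_t page_max_trx_id, const ToyView& view,
                          IcpResult icp, bool read_only_mode) {
    if (!read_only_mode && !view.sees(page_max_trx_id)) {
        switch (icp) {
        case IcpResult::NoMatch:    return SecReadPath::NextRecord;
        case IcpResult::OutOfRange: return SecReadPath::RecordNotFound;
        case IcpResult::Match:      return SecReadPath::ClusteredLookup;
        }
    }
    return SecReadPath::UseSecondaryRecord;
}
```

Because the test is made per page rather than per record, one recent modification anywhere on the page is enough to push every matching record on that page through the clustered index lookup, even when the query is otherwise covered by the index.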
End of article.
Enjoy MySQL :)