MySQL数据库Innodb存储引擎的状态报告时MySQL数据库性能分析和诊断的重要的工具。这个状态报告涉及的内容很多,包含存储引擎的各个方面,涉及的知识点也比较多,理解起来比较困难。本文通过实际案例来说明这个状态报告中比较重要部分的含义及这个部分如何用于性能诊断。
1 环境准备
本文通过sysbench模拟数据库中的业务负载,通过观察数据库在由于负载的情况下innodb状态报告的变化来说明各个部分的含义及如何用于数据库性能诊断。查看状态之前先准备一下数据。
docker run --rm=true--name=sb-prepare 0e71335a2211 sysbench --test=/usr/share/sysbench/oltp_read_only.lua --mysql-host=172.20.11.244 --mysql-port=3306--mysql-db=sysbenchdb --mysql-user="u_sysbench"--mysql-password='123456'--tables=4--table_size=100000--threads=2--time=300--report-interval=3--db-driver=mysql --db-ps-mode=disable --skip-trx=on --mysql-ignore-errors=6002,6004,4012,2013,4016,1062 prepare
这里使用docker运行sysbench,向数据库内装载数据,创建4个表,每个表10万行数据,通过容器的输出可以看到sysbench在创建表、载入数据后,为每个表有创建了二级索引。登录数据库查看索引情况
mysql> show index from sbtest4;+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+|Table| Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed |Null| Index_type | Comment | Index_comment |+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+| sbtest4 |0| PRIMARY |1| id | A |98712|NULL|NULL|| BTREE |||| sbtest4 |1| k_4 |1| k | A |17501|NULL|NULL|| BTREE |||+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+2 rows inset(0.00 sec)
可以看到,表列k上已有名为k_4的二级索引,为了模拟的环境真实一些,drop掉这个二级索引。
mysql>altertable sbtest4 drop key k_4; Query OK,0 rows affected (0.02 sec) Records:0 Duplicates:0 Warnings:0
2 业务加载前的innodb状态报告
Innodb存储引擎的状态报告用show engine innodb status命令获得
mysql> show engine innodb status\G; *************************** 1. row *************************** Type: InnoDB Name: Status: =====================================2022-08-2410:26:21 0x7fd3b8553700 INNODB MONITOR OUTPUT ===================================== Per second averages calculated from the last 37 seconds ----------------- BACKGROUND THREAD ----------------- srv_master_thread loops: 15 srv_active, 0 srv_shutdown, 1119 srv_idle srv_master_thread log flush and writes: 1134---------- SEMAPHORES ---------- OS WAIT ARRAY INFO: reservation count 164 OS WAIT ARRAY INFO: signal count 149 RW-shared spins 0, rounds 68, OS waits 34 RW-excl spins 0, rounds 751, OS waits 12 RW-sx spins 37, rounds 1110, OS waits 37 Spin rounds per wait: 68.00 RW-shared, 751.00 RW-excl, 30.00 RW-sx ------------ TRANSACTIONS ------------ Trx id counter 17916 Purge donefor trx's n:o < 17915 undo n:o < 0 state: running but idle' History list length 0 LIST OF TRANSACTIONS FOR EACH SESSION: ---TRANSACTION422022744110928, not started 0 lock struct(s), heap size 1136, 0 row lock(s) ---------------------- ROW OPERATIONS --------------0 queries inside InnoDB, 0 queries in queue 0 read views open inside InnoDB Process ID=45169, Main thread ID=140547469104896, state: sleeping Number of rows inserted 400000, updated 0, deleted 0, read 80.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s ---------------------------- END OF INNODB MONITOR OUTPUT ============================1 row inset (0.01 sec) ERROR: No query specified
上面的报告中删除了一些不太重要的内容,读这个报告之前,首先要知道这个报告里的值的时间含义时不同的,正如报告开头所说的,报告里每秒的平均值的计算时间时过去37秒,也就是说这个平均值是当前的,几乎时实时的,报告里的其他值则是累加的,是从数据库启动以来的这个值的总数。
报告的第一部分是主线程的状态,如果主线程srv_active loop数量比较多,则数据库的负载比较重。
报告的第二部分是信号量的状态,这个部分十分重要,理解起来也十分困难,如果理解了这个部分,通过信号量的状态对数据库目前的加锁情况就会有一个准确的理解。
Innodb的锁有三种RW-shared,RW-excl,RW-sx,简单来说,会话再获得一个锁时,如果不能获得就会进入自旋状态,在这个状态下,该会话仍然占用当前cpu,经过一段时间后会再次视图获取锁,如果经过一定的次数后仍未获得锁,会话就会进入os wait状态,这时会话会释放CPU,等待操作系统唤醒后再此进入自旋状态。
这个部分的值是累加的,所以需要对比相邻两个时间段的值才有意义。
后面的事务部分和行操作部分后面再做解释。
3 业务加载,查看innodb状态报告
运行sysbench,模拟业务加载。
docker run --name=sb-run 0e71335a2211 sysbench --test=/usr/share/sysbench/oltp_read_only.lua --mysql-host=172.20.11.244 --mysql-port=3306 --mysql-db=sysbenchdb --mysql-user="u_sysbench" --mysql-password=123456 --tables=4 --table_size=100000 --threads=4 --time=2000 --report-interval=10 --db-driver=mysql --db-ps-mode=disable --skip-trx=on --mysql-ignore-errors=6002,6004,4012,2013,4016,1062 run
sysbench运行了4个线程,运行的是只读查询。加载后,系统的负载也发生了变化,这个可以用操作系统的top命令看出。
[root@ ~]# top -p 45169 -b -n 1 -Htop-11:12:51 up 9 days, 1:34, 3 users, load average: 4.29, 4.41, 4.05 Threads: 32 total, 2 running, 30 sleeping, 0 stopped, 0 zombie %Cpu(s): 67.8 us, 23.3 sy, 0.0 ni, 0.0 id, 0.0 wa, 1.0 hi, 8.0 si, 0.0 st MiB Mem : 1816.9 total, 120.1 free, 564.9 used, 1131.9 buff/cache MiB Swap: 0.0 total, 0.0 free, 0.0 used. 1212.0 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 45197 mysql 200116748033314422700 R 19.6 17.9 7:28.07 mysqld 45318 mysql 200116748033314422700 S 19.6 17.9 6:29.69 mysqld 45456 mysql 200116748033314422700 S 19.6 17.9 7:26.48 mysqld 45455 mysql 200116748033314422700 R 19.3 17.9 7:26.47 mysqld 45169 mysql 200116748033314422700 S 0.0 17.9 0:00.26 mysqld
MySQL下面运行了32个线程,cpu利用率达到了67.8%,有四个线程(对应sysbench)的四个线程的cpu利用率在19%左右。
看一下这个时候的引擎状态报告,由于报告比较长,对各部分的解释以注释的形式给出,不再在报告后单独说明。
mysql> show engine innodb status\G;***************************1. row *************************** Type: InnoDB Name: Status:=====================================2022-08-2410:58:510x7fd3b848d700 INNODB MONITOR OUTPUT ===================================== Per second averages calculated from the last 27 seconds ----------------- BACKGROUND THREAD ----------------- srv_master_thread loops:33 srv_active,0 srv_shutdown,3051 srv_idle srv_master_thread log flush and writes:3084----------主线程的活跃loop有所增加,但是增加不大。 SEMAPHORES ---------- OS WAIT ARRAY INFO: reservation count166 OS WAIT ARRAY INFO: signal count151 RW-shared spins 0, rounds 68, OS waits 34 RW-excl spins 0, rounds 751, OS waits 12 RW-sx spins 37, rounds 1110, OS waits 37 Spin rounds per wait:68.00 RW-shared,751.00 RW-excl,30.00 RW-sx ------------这部分数据同前面的几乎报告相同,说明数据库这段时间几乎没有加锁。 TRANSACTIONS ------------ Trx id counter 17916 Purge done for trx's n:o < 17915 undo n:o < 0 state: running but idle History list length 0 LIST OF TRANSACTIONS FOR EACH SESSION: ---TRANSACTION 422022744113664, not started 0 lock struct(s), heap size 1136, 0 row lock(s) ---TRANSACTION 422022744112752, not started 0 lock struct(s), heap size 1136, 0 row lock(s) ---TRANSACTION 422022744111840, not started 0 lock struct(s), heap size 1136, 0 row lock(s) ---TRANSACTION 422022744110928, not started 0 lock struct(s), heap size 1136, 0 row lock(s) ---TRANSACTION 422022744114576, not started 0 lock struct(s), heap size 1136, 0 row lock(s) --------数据库里有了活跃的事务,这个事务都没有枷锁 FILE I/O -------- I/O thread 0 state: waiting for completed aio requests (insert buffer thread) I/O thread 1 state: waiting for completed aio requests (log thread) I/O thread 2 state: waiting for completed aio requests (read thread) I/O thread 3 state: waiting for completed aio requests (read thread) I/O thread 4 state: waiting for completed aio requests (read thread) I/O thread 5 state: waiting for completed aio requests (read thread) I/O thread 6 state: waiting for completed aio requests (write thread) I/O thread 7 state: waiting for completed aio requests (write thread) I/O thread 8 state: waiting for completed aio requests (write thread) I/O thread 9 state: waiting for completed aio requests (write thread) Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] , ibuf aio reads:, log i/o's:, sync i/o's: Pending flushes (fsync) log: 0; buffer pool: 0 2108 OS file reads, 7329 OS file writes, 1092 OS fsyncs 0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s ------IO进程也不活跃,这段时间的平均值为0 ------------------------------------- INSERT BUFFER AND ADAPTIVE HASH INDEX ------------------------------------- Ibuf: size 1, free list len 0, seg size 2, 0 merges merged operations: insert 0, delete mark 0, delete 0 discarded operations: insert 0, delete mark 0, delete 0 Hash table size 34673, node heap has 0 buffer(s) Hash table size 34673, node heap has 0 buffer(s) Hash table size 34673, node heap has 175 buffer(s) Hash table size 34673, node heap has 0 buffer(s) Hash table size 34673, node heap has 173 buffer(s) Hash table size 34673, node heap has 0 buffer(s) Hash table size 34673, node heap has 0 buffer(s) Hash table size 34673, node heap has 0 buffer(s) 6851.86 hash searches/s, 3947.78 non-hash searches/s ---哈希表的buffer值有所增加,这个时间间隔内发生了6851次hash查找, ---innodb运用了自适应哈希技术对sql语句进行了优化, LOG --- Log sequence number 10116339108 Log flushed up to 10116339108 Pages flushed up to 10116339108 Last checkpoint at 10116339099 0 pending log flushes, 0 pending chkp writes 293 log i/o's done,0.00 log i/o's/second ----------------------由于是只读测试,没有log产生 BUFFER POOL AND MEMORY ---------------------- Total large memory allocated 137428992 Dictionary memory allocated 331505 Buffer pool size 8191 Free buffers 1022 Database pages 6821 Old database pages 2497 Modified db pages 0 Pending reads 0 Pending writes: LRU 0, flush list 0, single page 0 Pages made young 2120, not young 503 0.00 youngs/s, 0.00 non-youngs/s Pages read 2073, created 5907, written 6319 0.00 reads/s, 0.00 creates/s, 0.00 writes/s Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000 Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s LRU len: 6821, unzip_LRU len: 0 I/O sum[0]:cur[0], unzip sum[0]:cur[0] --缓冲池计划没什么变化 -------------- ROW OPERATIONS -------------- 0 queries inside InnoDB, 0 queries in queue 0 read views open inside InnoDB Process ID=45169, Main thread ID=140547469104896, state: sleeping Number of rows inserted 400083, updated 0, deleted 0, read 310205858 0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 203224.58 reads/s ----------------------------这段时间内读次数较多,达到 203224.58每秒。 END OF INNODB MONITOR OUTPUT ============================ 1 row in set (0.00 sec) ERROR: No query specified
4 行锁时的innodb状态报告
开两个会话,模拟数据库发生行锁
mysql>update country set country='Afuhan'where country_id=1;Query OK,1 row affected (0.00 sec)Rows matched:1 Changed:1 Warnings:0--用户会话2mysql>set autocommit=0;mysql> show variables like'autocommit%';+---------------+-------+| Variable_name | Value |+---------------+-------+| autocommit | OFF |+---------------+-------+1 row inset(0.00 sec)mysql>update country set country='afuhan'where country_id=1;ERROR 1205(HY000): Lock wait timeout exceeded; try restarting transaction
可以看到,MySQL的行锁有超时设置,超时后会尝试重启事务。数据库产生行锁时,事务部分的内容变化最大,这里只截取这部分内容。
TRANSACTIONS ------------Trx id counter 17943Purge done for trx's n:o < 17941 undo n:o < 0 state: running but idleHistory list length 1LIST OF TRANSACTIONS FOR EACH SESSION:--会话事务列表,列出了所有会话---TRANSACTION 422022744116400, not started0 lock struct(s), heap size 1136, 0 row lock(s)---TRANSACTION 422022744114576, not started0 lock struct(s), heap size 1136, 0 row lock(s)---TRANSACTION 422022744113664, not started0 lock struct(s), heap size 1136, 0 row lock(s)---TRANSACTION 422022744112752, not started0 lock struct(s), heap size 1136, 0 row lock(s)---TRANSACTION 422022744111840, not started0 lock struct(s), heap size 1136, 0 row lock(s)---TRANSACTION 17942, ACTIVE 10 sec starting index read--这个事务持有锁,使用并锁定了一个表mysql tables in use 1, locked 1LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)--事务的线程id,查询id,执行的语句。MySQL thread id 22, OS thread handle 140547306608384, query id 20019936 localhost root updatingupdate country set country='afuhan' where country_id=1---事务等待的锁------- TRX HAS BEEN WAITING 10 SEC FOR THIS LOCK TO BE GRANTED:RECORD LOCKS space id 27 page no 3 n bits 176 index PRIMARY of table `sakila`.`country` trx id 17942 lock_mode X locks rec but not gap waitingRecord lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 2; hex 0001; asc ;; 1: len 6; hex 000000004612; asc F ;; 2: len 7; hex 74000001610237; asc t a 7;; 3: len 6; hex 41667568616e; asc Afuhan;; 4: len 4; hex 6305beb7; asc c ;;---------------------TRANSACTION 17941, ACTIVE 17 sec--这个事务持有一个行锁2 lock struct(s), heap size 1136, 1 row lock(s)MySQL thread id 27, OS thread handle 140547306338048, query id 19970640 localhost root
从上面截取的片段可以看出,innodb引擎的状态报告里会列出不阻塞事务的详细信息,诸如执行的语句,等待的锁,对应的线程等,从这些信息,基本可以判断出行锁的来源。