前言
在MySQL服务器断电或者kill mysqld进程后,再操作数据库会发生莫名其妙的重启,我们下面就根据一个案例,来逐步的进行分析和诊断,解决这种常见的问题。
故障现象
从一张表中查询数据时报如下错误
随后整理表时也报同样错误:
mysql> optimize table tablex ;
ERROR 2013 (HY000): Lost connection to MySQL server during query
这意味着无法直接对表进行修复工作。
如果加上limit从表中直接查询数据(select * from tablex limit 10000,1;),limit的值增长至198900时,重现上述错误。
故障初步分析
1.我们首先来看下当时的错误日志:
点击(此处)折叠或打开
- The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
- information that should help you find out what is causing the crash.
- 151203 12:31:23 mysqld_safe Number of processes running now: 0
- 151203 12:31:23 mysqld_safe mysqld restarted
- 2015-12-03 12:31:24 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
- 2015-12-03 12:31:24 0 [Note] /usr/local/mysql/bin/mysqld (mysqld 5.6.26-log) starting as process 2207 ...
- 2015-12-03 12:31:24 2207 [Note] Plugin 'FEDERATED' is disabled.
- 2015-12-03 12:31:24 7ff8243a6720 InnoDB: Warning: Using innodb_additional_mem_pool_size is DEPRECATED. This option may be removed in future releases, together with the option innodb_use_sys_malloc and with the InnoDB's internal memory allocator.
- 2015-12-03 12:31:24 2207 [Note] InnoDB: Using atomics to ref count buffer pool pages
- 2015-12-03 12:31:24 2207 [Note] InnoDB: The InnoDB memory heap is disabled
- 2015-12-03 12:31:24 2207 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
- 2015-12-03 12:31:24 2207 [Note] InnoDB: Memory barrier is not used
- 2015-12-03 12:31:24 2207 [Note] InnoDB: Compressed tables use zlib 1.2.3
- 2015-12-03 12:31:24 2207 [Note] InnoDB: Using Linux native AIO
- 2015-12-03 12:31:24 2207 [Note] InnoDB: Using CPU crc32 instructions
- 2015-12-03 12:31:24 2207 [Note] InnoDB: Initializing buffer pool, size = 2.5G
- 2015-12-03 12:31:24 2207 [Note] InnoDB: Completed initialization of buffer pool
- 2015-12-03 12:31:24 2207 [Note] InnoDB: Highest supported file format is Barracuda.
- 2015-12-03 12:31:24 2207 [Note] InnoDB: The log sequence numbers 20043317101 and 20043317101 in ibdata files do not match the log sequence number 20540489747 in the ib_logfiles!
- 2015-12-03 12:31:24 2207 [Note] InnoDB: Database was not shutdown normally!
- 2015-12-03 12:31:24 2207 [Note] InnoDB: Starting crash recovery.
- 2015-12-03 12:31:24 2207 [Note] InnoDB: Reading tablespace information from the .ibd files...
- 2015-12-03 12:31:24 2207 [Note] InnoDB: Restoring possible half-written data pages
- 2015-12-03 12:31:24 2207 [Note] InnoDB: from the doublewrite buffer...
- InnoDB: Last MySQL binlog file position 0 47771, file name mysql-bin.000036
- 2015-12-03 12:31:24 2207 [Note] InnoDB: 128 rollback segment(s) are active.
- 2015-12-03 12:31:24 2207 [Note] InnoDB: Waiting for purge to start
- 2015-12-03 12:31:24 2207 [Note] InnoDB: 5.6.26 started; log sequence number 20540489747
- 2015-12-03 12:31:24 2207 [Note] Recovering after a crash using /data/mysql3306/logs/mysql-bin
- 2015-12-03 12:31:24 2207 [Note] Starting crash recovery...
- 2015-12-03 12:31:24 2207 [Note] Crash recovery finished.
- 2015-12-03 12:31:25 2207 [Note] Server hostname (bind-address): '*'; port: 3306
- 2015-12-03 12:31:25 2207 [Note] IPv6 is available.
- 2015-12-03 12:31:25 2207 [Note] - '::' resolves to '::';
- 2015-12-03 12:31:25 2207 [Note] Server socket created on IP: '::'.
- 2015-12-03 12:31:25 2207 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
- 2015-12-03 12:31:25 2207 [Note] Event Scheduler: Loaded 0 events
- 2015-12-03 12:31:25 2207 [Note] /usr/local/mysql/bin/mysqld: ready for connections.
- Version: '5.6.26-log' socket: '/data/mysql3306/mysql.sock' port: 3306 MySQL Community Server (GPL)
- 2015-12-03 12:31:25 2207 [Note] Slave I/O thread: connected to master 'replication@192.168.30.218:3306',replication started in log 'mysql-bin.000020' at position 47355
- 2015-12-03 12:31:25 2207 [Warning] Slave SQL: If a crash happens this configuration does not guarantee that the relay log info will be consistent, Error_code: 0
- 2015-12-03 12:31:25 2207 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000020' at position 47355, relay log './mysqld-relay.000103' position: 283
- 2015-12-03 14:13:15 7ff723baa700 InnoDB: Assertion failure in thread 140699433084672 in file btr0pcur.cc line 430
- InnoDB: Failing assertion: page_is_comp(next_page) == page_is_comp(page)
- InnoDB: We intentionally generate a memory trap.
- InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
- InnoDB: If you get repeated assertion failures or crashes, even
- InnoDB: immediately after the mysqld startup, there may be
- InnoDB: corruption in the InnoDB tablespace. Please refer to
- InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
- InnoDB: about forcing recovery.
- 06:13:15 UTC - mysqld got signal 6 ;
- This could be because you hit a bug. It is also possible that this binary
- or one of the libraries it was linked against is corrupt, improperly built,
- or misconfigured. This error can also be caused by malfunctioning hardware.
- We will try our best to scrape up some info that will hopefully help
- diagnose the problem, but since we have already crashed,
- something is definitely wrong and this may fail.
- key_buffer_size=8388608
- read_buffer_size=131072
- max_used_connections=51
- max_threads=5000
- thread_count=52
- connection_count=50
- It is possible that mysqld could use up to
- key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1992176 K bytes of memory
- Hope that's ok; if not, decrease some variables in the equation.
- Thread pointer: 0x65d2370
- Attempting backtrace. You can use the following information to find out
- where mysqld died. If you see no messages after this, something went
- terribly wrong...
- stack_bottom = 7ff723ba9e10 thread_stack 0x40000
- /usr/local/mysql/bin/mysqld(my_print_stacktrace+0x35)[0x90d8e5]
- /usr/local/mysql/bin/mysqld(handle_fatal_signal+0x3d8)[0x674078]
- /lib64/libpthread.so.0(+0xf790)[0x7ff823f89790]
- /lib64/libc.so.6(gsignal+0x35)[0x7ff822c34625]
- /lib64/libc.so.6(abort+0x175)[0x7ff822c35e05]
- /usr/local/mysql/bin/mysqld[0xa29954]
- /usr/local/mysql/bin/mysqld[0x9cfd85]
- /usr/local/mysql/bin/mysqld[0x931341]
- /usr/local/mysql/bin/mysqld(_ZN7handler11ha_rnd_nextEPh+0x64)[0x58c2d4]
- /usr/local/mysql/bin/mysqld(_Z13rr_sequentialP11READ_RECORD+0x37)[0x83ebc7]
- /usr/local/mysql/bin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x181)[0x6d2691]
- /usr/local/mysql/bin/mysqld(_ZN4JOIN4execEv+0x391)[0x6d04b1]
- /usr/local/mysql/bin/mysqld[0x717329]
- /usr/local/mysql/bin/mysqld(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_P10SQL_I_ListI8st_orderESB_S7_yP13select_resultP18st_select_lex_unitP13st_select_lex+0xbc)[0x717dec]
- /usr/local/mysql/bin/mysqld(_Z13handle_selectP3THDP13select_resultm+0x175)[0x717ff5]
- /usr/local/mysql/bin/mysqld[0x6f26e9]
- /usr/local/mysql/bin/mysqld(_Z21mysql_execute_commandP3THD+0x3574)[0x6f6f14]
- /usr/local/mysql/bin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x338)[0x6faa48]
- /usr/local/mysql/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0xc68)[0x6fbe28]
- /usr/local/mysql/bin/mysqld(_Z10do_commandP3THD+0xd7)[0x6fdc37]
- /usr/local/mysql/bin/mysqld(_Z24do_handle_one_connectionP3THD+0x116)[0x6c4bb6]
- /usr/local/mysql/bin/mysqld(handle_one_connection+0x45)[0x6c4c95]
- /usr/local/mysql/bin/mysqld(pfs_spawn_thread+0x126)[0xaf92f6]
- /lib64/libpthread.so.0(+0x7a51)[0x7ff823f81a51]
- /lib64/libc.so.6(clone+0x6d)[0x7ff822cea93d]
- Trying to get some variables.
- Some pointers may be invalid and cause the dump to abort.
- Query (7ff6d8004df0): is an invalid pointer
- Connection ID (thread ID): 7
- Status: NOT_KILLED
- The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
- information that should help you find out what is causing the crash.
- 151203 14:13:15 mysqld_safe Number of processes running now: 0
- 151203 14:13:15 mysqld_safe mysqld restarted
- 2015-12-03 14:13:16 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
- 2015-12-03 14:13:16 0 [Note] /usr/local/mysql/bin/mysqld (mysqld 5.6.26-log) starting as process 2481 ...
- 2015-12-03 14:13:16 2481 [Note] Plugin 'FEDERATED' is disabled.
- 2015-12-03 14:13:16 7fec16233720 InnoDB: Warning: Using innodb_additional_mem_pool_size is DEPRECATED. This option may be removed in future releases, together with the option innodb_use_sys_malloc and with the InnoDB's internal memory allocator.
- 2015-12-03 14:13:16 2481 [Note] InnoDB: Using atomics to ref count buffer pool pages
- 2015-12-03 14:13:16 2481 [Note] InnoDB: The InnoDB memory heap is disabled
- 2015-12-03 14:13:16 2481 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
- 2015-12-03 14:13:16 2481 [Note] InnoDB: Memory barrier is not used
- 2015-12-03 14:13:16 2481 [Note] InnoDB: Compressed tables use zlib 1.2.3
- 2015-12-03 14:13:16 2481 [Note] InnoDB: Using Linux native AIO
- 2015-12-03 14:13:16 2481 [Note] InnoDB: Using CPU crc32 instructions
- 2015-12-03 14:13:16 2481 [Note] InnoDB: Initializing buffer pool, size = 2.5G
- 2015-12-03 14:13:17 2481 [Note] InnoDB: Completed initialization of buffer pool
- 2015-12-03 14:13:17 2481 [Note] InnoDB: Highest supported file format is Barracuda.
- 2015-12-03 14:13:17 2481 [Note] InnoDB: The log sequence numbers 20043317101 and 20043317101 in ibdata files do not match the log sequence number 20612097468 in the ib_logfiles!
- 2015-12-03 14:13:17 2481 [Note] InnoDB: Database was not shutdown normally!
- 2015-12-03 14:13:17 2481 [Note] InnoDB: Starting crash recovery.
- 2015-12-03 14:13:17 2481 [Note] InnoDB: Reading tablespace information from the .ibd files...
- 2015-12-03 14:13:17 2481 [Note] InnoDB: Restoring possible half-written data pages
- 2015-12-03 14:13:17 2481 [Note] InnoDB: from the doublewrite buffer...
- InnoDB: Last MySQL binlog file position 0 366526, file name mysql-bin.000037
- 2015-12-03 14:13:17 2481 [Note] InnoDB: 128 rollback segment(s) are active.
- 2015-12-03 14:13:17 2481 [Note] InnoDB: Waiting for purge to start
- 2015-12-03 14:13:17 2481 [Note] InnoDB: 5.6.26 started; log sequence number 20612097468
- 2015-12-03 14:13:17 2481 [Note] Recovering after a crash using /data/mysql3306/logs/mysql-bin
- 2015-12-03 14:13:17 2481 [Note] Starting crash recovery...
- 2015-12-03 14:13:17 2481 [Note] Crash recovery finished.
- 2015-12-03 14:13:17 2481 [Note] Server hostname (bind-address): '*'; port: 3306
- 2015-12-03 14:13:17 2481 [Note] IPv6 is available.
- 2015-12-03 14:13:17 2481 [Note] - '::' resolves to '::';
- 2015-12-03 14:13:17 2481 [Note] Server socket created on IP: '::'.
- 2015-12-03 14:13:17 2481 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
- 2015-12-03 14:13:17 2481 [Note] Event Scheduler: Loaded 0 events
- 2015-12-03 14:13:17 2481 [Note] /usr/local/mysql/bin/mysqld: ready for connections.
- Version: '5.6.26-log' socket: '/data/mysql3306/mysql.sock' port: 3306 MySQL Community Server (GPL)
- 2015-12-03 14:13:17 2481 [Note] Slave I/O thread: connected to master 'replication@192.168.30.218:3306',replication started in log 'mysql-bin.000020' at position 409891
- 2015-12-03 14:13:17 2481 [Warning] Slave SQL: If a crash happens this configuration does not guarantee that the relay log info will be consistent, Error_code: 0
- 2015-12-03 14:13:17 2481 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000020' at position 409891, relay log './mysqld-relay.000105' position: 283
主要注意其中几行:
InnoDB: Assertion failure in thread 140699433084672 in file btr0pcur.cc line 430 InnoDB: Failing assertion: page_is_comp(next_page) == page_is_comp(page) InnoDB: We intentionally generate a memory trap.
我们只需要简单的了解一些源码相关的知识,就可以判断此类的问题。
相关知识
MySQL源码目录结构
源码目录结构请参考彭立勋的PPThttp://ofslides.com/plinux-6880/presentation-36798
故障深入分析
大致了解了mysql源码的目录结构,我们再来看报错的那个文件:btr0pcur.cc。
注释显示这是个和索引树的游标相关的文件。
430行所处的函数是btr_pcur_move_to_next_page()。这个函数的功能是: 移动持续的游标到下一条记录。
这时我们就不难理解为什么服务器掉电后,读取数据会发生崩溃。就是因为游标读取索引时遇到了损坏的数据块。
故障排除
确定了问题所在之后,我们就可以开始修复了。
1.在my.cnf中[mysqld]下添加innodb_force_recovery=1,然后重启mysql服务。
2.尝试使用mysqldump备份所有数据。如果备份失败,逐个增加innodb_force_recovery的值(最大为6)后再尝试备份。
3.将备份的数据在别处还原,并对还原的数据进行校验。
4.原mysql实例建议重建数据目录。因为服务器掉电不仅会损坏ibd文件,共享表空间ibdata和日志也可能有损坏。
注:
也可以使用创建MyISAM表的方式导出损坏的表的数据。但mysqldump是首选。
CREATE TABLE new_table ... ENGINE=MYISAM;
INSERT (IGNORE) INTO new_table SELECT * FROM old_table;
故障总结
1.要定期对数据进行备份
2.缓存的电池也要做好保养
3.不要轻易对mysqld进程做kill操作