关于MYSQL 5.6 super_read_only和Event Scheduler冲突导致启动失败

本文涉及的产品
云数据库 RDS MySQL,集群系列 2核4GB
推荐场景:
搭建个人博客
RDS MySQL Serverless 基础系列,0.5-2RCU 50GB
云数据库 RDS PostgreSQL,集群系列 2核4GB
简介: 关于MYSQL 5.6 super_read_only和Event Scheduler冲突导致启动失败 版本percona-server-5.6.22-72.0,这个版本肯定有这个问题 这个问题出现在线上我们将库设置为super_read_only后启动报错,整个...
关于MYSQL 5.6 super_read_only和Event Scheduler冲突导致启动失败

版本percona-server-5.6.22-72.0,这个版本肯定有这个问题
这个问题出现在线上我们将库设置为super_read_only后启动报错,整个MYSQLD CRASH掉
2017-04-23 01:15:46 22577 [ERROR] Event Scheduler: Failed to open table mysql.event
2017-04-23 01:15:46 22577 [ERROR] Event Scheduler: Error while loading from disk.
2017-04-23 01:15:46 22577 [Note] Event Scheduler: Purging the queue. 0 events
2017-04-23 01:15:46 22577 [ERROR] Aborting
在5.6.25以及其他5.7的版本上并没有这个问题。网上说可能是BUG但是没有说出具体原因,也有
帖子说和super_read_only 有关,确实关闭super_read_only就不会再出问题,
但是为了找到问题,我想挖一挖代码如下:
首先要找到报错源头,首先找到Event Scheduler: Error while loading from disk.的位置如下:
这段代码出现在
events.cc的Events::init(my_bool opt_noacl_or_bootstrap) 方法下
 if (event_queue->init_queue(thd) || load_events_from_db(thd) ||
      (opt_event_scheduler == EVENTS_ON && scheduler->start(&err_no)))
  {
    sql_print_error("Event Scheduler: Error while loading from disk."); --这里
    res= TRUE; /* fatal error: request unireg_abort */
    goto end;
  }
这里3个条件都可能引起这个错误:
1、初始化队列失败,函数接口event_queue->init_queue
2、加载event数据失败,函数接口load_events_from_db
3、参数event_scheduler设置是否为ON,并且scheduler->start调用失败


而错误 Event Scheduler: Failed to open table mysql.event 正是load_events_from_db(thd)这个
方法报出来的,查看其部分代码
/*
    NOTE: even if we run in read-only mode, we should be able to lock the
    mysql.event table for writing. In order to achieve this, we should call
    mysql_lock_tables() under the super user.


    Same goes for transaction access mode.
    Temporarily reset it to read-write.
  */--这里源码也有不算清楚的解释


  saved_master_access= thd->security_ctx->master_access;
  thd->security_ctx->master_access |= SUPER_ACL;
  bool save_tx_read_only= thd->tx_read_only;
  thd->tx_read_only= false;


  ret= db_repository->open_event_table(thd, TL_WRITE, &table); --这里的返回值进行判断


  thd->tx_read_only= save_tx_read_only;
  thd->security_ctx->master_access= saved_master_access;


  if (ret)
  {
    sql_print_error("Event Scheduler: Failed to open table mysql.event"); ---这里
    DBUG_RETURN(TRUE);
  }
我们可以看到他的是通过调用db_repository->open_event_table(thd, TL_WRITE, &table)来
接收其返回值如果为true则报错。接下来看open_event_table
bool
Event_db_repository::open_event_table(THD *thd, enum thr_lock_type lock_type,
                                      TABLE **table)
{
  TABLE_LIST tables;
  DBUG_ENTER("Event_db_repository::open_event_table");


  tables.init_one_table("mysql", 5, "event", 5, "event", lock_type);


  if (open_and_lock_tables(thd, &tables, FALSE, MYSQL_LOCK_IGNORE_TIMEOUT))
    DBUG_RETURN(TRUE);


  *table= tables.table;
  tables.table->use_all_columns();


  if (table_intact.check(*table, &event_table_def))
  {
    close_thread_tables(thd);
    my_error(ER_EVENT_OPEN_TABLE_FAILED, MYF(0));
    DBUG_RETURN(TRUE);
  }


  DBUG_RETURN(FALSE);
}
大概这个函数会判断
1、event表是否可以lock
2、event表是否损坏


最终会调用lock_tables-->mysql_lock_tables,如果mysql_lock_locks返回一个
NULL指针则报错如下:


    DEBUG_SYNC(thd, "before_lock_tables_takes_lock");


    if (! (thd->lock= mysql_lock_tables(thd, start, (uint) (ptr - start), flags))) --如果mysql_lock_tables返回一个NULL给thd->lock,产生异常
      DBUG_RETURN(TRUE);


    DEBUG_SYNC(thd, "after_lock_tables_takes_lock");


接下来mysql_lock_tables 又调用lock_tables_check函数进行table
lock的检查,如果lock_tables_check函数返回一个大于0的值则异常
那么调用流程清晰了
Events::init-->Events::load_events_from_db-->open_event_table
-->open_and_lock_tables -->lock_tables-->mysql_lock_tables-->
lock_tables_check


最终我们分析掉这个版本的BUG由于lock_tables_check函数检查返回异常
而导致这个错误。
函数调用栈如下:
#0  mysql_lock_tables (thd=0x1c0b5e0, tables=0x1b62ca0, count=1, flags=2048) at /home/percona-server-5.6.22-72.0/sql/lock.cc:296
#1  0x00000000007910c9 in lock_tables (thd=0x1c0b5e0, tables=0x7fffffffdae0, count=1, flags=2048) at /home/percona-server-5.6.22-72.0/sql/sql_base.cc:6125
#2  0x000000000079086f in open_and_lock_tables (thd=0x1c0b5e0, tables=0x7fffffffdae0, derived=false, flags=2048, prelocking_strategy=0x7fffffffda90)
    at /home/percona-server-5.6.22-72.0/sql/sql_base.cc:5889
#3  0x0000000000781ed2 in open_and_lock_tables (thd=0x1c0b5e0, tables=0x7fffffffdae0, derived=false, flags=2048)
    at /home/percona-server-5.6.22-72.0/sql/sql_base.h:477
#4  0x0000000000a26d20 in Event_db_repository::check_system_tables (thd=0x1c0b5e0) at /home/percona-server-5.6.22-72.0/sql/event_db_repository.cc:1202
#5  0x00000000008ff5fb in Events::init (opt_noacl_or_bootstrap=0 '\000') at /home/percona-server-5.6.22-72.0/sql/events.cc:858
#6  0x000000000063e21d in mysqld_main (argc=83, argv=0x18f4c58) at /home/percona-server-5.6.22-72.0/sql/mysqld.cc:5784
#7  0x0000000000632634 in main (argc=11, argv=0x7fffffffe398) at /home/percona-server-5.6.22-72.0/sql/main.cc:25

跟踪lock_tables_check函数发现如下


198         if (!(flags & MYSQL_LOCK_IGNORE_GLOBAL_READ_ONLY) && !t->s->tmp_table)
(gdb) n
200           if (t->reginfo.lock_type >= TL_WRITE_ALLOW_WRITE &&
(gdb) n
204                      opt_super_readonly ? "--read-only (super)" : "--read-only");
(gdb) n
205             DBUG_RETURN(1); 
上面由于这段代码:
200           if (t->reginfo.lock_type >= TL_WRITE_ALLOW_WRITE &&
201               enforce_ro && opt_readonly && !thd->slave_thread)
202           {
203             my_error(ER_OPTION_PREVENTS_STATEMENT, MYF(0),
204                      opt_super_readonly ? "--read-only (super)" : "--read-only");
205             DBUG_RETURN(1);
206           }

if (t->reginfo.lock_type >= TL_WRITE_ALLOW_WRITE && enforce_ro && opt_readonly && !thd->slave_thread)
条件满足返回了 DBUG_RETURN(1); 造成整个报错
而这里
enforce_ro 起到了关键作用,也是问题的根源,相信和这里有关
  if (!opt_super_readonly)
    enforce_ro= !(thd->security_ctx->master_access & SUPER_ACL);

(gdb) p  t->alias
$12 = 0x1c5b4d0 "event"
(gdb) list
200           if (t->reginfo.lock_type >= TL_WRITE_ALLOW_WRITE &&
201               enforce_ro && opt_readonly && !thd->slave_thread)
202           {
203             my_error(ER_OPTION_PREVENTS_STATEMENT, MYF(0),
204                      opt_super_readonly ? "--read-only (super)" : "--read-only");
205             DBUG_RETURN(1);
206           }
207         }
208       }
209
(gdb) p enforce_ro
$13 = true
(gdb) p  t->alias
$15 = 0x1c5b4d0 "event"
(gdb) p thd->security_ctx->master_access
$16 = 32768

我们研究下 
lock_tables_check(THD *thd, TABLE **tables, uint count, uint flags)
{
  uint system_count= 0, i= 0;
  bool enforce_ro= true;
  /*
    Identifies if the executed sql command can updated either a log
    or rpl info table.
  */
  bool log_table_write_query= false;

  DBUG_ENTER("lock_tables_check");

  if (!opt_super_readonly)
    enforce_ro= !(thd->security_ctx->master_access & SUPER_ACL);
 后面省略

这句问题的根源,默认enforce_ro= true; 
如果opt_super_readonly开启则
enforce_ro= !(thd->security_ctx->master_access & SUPER_ACL);不运行
那么 enforce_ro= true

如果opt_super_readonly不开启则
enforce_ro= !(thd->security_ctx->master_access & SUPER_ACL);运行
SUPER_ACL是一个宏#define SUPER_ACL (1L << 15)
当做4字节INT的话,及
1000 0000 0000 0000
而thd->security_ctx->master_access是32768及
1000 0000 0000 0000
显然他们按位与得到是1 及ture,然后!true
所以enforce_ro= false;
如果为flase则
200           if (t->reginfo.lock_type >= TL_WRITE_ALLOW_WRITE &&
201               enforce_ro && opt_readonly && !thd->slave_thread)
202           {
203             my_error(ER_OPTION_PREVENTS_STATEMENT, MYF(0),
204                      opt_super_readonly ? "--read-only (super)" : "--read-only");
205             DBUG_RETURN(1);
206           }
不会执行,则不会再次报错

所以解决这个问题或者说BUG,就是设置如果opt_super_readonly不开启,
就是不要设置super_read_only权限。

在5.7.14版本中,我粗略查看 lock_tables_check代码,有改动。5.7.14没遇到这个 问题
  1. if (t->reginfo.lock_type >= TL_WRITE_ALLOW_WRITE &&
  2.         check_readonly(thd, true))
  3.           DBUG_RETURN(1);

可以看到  enforce_ro已经不再作为判断的标准


而5.6.22这个版本确实有这个问题,但是这个问题不是每个版本都有。如果遇到可以参考。

附带5.6.22,5.7.14代码:
5.6.22

点击(此处)折叠或打开

  1. static int
  2. lock_tables_check(THD *thd, TABLE **tables, uint count, uint flags)
  3. {
  4.   uint system_count= 0, i= 0;
  5.   bool enforce_ro= true;
  6.   /*
  7.     Identifies if the executed sql command can updated either a log
  8.     or rpl info table.
  9.   */
  10.   bool log_table_write_query= false;

  11.   DBUG_ENTER("lock_tables_check");

  12.   if (!opt_super_readonly)
  13.     enforce_ro= !(thd->security_ctx->master_access & SUPER_ACL);
  14.   log_table_write_query=
  15.      is_log_table_write_query(thd->lex->sql_command);

  16.   for (i=0 ; i<count; i++)
  17.   {
  18.     TABLE *t= tables[i];

  19.     /* Protect against 'fake' partially initialized TABLE_SHARE */
  20.     DBUG_ASSERT(t->s->table_category != TABLE_UNKNOWN_CATEGORY);

  21.     /*
  22.       Table I/O to performance schema tables is performed
  23.       only internally by the server implementation.
  24.       When a user is requesting a lock, the following
  25.       constraints are enforced:
  26.     */
  27.     if (t->s->table_category == TABLE_CATEGORY_LOG &&
  28.         (flags & MYSQL_LOCK_LOG_TABLE) == 0 &&
  29.         !log_table_write_query)
  30.     {
  31.       /*
  32.         A user should not be able to prevent writes,
  33.         or hold any type of lock in a session,
  34.         since this would be a DOS attack.
  35.       */
  36.       if (t->reginfo.lock_type >= TL_READ_NO_INSERT ||
  37.           thd->lex->sql_command == SQLCOM_LOCK_TABLES)
  38.       {
  39.         my_error(ER_CANT_LOCK_LOG_TABLE, MYF(0));
  40.         DBUG_RETURN(1);
  41.       }
  42.     }

  43.     if (t->reginfo.lock_type >= TL_WRITE_ALLOW_WRITE)
  44.     {
  45.       if (t->s->table_category == TABLE_CATEGORY_SYSTEM)
  46.         system_count++;

  47.       if (t->db_stat & HA_READ_ONLY)
  48.       {
  49.         my_error(ER_OPEN_AS_READONLY, MYF(0), t->alias);
  50.         DBUG_RETURN(1);
  51.       }
  52.     }

  53.     /*
  54.       If we are going to lock a non-temporary table we must own metadata
  55.       lock of appropriate type on it (I.e. for table to be locked for
  56.       write we must own metadata lock of MDL_SHARED_WRITE or stronger
  57.       type. For table to be locked for read we must own metadata lock
  58.       of MDL_SHARED_READ or stronger type).
  59.       The only exception are HANDLER statements which are allowed to
  60.       lock table for read while having only MDL_SHARED lock on it.
  61.     */
  62.     DBUG_ASSERT(t->s->tmp_table ||
  63.                 thd->mdl_context.is_lock_owner(MDL_key::TABLE,
  64.                                  t->s->db.str, t->s->table_name.str,
  65.                                  t->reginfo.lock_type >= TL_WRITE_ALLOW_WRITE ?
  66.                                  MDL_SHARED_WRITE : MDL_SHARED_READ) ||
  67.                 (t->open_by_handler &&
  68.                  thd->mdl_context.is_lock_owner(MDL_key::TABLE,
  69.                                   t->s->db.str, t->s->table_name.str,
  70.                                   MDL_SHARED)));

  71.     /*
  72.       Prevent modifications to base tables if READ_ONLY is activated.
  73.       In any case, read only does not apply to temporary tables.
  74.     */
  75.     if (!(flags & MYSQL_LOCK_IGNORE_GLOBAL_READ_ONLY) && !t->s->tmp_table)
  76.     {
  77.       if (t->reginfo.lock_type >= TL_WRITE_ALLOW_WRITE &&
  78.           enforce_ro && opt_readonly && !thd->slave_thread)
  79.       {
  80.         my_error(ER_OPTION_PREVENTS_STATEMENT, MYF(0),
  81.                  opt_super_readonly ? "--read-only (super)" : "--read-only");
  82.         DBUG_RETURN(1);
  83.       }
  84.     }
  85.   }

  86.   /*
  87.     Locking of system tables is restricted:
  88.     locking a mix of system and non-system tables in the same lock
  89.     is prohibited, to prevent contention.
  90.   */
  91.   if ((system_count > 0) && (system_count < count))
  92.   {
  93.     my_error(ER_WRONG_LOCK_OF_SYSTEM_TABLE, MYF(0));
  94.     DBUG_RETURN(1);
  95.   }

  96.   DBUG_RETURN(0);
  97. }




5.7.14

点击(此处)折叠或打开

  1. static int
  2. lock_tables_check(THD *thd, TABLE **tables, size_t count, uint flags)
  3. {
  4.   uint system_count= 0, i= 0;
  5.   /*
  6.     Identifies if the executed sql command can updated either a log
  7.     or rpl info table.
  8.   */
  9.   bool log_table_write_query= false;

  10.   DBUG_ENTER("lock_tables_check");

  11.   log_table_write_query=
  12.      is_log_table_write_query(thd->lex->sql_command);

  13.   for (i=0 ; i<count; i++)
  14.   {
  15.     TABLE *t= tables[i];

  16.     /* Protect against 'fake' partially initialized TABLE_SHARE */
  17.     DBUG_ASSERT(t->s->table_category != TABLE_UNKNOWN_CATEGORY);

  18.     /*
  19.       Table I/O to performance schema tables is performed
  20.       only internally by the server implementation.
  21.       When a user is requesting a lock, the following
  22.       constraints are enforced:
  23.     */
  24.     if (t->s->table_category == TABLE_CATEGORY_LOG &&
  25.         (flags & MYSQL_LOCK_LOG_TABLE) == 0 &&
  26.         !log_table_write_query)
  27.     {
  28.       /*
  29.         A user should not be able to prevent writes,
  30.         or hold any type of lock in a session,
  31.         since this would be a DOS attack.
  32.       */
  33.       if (t->reginfo.lock_type >= TL_READ_NO_INSERT ||
  34.           thd->lex->sql_command == SQLCOM_LOCK_TABLES)
  35.       {
  36.         my_error(ER_CANT_LOCK_LOG_TABLE, MYF(0));
  37.         DBUG_RETURN(1);
  38.       }
  39.     }

  40.     if (t->reginfo.lock_type >= TL_WRITE_ALLOW_WRITE)
  41.     {
  42.       if (t->s->table_category == TABLE_CATEGORY_SYSTEM)
  43.         system_count++;

  44.       if (t->db_stat & HA_READ_ONLY)
  45.       {
  46.         my_error(ER_OPEN_AS_READONLY, MYF(0), t->alias);
  47.         DBUG_RETURN(1);
  48.       }
  49.     }

  50.     /*
  51.       If we are going to lock a non-temporary table we must own metadata
  52.       lock of appropriate type on it (I.e. for table to be locked for
  53.       write we must own metadata lock of MDL_SHARED_WRITE or stronger
  54.       type. For table to be locked for read we must own metadata lock
  55.       of MDL_SHARED_READ or stronger type).
  56.     */
  57.     DBUG_ASSERT(t->s->tmp_table ||
  58.                 thd->mdl_context.owns_equal_or_stronger_lock(MDL_key::TABLE,
  59.                                    t->s->db.str, t->s->table_name.str,
  60.                                    t->reginfo.lock_type >= TL_WRITE_ALLOW_WRITE ?
  61.                                    MDL_SHARED_WRITE : MDL_SHARED_READ));

  62.     /*
  63.       Prevent modifications to base tables if READ_ONLY is activated.
  64.       In any case, read only does not apply to temporary tables.
  65.     */
  66.     if (!(flags & MYSQL_LOCK_IGNORE_GLOBAL_READ_ONLY) && !t->s->tmp_table)
  67.     {
  68.       if (t->reginfo.lock_type >= TL_WRITE_ALLOW_WRITE &&
  69.         check_readonly(thd, true))
  70.           DBUG_RETURN(1);
  71.     }
  72.   }

  73.   /*
  74.     Locking of system tables is restricted:
  75.     locking a mix of system and non-system tables in the same lock
  76.     is prohibited, to prevent contention.
  77.   */
  78.   if ((system_count > 0) && (system_count < count))
  79.   {
  80.     my_error(ER_WRONG_LOCK_OF_SYSTEM_TABLE, MYF(0));
  81.     DBUG_RETURN(1);
  82.   }

  83.   DBUG_RETURN(0);
  84. }

作者微信:

               


相关实践学习
如何在云端创建MySQL数据库
开始实验后,系统会自动创建一台自建MySQL的 源数据库 ECS 实例和一台 目标数据库 RDS。
全面了解阿里云能为你做什么
阿里云在全球各地部署高效节能的绿色数据中心,利用清洁计算为万物互联的新世界提供源源不断的能源动力,目前开服的区域包括中国(华北、华东、华南、香港)、新加坡、美国(美东、美西)、欧洲、中东、澳大利亚、日本。目前阿里云的产品涵盖弹性计算、数据库、存储与CDN、分析与搜索、云通信、网络、管理与监控、应用服务、互联网中间件、移动服务、视频服务等。通过本课程,来了解阿里云能够为你的业务带来哪些帮助 &nbsp; &nbsp; 相关的阿里云产品:云服务器ECS 云服务器 ECS(Elastic Compute Service)是一种弹性可伸缩的计算服务,助您降低 IT 成本,提升运维效率,使您更专注于核心业务创新。产品详情: https://www.aliyun.com/product/ecs
相关文章
|
7月前
|
关系型数据库 MySQL Linux
【Linux】安装Mysql(附加service mysql启动失败 提示unit not found 解决办法)
【Linux】安装Mysql(附加service mysql启动失败 提示unit not found 解决办法)
524 0
|
SQL 监控 关系型数据库
Mysql主从同步报错解决:Error executing row event: Table zabbix.history-..
Mysql主从同步报错解决:Error executing row event: Table zabbix.history-..
318 0
|
关系型数据库 MySQL
MySql插入唯一键冲突的三种可选方式
MySql插入一条记录,结果提示主键冲突,怎么办? 批量插入数据时,发现插入的这批数据中,有某些记录存在唯一键冲突,一个一个跳出来就比较麻烦了,有什么好的办法直接忽略掉冲突的记录么? 下面简单记录三种处理方式
702 0
MySql插入唯一键冲突的三种可选方式
|
关系型数据库 MySQL Linux
MySQL 启动失败的常见原因---发表到爱可生开源社区
MySQL 启动失败的最常见的原因有两类,分别是无法访问系统资源和参数设置错误造成的,下面分别分析如下。
315 0
|
关系型数据库 MySQL Apache
|
关系型数据库 MySQL 测试技术
mysql缺少binlog启动失败
因为磁盘满了,就直接把binlog都统统给删了(通常会留最后几个,因为直觉告诉我,都删了肯定有问题),今天没注意都给删了,然后起msyql时就起不来了。
316 0
mysql缺少binlog启动失败
|
关系型数据库 MySQL Linux
linux系统mysql服务启动失败
linux系统mysql服务启动失败
linux系统mysql服务启动失败
|
SQL 存储 关系型数据库
MySQL基本的SELECT语句,SQL语言概述,基础和重要不冲突,完整详细可收藏
MySQL基本的SELECT语句,SQL语言概述,基础和重要不冲突,完整详细可收藏
194 0
MySQL基本的SELECT语句,SQL语言概述,基础和重要不冲突,完整详细可收藏
|
安全 关系型数据库 MySQL
【Docker 基础教程】Mysql主从服务搭建------Mysql容器闪退及容器名冲突系列问题
【Docker 基础教程】Mysql主从服务搭建------Mysql容器闪退及容器名冲突系列问题
【Docker 基础教程】Mysql主从服务搭建------Mysql容器闪退及容器名冲突系列问题