MySQL Case Study: An Unusual [ERROR] Can't create a new thread (errno 11)


Scenario:
MySQL 5.7.17; the application reported the following exception:

    OperationalError: (1135, "Can't create a new thread (errno 11); if you are not out of available memory, you can consult the manual for a possible OS-dependent bug")

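Conclusion (spoiler):
It is definitely not a problem with the open-files limit or innodb_open_files~
PS: If it were, this post would not exist~
I'll keep you in suspense for now~ \(≧▽≦)/~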


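What was unusual:
Once the application had opened roughly 32,300 database connections, new connections always failed. Closing some of the connections restored normal behavior, but the error came back as soon as the count climbed to about 32,300 again.

The problem reproduced every time in the test environment, on both 5.7.17 and 5.7.19.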



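Analysis:
The first suspects were the open-files limit and innodb_open_files, but changing them made no difference.

Adjusting the memory-related settings and the open-files settings did not help either, which suggested the problem might not be in MySQL at all. Could it be a system-level limit or bug?

So I compiled MySQL 5.7.19 with debugging enabled and wrote a simple Python script to hold 32,300+ database connections open: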


    import time

    import MySQLdb

    # Connections opened per process (range(1, loop) yields loop - 1 of them).
    loop = 10000
    conn_list = []


    def my_conn(ip):
        """Open one connection to the test server."""
        return MySQLdb.connect(host=ip, port=3306, user='temp', passwd='test')


    def conn_test(ip):
        # Open the connections and keep references so they are not closed.
        for i in range(1, loop):
            conn = my_conn(ip)
            conn_list.append(conn)
        # Idle forever so the connections stay held; print a counter as a heartbeat.
        num = 0
        while True:
            print(num)
            if num == loop - 1:
                num = 0
                time.sleep(10)
            num = num + 1
            time.sleep(1)


    if __name__ == '__main__':
        conn_test("192.168.1.1")
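The script above only holds connections open. To pin down the exact count at which connecting starts to fail, a variant along the following lines could be used. This is a minimal sketch, not the script used in the original test: `open_until_failure`, its `factory` parameter, and `make_failing_factory` are hypothetical names, and the demo uses a simulated failure so that it runs without a database.

```python
def open_until_failure(factory, limit):
    """Call factory() up to `limit` times, keeping every result alive.

    Returns (held, error): the objects opened so far and the exception
    that stopped the loop, or None if `limit` was reached without error.
    """
    held = []
    for _ in range(limit):
        try:
            held.append(factory())
        except Exception as exc:  # with MySQLdb this would be OperationalError
            return held, exc
    return held, None


def make_failing_factory(fail_at):
    """Hypothetical stand-in for connect(): raises on call number `fail_at`."""
    calls = {"n": 0}

    def factory():
        calls["n"] += 1
        if calls["n"] >= fail_at:
            raise OSError(11, "simulated thread-creation failure")
        return object()

    return factory


if __name__ == "__main__":
    held, err = open_until_failure(make_failing_factory(5), limit=10)
    print("opened %d before failing with %r" % (len(held), err))
```

Against a real server, the factory would instead be something like `lambda: my_conn("192.168.1.1")` with a limit comfortably above the expected threshold.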

After repeated runs, the error always occurred when creating connection number 32,373. So let's look at the MySQL trace.

This is the trace when the problem occurs:

    T@0: >Per_thread_connection_handler::add_connection
    T@0: | >my_raw_malloc
    T@0: | | my: size: 232 my_flags: 16
    T@0: | | exit: ptr: 0x44046ec0
    T@0: | <my_raw_malloc 219
    T@0: | >my_free
    T@0: | | my: ptr: 0x44046ec0
    T@0: | <my_free 292
    T@0: >Per_thread_connection_handler::add_connection

This is the trace in the normal case:

    T@0: >Per_thread_connection_handler::add_connection
    T@0: | >my_raw_malloc
    T@0: | | my: size: 232 my_flags: 16
    T@0: | | exit: ptr: 0x4238d9c0
    T@0: | <my_raw_malloc 219
    T@0: | info: Thread created
    T@0: <Per_thread_connection_handler::add_connection 425

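So, just as the error message says, MySQL ran into a problem while creating the new connection.
More precisely, the failure happened right after MySQL had allocated the memory it needs for the connection, so it freed that memory and raised the error.
Let's see what MySQL does in this method: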

connection_handler_per_thread.cc

    bool Per_thread_connection_handler::add_connection(Channel_info* channel_info)
    {
      int error= 0;
      my_thread_handle id;

      DBUG_ENTER("Per_thread_connection_handler::add_connection");

      // Simulate thread creation for test case before we check thread cache
      DBUG_EXECUTE_IF("fail_thread_create", error= 1; goto handle_error;);

      if (!check_idle_thread_and_enqueue_connection(channel_info))
        DBUG_RETURN(false);

      /*
        There are no idle threads avaliable to take up the new
        connection. Create a new thread to handle the connection
      */
      channel_info->set_prior_thr_create_utime();
      error= mysql_thread_create(key_thread_one_connection, &id,     // <---- here, error is non-zero
                                 &connection_attrib,
                                 handle_connection,
                                 (void*) channel_info);
    #ifndef DBUG_OFF
    handle_error:
    #endif // !DBUG_OFF

      if (error)                                                     // <---- so execution enters this if branch
      {
        connection_errors_internal++;
        if (!create_thd_err_log_throttle.log())
          sql_print_error("Can't create thread to handle new connection(errno= %d)",
                          error);
        channel_info->send_error_and_close_channel(ER_CANT_CREATE_THREAD,
                                                   error, true);
        Connection_handler_manager::dec_connection_count();
        DBUG_RETURN(true);
      }

      Global_THD_manager::get_instance()->inc_thread_created();
      DBUG_PRINT("info",("Thread created"));
      DBUG_RETURN(false);
    }


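Since mysql_thread_create is where things went wrong, I kept tracing downward. After following a chain of macro definitions, I ended up at the code below.
PS: The my_free in the trace is an important clue: it confirms that the fault is not in MySQL's own code~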


my_thread.c

    int my_thread_create(my_thread_handle *thread, const my_thread_attr_t *attr,
                         my_start_routine func, void *arg)
    {
    #ifndef _WIN32
      return pthread_create(&thread->thread, attr, func, arg);
    #else
      ......
    }

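As you can see, starting from add_connection and following the call chain all the way down, the value of error is ultimately determined by the return value of pthread_create.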
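The number in the error message, errno 11, is therefore simply what pthread_create returned. A quick way to see what such a code means is to look it up. The snippet below is a small sketch using Python's standard errno module; `describe_errno` is a hypothetical helper name. On Linux, 11 is EAGAIN, which the pthread_create manual page describes as insufficient resources or a system-imposed limit on the number of threads, so it does not necessarily indicate a shortage of memory.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, human-readable message) for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # errno numbering is platform specific; 11 is the Linux value from the error.
    print(describe_errno(11))
```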

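The function that fails is actually part of glibc, so even when debugging in gdb its source is not visible. And debugging with gdb while holding 32,000+ connections would be painfully slow....
(╯‵□′)╯︵┻━┻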

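So I searched Google for pthread_create together with "Can't create a new thread" and found some information. The general consensus was that certain Linux system-level parameters limit the number of threads that can be created.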
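To check such limits on a given machine, the relevant values can be read directly from /proc. The sketch below assumes a Linux host; the three tunables are the ones that come up in the mailing-list discussion quoted further down (pid_max, threads-max, max_map_count), and `read_int_file` and `read_kernel_limits` are hypothetical helper names.

```python
# Kernel tunables that can cap the number of threads on Linux.
LIMIT_PATHS = {
    "kernel.pid_max": "/proc/sys/kernel/pid_max",
    "kernel.threads-max": "/proc/sys/kernel/threads-max",
    "vm.max_map_count": "/proc/sys/vm/max_map_count",
}


def read_int_file(path):
    """Return the integer stored in `path`, or None if it cannot be read."""
    try:
        with open(path) as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None


def read_kernel_limits(paths=LIMIT_PATHS):
    """Map each tunable name to its current value (None when unavailable)."""
    return {name: read_int_file(path) for name, path in paths.items()}


if __name__ == "__main__":
    for name, value in sorted(read_kernel_limits().items()):
        print("%-20s %s" % (name, value))
```

Per-user limits such as `ulimit -u` are not covered here; they would need to be checked separately for the account that runs mysqld.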

That looked promising, so I dug further and found a fairly old mailing-list thread about exactly this: hitting a limit at 32K threads.
Related links:
https://listman.redhat.com/archives/phil-list/2003-August/msg00005.html
https://listman.redhat.com/archives/phil-list/2003-August/msg00010.html
https://listman.redhat.com/archives/phil-list/2003-August/msg00025.html

The discussion is reproduced below:

    Hi,
    
    I was using the 'thread-limit' program from http://people.redhat.com/alikins/tuning_utils/thread-limit.c to test the
    number of threads it could create.  It seems that it was always hitting
    some limit at 32K threads (cannot create thread 32762, to be exact). The
    error is ENOMEM.  Here's the kernel/ulimit settings,
    
    /proc/sys/kernel/pid_max 300000
    /proc/sys/kernel/threads-max 100000
    
    ulimit -a
    core file size        (blocks, -c) 0
    data seg size         (kbytes, -d) unlimited
    file size             (blocks, -f) unlimited
    max locked memory     (kbytes, -l) unlimited
    max memory size       (kbytes, -m) unlimited
    open files                    (-n) 100000
    pipe size          (512 bytes, -p) 8
    stack size            (kbytes, -s) 32
    cpu time             (seconds, -t) unlimited
    max user processes            (-u) 100000
    virtual memory        (kbytes, -v) unlimited
    
    It gave the same result on both a Debian 3 box with NPTL 0.56 compiled
    with gcc 3.4 CVS and GlibC CVS, kernel 2.5.70, and vanilla Redhat 9.
    
    I know I must be missing something because 100K threads with NPTL was
    reported.  Thanks.
    
    -- 
    Feng Zhou
    Graduate Student,
    CS Division, U.C. Berkeley 



    
    On Fri, 7 Aug 2003, Feng Zhou wrote:
    
    > I was using the 'thread-limit' program from
    > http://people.redhat.com/alikins/tuning_utils/thread-limit.c to test the
    > number of threads it could create.  It seems that it was always hitting
    > some limit at 32K threads (cannot create thread 32762, to be exact). The
    > error is ENOMEM.  Here's the kernel/ulimit settings,
    
    what is the current value of your /proc/sys/vm/max_map_count tunable? Can
    you max out RAM if you double the current limit?
    
    	Ingo


    Yes, that's it.  Actually I have to change MAX_MAP_COUNT in
    include/linux/sched.h and recompile the 2.5.70 kernel because it doesn't
    have such a sysctl file.  After doubling the value from 65536 to 131072,
    I can create 65530 thread before it fails with ENOMEM.  
    
    BTW, the system begins thrashing at around 63000 threads, where resident
    set of the process is around 250MB.  This makes sense to me because each
    empty thread actually uses the first 4K page in its 16K stack.  Given
    the system has 1GB of physical memory.  The kernel memory each thread
    uses seems to be around 12KB ((1GB-250MB)/63000).
    
    - Feng Zhou
    
    On Mon, 2003-08-11 at 02:29, Ingo Molnar wrote:
    > On Fri, 7 Aug 2003, Feng Zhou wrote:
    > 
    > > I was using the 'thread-limit' program from
    > > http://people.redhat.com/alikins/tuning_utils/thread-limit.c to test the
    > > number of threads it could create.  It seems that it was always hitting
    > > some limit at 32K threads (cannot create thread 32762, to be exact). The
    > > error is ENOMEM.  Here's the kernel/ulimit settings,
    > 
    > what is the current value of your /proc/sys/vm/max_map_count tunable? Can
    > you max out RAM if you double the current limit?
    > 
    > 	Ingo
    > 

Indeed, just as the discussion suggests, after raising max_map_count MySQL was able to create more than 32,000 connections!
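Before and after changing the limit, it can be useful to see how close mysqld actually is to it. On Linux, each line of /proc/<pid>/maps corresponds to one memory mapping, and the number of mappings per process is what vm.max_map_count caps. The following is a sketch under those assumptions; `count_mappings`, `read_limit`, and `mapping_headroom` are hypothetical helpers, and the caller supplies the path for the PID of interest.

```python
def count_mappings(maps_path):
    """Count memory mappings by counting lines in a /proc/<pid>/maps file."""
    with open(maps_path) as fh:
        return sum(1 for _ in fh)


def read_limit(limit_path="/proc/sys/vm/max_map_count"):
    """Read the per-process mapping limit from the kernel."""
    with open(limit_path) as fh:
        return int(fh.read().strip())


def mapping_headroom(maps_path, limit_path="/proc/sys/vm/max_map_count"):
    """Return (used, limit, remaining) for the process behind maps_path."""
    used = count_mappings(maps_path)
    limit = read_limit(limit_path)
    return used, limit, limit - used
```

For a mysqld running as PID 12345 (a made-up PID), the call would be `mapping_headroom("/proc/12345/maps")`. The limit itself can be raised at runtime with `sysctl -w vm.max_map_count=131072` (the doubled value from the discussion) and made persistent in /etc/sysctl.conf.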

So what are the side effects of raising this parameter?
This article describes them: https://www.novell.com/support/kb/doc.php?id=7000830

The key excerpt:

    How are they affected? Well, since there will be more elements in the VM red-black tree, all operations on the VMA will take longer. The slow-down of most operations is logarithmic, e.g. further mmap's, munmap's et al. as well as handling page faults (both major and minor). Some operations will slow down linearly, e.g. copying the VMAs when a new process is forked.

    In short, there is absolutely no impact on memory footprint or performance for processes which use the same number of maps. On the other hand, processes where one of the memory mapping functions would have failed with ENOMEM because of hitting the limit, will now be allowed to consume the additional kernel memory with all the implications described above.


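Conclusion:
After raising vm.max_map_count, the application no longer throws the exception~
Judging from the test results, the default limit (65530 on stock Linux kernels) supports about 32,000 connections; doubling it should roughly double the connection ceiling.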
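The figures reported in the mailing-list discussion (32762 threads at a limit of 65536, 65530 threads at 131072) fit a simple model in which every thread costs two memory mappings, commonly its stack plus a guard region, on top of a small fixed baseline. That model is an assumption rather than something the sources state, and `estimate_max_threads` and the baseline of 12 are illustrative values fitted to those two figures.

```python
def estimate_max_threads(max_map_count, baseline_maps=12, maps_per_thread=2):
    """Estimate how many threads fit under a mapping limit.

    Assumes each thread consumes `maps_per_thread` mappings and the rest of
    the process needs `baseline_maps` mappings. Both defaults are fitted to
    the figures in the quoted discussion, not measured on mysqld.
    """
    return (max_map_count - baseline_maps) // maps_per_thread


if __name__ == "__main__":
    for limit in (65536, 131072):
        print(limit, estimate_max_threads(limit))
```

A real mysqld maps far more than a bare test program (shared libraries, buffers, background threads), so its baseline would be larger, which would be consistent with the failure appearing at 32,373 connections rather than closer to half the limit.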




