Jul 3 20:41:24 yz384 kernel:
Jul 3 20:43:24 yz384 kernel: INFO: task chown:18647 blocked for more than 120 seconds.
Jul 3 20:43:24 yz384 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 3 20:43:24 yz384 kernel: chown D ffffffff80157c4c 0 18647 1 24429 22803 (NOTLB)
Jul 3 20:43:24 yz384 kernel: ffff81015ed4fde8 0000000000000086 ffff810c3fc1d3c0 ffff810e91a379c0
Jul 3 20:43:24 yz384 kernel: 0000000000000000 0000000000000008 ffff8103bf82f7a0 ffff810c3fd2e7a0
Jul 3 20:43:24 yz384 kernel: 000a29c122906514 000000000000141d ffff8103bf82f988 0000000a00000001
Jul 3 20:43:24 yz384 kernel: Call Trace:
Jul 3 20:43:24 yz384 kernel: [<ffffffff80063c63>] __mutex_lock_slowpath+0x60/0x9b
Jul 3 20:43:24 yz384 kernel: [<ffffffff80063cad>] .text.lock.mutex+0xf/0x14
Jul 3 20:43:24 yz384 kernel: [<ffffffff8003b7e5>] chown_common+0x90/0xb0
Jul 3 20:43:24 yz384 kernel: [<ffffffff80023fa0>] __user_walk_fd+0x41/0x4c
Jul 3 20:43:24 yz384 kernel: [<ffffffff800e3c44>] sys_lchown+0x38/0x53
Jul 3 20:43:24 yz384 kernel: [<ffffffff8000d57f>] dput+0x2c/0x114
Jul 3 20:43:24 yz384 kernel: [<ffffffff80012d27>] __fput+0x191/0x1bd
Jul 3 20:43:24 yz384 kernel: [<ffffffff8002d0f1>] mntput_no_expire+0x19/0x89
Jul 3 20:43:24 yz384 kernel: [<ffffffff80024200>] filp_close+0x5c/0x64
Jul 3 20:43:24 yz384 kernel: [<ffffffff8005d116>] system_call+0x7e/0x83
Jul 3 20:43:24 yz384 kernel:
The warning indicates a problem with the system. In my experience it means the process has been blocked in kernel space for at least 120 seconds, usually because it is starved of disk I/O. Heavy swapping caused by using too much memory can do this, e.g. a heavily loaded web server configured with too many Apache child processes for the machine. In your case it may simply be that too many MySQL processes are competing for memory and disk I/O.
It can also happen when the underlying storage is not performing well, e.g. an overloaded SAN, or a disk with soft errors that cause many retries. Whenever a task has to wait a long time for its I/O to complete, these warnings may be issued.
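To see how often these warnings occur and which tasks they name, the log can be scanned for the "blocked for more than" line. This is only a minimal sketch: the regular expression is modelled on the message in the log above, and the function name find_hung_tasks is made up for illustration.

```python
import re

# Pattern modelled on the warning line in the log excerpt above.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return (command, pid, seconds) for every hung-task warning found."""
    results = []
    for line in log_text.splitlines():
        match = HUNG_TASK_RE.search(line)
        if match:
            results.append(
                (match.group("comm"), int(match.group("pid")), int(match.group("secs")))
            )
    return results


sample = (
    "Jul 3 20:43:24 yz384 kernel: INFO: task chown:18647 "
    "blocked for more than 120 seconds."
)
print(find_hung_tasks(sample))
```

Running it over the whole syslog file (read into a string) gives a quick list of the affected commands and PIDs.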
This is a known bug. By default Linux uses up to 40% of the available memory for file system caching (this is the vm.dirty_ratio limit on unwritten cache data). Once this mark has been reached, the file system flushes all outstanding data to disk, and all subsequent I/O becomes synchronous. A task waiting on this flush is reported once it has been blocked for 120 seconds, the default hung-task timeout. In the case here the I/O subsystem is not fast enough to flush the data within 120 seconds. This happens especially on systems with a lot of memory.
The problem is solved in later kernels, and there is no “fix” from Oracle. I worked around it by lowering the mark for flushing the cache from 40% to 10%, by setting “vm.dirty_ratio=10” in /etc/sysctl.conf. This setting does not affect overall database performance, since you hopefully use Direct I/O and bypass the file system cache completely.
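A rough back-of-the-envelope calculation shows why the lower ratio helps. The numbers below are assumptions for illustration only (a hypothetical machine with 64 GiB of RAM and storage that sustains 100 MiB/s of writes), and the model is deliberately crude: it applies the ratio to total memory and ignores background writeback, so treat the result as an upper-bound estimate, not a measurement.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def flush_seconds(mem_bytes, dirty_ratio_percent, write_bytes_per_sec):
    """Estimate the time needed to write out a full dirty-page backlog."""
    dirty_bytes = mem_bytes * dirty_ratio_percent / 100
    return dirty_bytes / write_bytes_per_sec


# Hypothetical machine: 64 GiB RAM, 100 MiB/s sustained write throughput.
for ratio in (40, 10):
    secs = flush_seconds(64 * GIB, ratio, 100 * MIB)
    print(f"vm.dirty_ratio={ratio}: about {secs:.0f} s to flush")
```

With these assumed figures the 40% backlog takes about 262 seconds to write out, well past the 120 second threshold, while the 10% backlog takes about 66 seconds.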
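How it works: Linux sets aside 40% of the available memory for the system cache. When that data is flushed, the writes become synchronous with I/O and can run past the timeout (120 s), so the 40% is reduced to 10% to avoid the timeout.
Add vm.dirty_ratio=10 to the file /etc/sysctl.conf.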
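A process waiting on I/O is often in the D state, i.e. TASK_UNINTERRUPTIBLE. A process in this state does not handle signals, so it cannot be killed. If a process stays in the D state for a long time, something is definitely wrong. There are two likely causes:
1) Hardware on the I/O path has failed, e.g. a bad disk (only in a few cases does this lead to a long-lasting D state; usually an error is returned instead);
2) The kernel itself has a problem.
Once this happens it is usually unrecoverable: the process cannot be killed, and normally only a reboot brings the system back.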
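To check which processes are in the D state right now, the state field of /proc/<pid>/stat can be read directly. The following is a small sketch, not a replacement for ps or top: the helper names parse_stat and uninterruptible_tasks are made up here, and only the main thread of each process is inspected (other threads live under /proc/<pid>/task).

```python
import os


def parse_stat(stat_text):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so the last ')' is used as the delimiter.
    """
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    comm = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """Return (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or entry is unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__" and os.path.isdir("/proc"):
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

A process that shows up here briefly is normal; one that appears on every run for minutes is the kind of task the hung-task warning reports.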
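For this situation the kernel has a hung task detection mechanism. The basic idea: it periodically checks the processes that are in the D state, and if one has been in the D state longer than a set time (120 s by default, configurable), it prints the relevant stack trace. A proc parameter can also be set to make it panic directly.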
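The current detector settings can be read from /proc/sys/kernel. The sketch below looks at hung_task_timeout_secs (the file named in the warning itself) and hung_task_panic (the switch that turns the warning into a panic). The function names are made up for illustration, and the files are simply skipped if the running kernel does not provide them.

```python
import os


def read_hung_task_settings(sys_kernel_dir="/proc/sys/kernel"):
    """Return the hung-task detector settings that are present, as integers."""
    settings = {}
    for name in ("hung_task_timeout_secs", "hung_task_panic"):
        try:
            with open(os.path.join(sys_kernel_dir, name)) as handle:
                settings[name] = int(handle.read().strip())
        except (OSError, ValueError):
            continue  # file absent (detector not built in) or unreadable
    return settings


def describe(settings):
    """Summarise the detector behaviour in one sentence."""
    timeout = settings.get("hung_task_timeout_secs")
    if timeout is None:
        return "hung task detector not available"
    if timeout == 0:
        return "hung task detection disabled"
    action = "panic" if settings.get("hung_task_panic") else "print a stack trace"
    return f"after {timeout} s in D state the kernel will {action}"


print(describe(read_hung_task_settings()))
```

A timeout of 0 corresponds to the "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" hint in the log, which only silences the message and does not unblock the task.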