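AliYunDun Triggers a Kernel Panic
We recently received panic dumps from several ECS instances. Coincidentally, every one of the panics was triggered by AliYunDun, and the system logs all looked almost exactly like this: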
[148804.194380] BUG: unable to handle kernel paging request at ffffc90001240ffd
[148804.195466] IP: [<ffffffff8132bd9d>] strnstr+0x3d/0x70
[148804.196215] PGD 13b05a067 PUD 13b05b067 PMD 136baa067 PTE 0
[148804.197098] Oops: 0000 [#1] SMP
[148804.197608] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter xt_conntrack nf_nat nf_conntrack libcrc32c br_netfilter bridge stp llc overlay(T) tun edac_core iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ppdev ablk_helper cryptd virtio_balloon joydev parport_pc pcspkr parport i2c_piix4 ip_tables ext4 mbcache jbd2 ata_generic pata_acpi virtio_net virtio_blk virtio_console cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix crct10dif_pclmul crct10dif_common libata crc32c_intel serio_raw virtio_pci virtio_ring virtio i2c_core floppy
[148804.207639] CPU: 1 PID: 1487 Comm: AliYunDun Tainted: G OE ------------ T 3.10.0-693.2.2.el7.x86_64 #1
[148804.209067] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[148804.210751] task: ffff8800b2aa9fa0 ti: ffff8800b2b24000 task.ti: ffff8800b2b24000
[148804.211796] RIP: 0010:[<ffffffff8132bd9d>] [<ffffffff8132bd9d>] strnstr+0x3d/0x70
[148804.212870] RSP: 0018:ffff8800b2b27e30 EFLAGS: 00010246
[148804.213624] RAX: 0000000000000000 RBX: ffff880136e3e480 RCX: 0000000000000005
[148804.214632] RDX: 00000000000000b4 RSI: ffff8800b2b27e42 RDI: ffffc90001240ffd
[148804.215628] RBP: ffff8800b2b27e30 R08: 000000000000003a R09: 0000000000000001
[148804.216625] R10: 0000000000000000 R11: 000000000000000f R12: 0000000000000000
[148804.217623] R13: 0000000000000000 R14: 0000000000000000 R15: ffff880136e3e480
[148804.218612] FS: 00007f86eb336700(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000
[148804.219730] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[148804.220540] CR2: ffffc90001240ffd CR3: 00000000b299e000 CR4: 00000000003406e0
[148804.221540] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[148804.222545] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[148804.223556] Stack:
[148804.223856] ffff8800b2b27e70 ffffffffc02902a2 00344631303a0000 000000007b4e2e7f
[148804.224969] 0000000000000000 ffff88003685af00 ffff8800b2b27f48 ffff8800b9ab2a80
[148804.226094] ffff8800b2b27ee0 ffffffff8122649a 000000007b4e2e7f 0000000001828360
[148804.227210] Call Trace:
[148804.227606] [<ffffffff8122649a>] ? seq_read+0x10a/0x3b0
[148804.228359] [<ffffffff8127041d>] ? proc_reg_read+0x3d/0x80
[148804.229148] [<ffffffff81200bac>] ? vfs_read+0x9c/0x170
[148804.229885] [<ffffffff81201a6f>] ? SyS_read+0x7f/0xe0
[148804.230611] [<ffffffff816b5009>] ? system_call_fastpath+0x16/0x1b
[148804.231480] Code: c1 01 80 39 00 75 f7 48 29 f1 48 89 f8 74 2e 48 39 ca 72 3f 66 2e 0f 1f 84 00 00 00 00 00 31 c0 66 0f 1f 44 00 00 44 0f b6 04 06 <44> 38 04 07 75 15 48 83 c0 01 48 39 c1 75 ec 48 89 f8 5d c3 0f
[148804.235424] RIP [<ffffffff8132bd9d>] strnstr+0x3d/0x70
[148804.236180] RSP <ffff8800b2b27e30>
[148804.236681] CR2: ffffc90001240ffd
AliYunDun is the Agent of Security Center. Seeing the same panic log several times set both the customer and us wondering: is there a bug in Security Center?
What was AliYunDun doing?
PID: 1487 TASK: ffff8800b2aa9fa0 CPU: 1 COMMAND: "AliYunDun"
#0 [ffff8800b2b27ab0] machine_kexec at ffffffff8105c4cb
#1 [ffff8800b2b27b10] __crash_kexec at ffffffff81104a32
#2 [ffff8800b2b27be0] crash_kexec at ffffffff81104b20
#3 [ffff8800b2b27bf8] oops_end at ffffffff816ad2b8
#4 [ffff8800b2b27c20] no_context at ffffffff8169d2ba
#5 [ffff8800b2b27c70] __bad_area_nosemaphore at ffffffff8169d350
#6 [ffff8800b2b27cb8] bad_area_nosemaphore at ffffffff8169d4ba
#7 [ffff8800b2b27cc8] __do_page_fault at ffffffff816b017e
#8 [ffff8800b2b27d28] trace_do_page_fault at ffffffff816b03d6
#9 [ffff8800b2b27d68] do_async_page_fault at ffffffff816afa6a
#10 [ffff8800b2b27d80] async_page_fault at ffffffff816ac578
[exception RIP: strnstr+61]
RIP: ffffffff8132bd9d RSP: ffff8800b2b27e30 RFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880136e3e480 RCX: 0000000000000005
RDX: 00000000000000b4 RSI: ffff8800b2b27e42 RDI: ffffc90001240ffd
RBP: ffff8800b2b27e30 R8: 000000000000003a R9: 0000000000000001
R10: 0000000000000000 R11: 000000000000000f R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: ffff880136e3e480
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#11 [ffff8800b2b27e78] seq_read at ffffffff8122649a
#12 [ffff8800b2b27ee8] proc_reg_read at ffffffff8127041d
#13 [ffff8800b2b27f08] vfs_read at ffffffff81200bac
#14 [ffff8800b2b27f38] sys_read at ffffffff81201a6f
#15 [ffff8800b2b27f80] system_call_fastpath at ffffffff816b5009
RIP: 00007f86effc070d RSP: 00007f86eb335928 RFLAGS: 00000246
RAX: 0000000000000000 RBX: ffffffff816b5009 RCX: ffffffffffffffff
RDX: 0000000000001fff RSI: 0000000001828360 RDI: 0000000000000015
RBP: 0000000000001fff R8: 0000000000000000 R9: 0000000001768110
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f86eb335a28
R13: 00007f86eb335a90 R14: 0000000001828360 R15: 00007f86eb335a28
ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b
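Judging from the panic stack, AliYunDun was reading a file through the read system call. The file descriptor is in the RDI register as saved on syscall entry, so fd = 0x15, and that descriptor maps to /proc/1444/net/tcp6, an ordinary /proc file that lists tcp6 connections. The call then went through the VFS layer into seq_read, and the strnstr call reached from there hit the invalid kernel address ffffc90001240ffd and triggered the panic. strnstr searches for a substring. Going by its prototype and the register values, ffffc90001240ffd is the first argument, the string pointer s1 (RDI), the length limit is 0xb4 (RDX), and the pointer to the substring being searched for, s2, is in RSI, which holds: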
crash> rd ffff8800b2b27e42
ffff8800b2b27e42: 2e7f00344631303a :01F4...
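The rd output prints memory as a 64-bit word, so the string has to be read from the lowest byte upward because x86-64 is little-endian. A minimal sketch of that decoding follows; the helper name qword_to_bytes is made up for illustration and is not part of crash or the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Unpack a 64-bit word, as printed by crash's `rd`, into the byte order
 * it has in memory on a little-endian machine (lowest byte first).
 * `out` must have room for 8 bytes plus a terminating NUL.
 */
void qword_to_bytes(uint64_t word, char out[9])
{
    for (size_t i = 0; i < 8; i++)
        out[i] = (char)((word >> (8 * i)) & 0xff);
    out[8] = '\0';
}
```

Feeding it 0x2e7f00344631303a gives the bytes 3a 30 31 46 34 00, that is ":01F4" followed by a NUL terminator; the two remaining bytes belong to whatever comes next on the stack.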
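Walking back up the call chain
Since the faulting pointer was passed in by the caller, we have to trace its origin upward one frame at a time. What seq_read actually calls is the function pointer m->op->show.
The local variable m is a seq_file: struct seq_file *m = file->private_data;
Here file is the file pointer for /proc/1444/net/tcp6, so we can quickly see that op points into tcp6_seq_afinfo: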
crash> struct file.private_data ffff88003685af00
private_data = 0xffff880136e3e480
crash> struct seq_file.op 0xffff880136e3e480
op = 0xffffffff81ae6698 <tcp6_seq_afinfo+24>
The show callback of tcp6_seq_afinfo would normally be tcp6_seq_show:
static int tcp6_seq_show(struct seq_file *seq, void *v)
{
	struct tcp_iter_state *st;
	struct sock *sk = v;

	if (v == SEQ_START_TOKEN) {
		seq_puts(seq,
			 " sl "
			 "local_address "
			 "remote_address "
			 "st tx_queue rx_queue tr tm->when retrnsmt"
			 " uid timeout inode\n");
		goto out;
	}
	st = seq->private;

	switch (st->state) {
	case TCP_SEQ_STATE_LISTENING:
	case TCP_SEQ_STATE_ESTABLISHED:
		if (sk->sk_state == TCP_TIME_WAIT)
			get_timewait6_sock(seq, v, st->num);
		else
			get_tcp6_sock(seq, v, st->num);
		break;
	case TCP_SEQ_STATE_OPENREQ:
		get_openreq6(seq, st->syn_wait_sk, v, st->num, st->uid);
		break;
	}
out:
	return 0;
}
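A call that cannot happen
tcp6_seq_show is short, and walking every possible path through it shows that it never calls strnstr. That usually leaves only two possibilities: 1. the vmcore symbol resolution is wrong, and this is some function other than strnstr; 2. the backtrace is wrong, and strnstr was called by some other function.
Careful readers will have noticed that op does not point at the start of tcp6_seq_afinfo but at tcp6_seq_afinfo+24, the seq_operations table embedded in that structure, so the table itself is what we need to inspect: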
crash> struct seq_file.op 0xffff880136e3e480
op = 0xffffffff81ae6698 <tcp6_seq_afinfo+24>
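The +24 offset can be checked with a userspace mirror of the structure. The sketch below assumes the 3.10-era layout of struct tcp_seq_afinfo (name, family, seq_fops, then an embedded seq_operations) and a 64-bit LP64 build; the *_mock type names and the two helper functions are made up for illustration, and the callback signatures are simplified to plain pointer types.

```c
#include <stddef.h>

/* Opaque placeholders: only pointer sizes matter for the layout. */
struct file_operations;
struct seq_file;

/* Same member order as the seq_operations dump printed by crash. */
struct seq_operations_mock {
    void *(*start)(struct seq_file *m, long long *pos);
    void  (*stop)(struct seq_file *m, void *v);
    void *(*next)(struct seq_file *m, void *v, long long *pos);
    int   (*show)(struct seq_file *m, void *v);
};

/* Assumed mirror of the 3.10 struct tcp_seq_afinfo. */
struct tcp_seq_afinfo_mock {
    char                         *name;
    unsigned short                family;   /* sa_family_t */
    const struct file_operations *seq_fops;
    struct seq_operations_mock    seq_ops;
};

/* Offset of the embedded operations table inside the afinfo struct. */
size_t seq_ops_offset(void)
{
    return offsetof(struct tcp_seq_afinfo_mock, seq_ops);
}

/* Offset of the show callback, counted from the start of the afinfo struct. */
size_t show_offset(void)
{
    return offsetof(struct tcp_seq_afinfo_mock, seq_ops)
         + offsetof(struct seq_operations_mock, show);
}
```

Under those assumptions seq_ops sits at offset 24, which matches the <tcp6_seq_afinfo+24> symbol printed by crash. The op pointer itself is therefore where it should be, and any tampering has to be looked for in the callbacks stored in that table.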
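Following that lead and dumping the seq_operations at that address shows that the show function actually being called starts at 0xffffffffc0290210: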
crash> struct seq_operations 0xffffffff81ae6698
struct seq_operations {
start = 0xffffffff815edf50 <tcp_seq_start>,
stop = 0xffffffff815ece70 <tcp_seq_stop>,
next = 0xffffffff815edec0 <tcp_seq_next>,
show = 0xffffffffc0290210
}
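This address has no kernel symbol at all, yet the code there behaves like an ordinary function: it pushes registers, fetches its arguments, and pops the registers again on exit. It also clearly calls strnstr, and the value ffffffffc02902a2 seen on the panic stack is exactly the return address of the first such call. This is certainly not a normal kernel function, and quite possibly a malicious one: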
crash> dis 0xffffffffc0290210 200
0xffffffffc0290210: nop DWORD PTR [rax+rax*1+0x0]
0xffffffffc0290215: push rbp
0xffffffffc0290216: mov rbp,rsp
0xffffffffc0290219: push r14
0xffffffffc029021b: push r13
0xffffffffc029021d: xor r13d,r13d
0xffffffffc0290220: push r12
0xffffffffc0290222: xor r12d,r12d
0xffffffffc0290225: push rbx
0xffffffffc0290226: mov rbx,rdi
0xffffffffc0290229: sub rsp,0x10
0xffffffffc029022d: mov rax,QWORD PTR gs:0x28
0xffffffffc0290236: mov QWORD PTR [rbp-0x28],rax
0xffffffffc029023a: xor eax,eax
0xffffffffc029023c: call QWORD PTR [rip+0x2506] # 0xffffffffc0292748
0xffffffffc0290242: mov r14d,eax
0xffffffffc0290245: jmp 0xffffffffc0290266
0xffffffffc0290247: nop WORD PTR [rax+rax*1+0x0]
0xffffffffc0290250: add r12,0x1
0xffffffffc0290254: sub QWORD PTR [rbx+0x18],0xb4
0xffffffffc029025c: cmp r12,0xb
0xffffffffc0290260: je 0xffffffffc02902f0
0xffffffffc0290266: mov ecx,DWORD PTR [r12*4-0x3fd6db60]
0xffffffffc029026e: lea rdi,[rbp-0x2e]
0xffffffffc0290272: mov rdx,0xffffffffc0291024
0xffffffffc0290279: mov esi,0x6
0xffffffffc029027e: xor eax,eax
0xffffffffc0290280: call 0xffffffff8132fb10 <snprintf>
0xffffffffc0290285: mov rdx,QWORD PTR [rbx]
0xffffffffc0290288: mov rax,QWORD PTR [rbx+0x18]
0xffffffffc029028c: lea rsi,[rbp-0x2e]
0xffffffffc0290290: lea rdi,[rdx+rax*1-0xb4]
0xffffffffc0290298: mov edx,0xb4
0xffffffffc029029d: call 0xffffffff8132bd60 <strnstr>
0xffffffffc02902a2: test rax,rax
0xffffffffc02902a5: jne 0xffffffffc0290250
0xffffffffc02902a7: cmp r12d,0x4
0xffffffffc02902ab: mov eax,r13d
0xffffffffc02902ae: mov rdx,QWORD PTR [rbx]
0xffffffffc02902b1: cmovle eax,r12d
0xffffffffc02902b5: cdqe
0xffffffffc02902b7: lea rsi,[rax*8-0x3fd6dba0]
0xffffffffc02902bf: mov rax,QWORD PTR [rbx+0x18]
0xffffffffc02902c3: lea rdi,[rdx+rax*1-0xb4]
0xffffffffc02902cb: mov edx,0xb4
0xffffffffc02902d0: call 0xffffffff8132bd60 <strnstr>
0xffffffffc02902d5: test rax,rax
0xffffffffc02902d8: jne 0xffffffffc0290250
0xffffffffc02902de: add r12,0x1
0xffffffffc02902e2: cmp r12,0xb
0xffffffffc02902e6: jne 0xffffffffc0290266
0xffffffffc02902ec: nop DWORD PTR [rax+0x0]
0xffffffffc02902f0: mov rcx,QWORD PTR [rbp-0x28]
0xffffffffc02902f4: xor rcx,QWORD PTR gs:0x28
0xffffffffc02902fd: mov eax,r14d
0xffffffffc0290300: jne 0xffffffffc029030f
0xffffffffc0290302: add rsp,0x10
0xffffffffc0290306: pop rbx
0xffffffffc0290307: pop r12
0xffffffffc0290309: pop r13
0xffffffffc029030b: pop r14
0xffffffffc029030d: pop rbp
0xffffffffc029030e: ret
......
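Read as C, the listing above appears to do roughly the following. This is a hand-written userspace reconstruction, not the rootkit's source: the names seq_file_mock, strnstr_mock and hide_matching_records are invented, the format string ":%04X" is only inferred from the ":01F4" found on the stack, the port table contents are unknown, and the second strnstr call against a table of 8-byte strings is left out. The real hook first calls the original show function and returns its result, while the sketch returns the number of records it dropped. The count check is deliberately added here so the sketch is safe to run; the disassembly contains no such check.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define RECORD_LEN 0xb4 /* window size hard-coded in the disassembly */
#define NPORTS     11   /* loop bound 0xb in the disassembly */

/* Leading fields of struct seq_file as the hook uses them: buf at +0x0, count at +0x18. */
struct seq_file_mock {
    char  *buf;
    size_t size;
    size_t from;
    size_t count;
};

/* Userspace stand-in for the kernel's strnstr(): find s2 in the first len bytes of s1. */
char *strnstr_mock(const char *s1, const char *s2, size_t len)
{
    size_t l2 = strlen(s2);

    if (l2 == 0)
        return (char *)s1;
    while (len >= l2) {
        len--;
        if (memcmp(s1, s2, l2) == 0)
            return (char *)s1;
        s1++;
    }
    return NULL;
}

/* Drop trailing RECORD_LEN-byte records that mention a hidden port; returns how many were dropped. */
int hide_matching_records(struct seq_file_mock *m, const int ports[NPORTS])
{
    char needle[6];
    int dropped = 0;

    for (size_t i = 0; i < NPORTS; i++) {
        if (m->count < RECORD_LEN) /* guard that is ABSENT from the disassembly */
            break;
        /* ":%04X" is an assumed format; the mask keeps the output within 5 characters. */
        snprintf(needle, sizeof(needle), ":%04X", (unsigned)ports[i] & 0xffffu);
        if (strnstr_mock(m->buf + m->count - RECORD_LEN, needle, RECORD_LEN)) {
            m->count -= RECORD_LEN; /* sub QWORD PTR [rbx+0x18],0xb4 */
            dropped++;
        }
    }
    return dropped;
}
```

The hiding trick is simply to shrink seq_file.count by one record so that the record just written never reaches userspace. The hard-coded 0xb4 is what makes the approach fragile: it assumes at least that many bytes are already in the buffer.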
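The truth comes out
Tracing the function's logic through the assembly, the strnstr call checks whether the last 0xb4 bytes of seq_file.buf contain the substring ":01F4" (stored at 0xffff8800b2b27e42). The start of that search window is computed at 0xffffffffc0290290: lea rdi,[rdx+rax*1-0xb4]
where rdx holds seq_file.buf and rax holds seq_file.count, and nothing checks that count is at least 0xb4. Here count was only 0xb1, so the result lands 3 bytes before the start of buf: the faulting address ffffc90001240ffd is exactly 3 bytes below the page boundary ffffc90001241000, which implies buf starts at that boundary. The page before it is not mapped (PTE 0 in the oops), so the first byte strnstr read from s1 triggered the panic.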
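The address arithmetic is easy to verify by hand. The sketch below redoes it with plain integers; the buf value 0xffffc90001241000 is not printed anywhere in the dump and is inferred from the faulting address and count = 0xb1, and both function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define RECORD_LEN 0xb4u /* constant subtracted by the lea instruction */

/* Address the hook hands to strnstr as s1: buf + count - 0xb4. */
uint64_t window_start(uint64_t buf, uint64_t count)
{
    return buf + count - RECORD_LEN;
}

/* The check the hook is missing: the window only lies inside buf when count >= 0xb4. */
bool window_in_buffer(uint64_t count)
{
    return count >= RECORD_LEN;
}
```

With buf = 0xffffc90001241000 and count = 0xb1, window_start returns 0xffffc90001240ffd, the same value as CR2 and RDI in the oops, and window_in_buffer(0xb1) is false.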
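Further analysis showed that this is a rootkit trying to hide the process listening on port 0x01F4. It is clumsily written, though: a bug in the way it filters /proc output is what caused the kernel panic. Because this hiding technique only filters what the /proc file system prints, walking the sockets held by every process still leads straight to the process listening on port 500 (0x01F4):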
PID: 1019 TASK: ffff8800b284af70 CPU: 1 COMMAND: "ip6network"
FD SOCKET SOCK FAMILY:TYPE SOURCE-PORT DESTINATION-PORT
3 ffff8800b9f76a00 ffff8800b28787c0 INET:STREAM 0.0.0.0-500 0.0.0.0-0
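The /proc/net/tcp and tcp6 files print ports as four hex digits after the address, which is why the rootkit searches for ":01F4" while crash reports port 500. A minimal sketch of that conversion, using a hypothetical helper named proc_net_port:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Parse the port from an "ADDRESS:PORT" field as printed in /proc/net/tcp
 * and /proc/net/tcp6, where the port is hexadecimal.
 * Returns the port number, or -1 if the field is malformed.
 */
long proc_net_port(const char *addr_field)
{
    const char *colon = strrchr(addr_field, ':');
    char *end;
    unsigned long port;

    if (colon == NULL)
        return -1;
    port = strtoul(colon + 1, &end, 16);
    if (end == colon + 1 || port > 0xffff)
        return -1;
    return (long)port;
}
```

For example, the field "00000000000000000000000000000000:01F4" parses to 500, the port shown in the socket listing above.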
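Googling the process name quickly turns up: https://www.trendmicro.com.cn/vinfo/cn/threat-encyclopedia/malware/trojan.linux.skidmap.uwejx
We nearly blamed Security Center unfairly! Had Security Center not triggered the panic while reading /proc/1444/net/tcp6, this rootkit might have stayed hidden in the operating system for a long time.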