又一存储类故障,
两台DELL R610服务器.
两台服务器同时报错:
提交qlogic CASE后建议执行:
不过执行到Gathering OS info: ... done
HBA : 04:00.0 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 03)
SWITCH : Brocade 300
System : RHEL 5.6 64bit
Storage : EVA6400
device mapper : device-mapper-multipath-0.4.7-42.el5
两台服务器同时报错:
qla2xxx 0000:04:00.0: scsi(1:0:2): Abort command issued -- 1 1b22e 2002.
qla2xxx 0000:04:00.0: scsi(1:0:2): Abort command issued -- 1 2603b 2002.
qla2xxx 0000:04:00.0: scsi(1:0:2): Abort command issued -- 1 2603c 2002.
qla2xxx 0000:04:00.1: scsi(2:0:2): Abort command issued -- 1 27437 2002.
qla2xxx 0000:04:00.1: Performing ISP error recovery - ha= ffff81034a9404f8.
qla2xxx 0000:04:00.1: LIP reset occured (f700).
qla2xxx 0000:04:00.1: LIP occured (f700).
qla2xxx 0000:04:00.1: LIP reset occured (f7f7).
qla2xxx 0000:04:00.1: LOOP UP detected (4 Gbps).
qla2xxx 0000:04:00.1: qla2xxx_eh_bus_reset: reset succeeded
qla2xxx 0000:04:00.1: Mailbox command timeout occured. Issuing ISP abort.
qla2xxx 0000:04:00.1: Performing ISP error recovery - ha= ffff81034a9404f8.
qla2xxx 0000:04:00.1: Failed mailbox send register test
qla2xxx 0000:04:00.1: qla2x00_abort_isp: **** FAILED ****
qla2xxx 0000:04:00.1: scsi(2:0:2): Abort command issued -- 0 64981 2003.
qla2xxx 0000:04:00.1: Performing ISP error recovery - ha= ffff81034a9404f8.
qla2xxx 0000:04:00.1: LIP reset occured (f700).
qla2xxx 0000:04:00.1: LIP occured (f700).
qla2xxx 0000:04:00.1: LIP reset occured (f7f7).
qla2xxx 0000:04:00.1: LOOP UP detected (4 Gbps).
qla2xxx 0000:04:00.1: scsi(2:0:2): DEVICE RESET ISSUED.
qla2xxx 0000:04:00.1: Mailbox command timeout occured. Issuing ISP abort.
qla2xxx 0000:04:00.1: Performing ISP error recovery - ha= ffff81034a9404f8.
qla2xxx 0000:04:00.1: Failed mailbox send register test
qla2xxx 0000:04:00.1: qla2x00_abort_isp: **** FAILED ****
qla2xxx 0000:04:00.1: qla2xxx_eh_device_reset: device reset failed
qla2xxx 0000:04:00.1: Performing ISP error recovery - ha= ffff81034a9404f8.
qla2xxx 0000:04:00.1: LIP reset occured (f700).
qla2xxx 0000:04:00.1: LIP occured (f700).
qla2xxx 0000:04:00.1: LIP reset occured (f7f7).
qla2xxx 0000:04:00.1: LOOP UP detected (4 Gbps).
qla2xxx 0000:04:00.1: scsi(2:0:2): LOOP RESET ISSUED.
qla2xxx 0000:04:00.1: Mailbox command timeout occured. Issuing ISP abort.
qla2xxx 0000:04:00.1: Performing ISP error recovery - ha= ffff81034a9404f8.
qla2xxx 0000:04:00.1: Failed mailbox send register test
qla2xxx 0000:04:00.1: qla2x00_abort_isp: **** FAILED ****
rport-2:0-0: blocked FC remote port time out: saving binding
rport-2:0-1: blocked FC remote port time out: saving binding
提交qlogic CASE后建议执行:
ftp://ftp.qlogic.com/support/Hidden/scripts/qla_linux_info.sh
不过执行到Gathering OS info: ... done
Gathering /sys files: ... done
Gathering /proc files: ... done
Gathering /etc files: ... done
Gathering module information: ... done
Gathering ethernet info: ... done
Gathering infiniband info: ... done
Gathering miscellaneous system info:
的时候,就卡住了.IOSTAT显示HBA连接的存储使用率100%.
等待QLOGIC出示解决办法了。
之前碰到过类似的情况,使用MSA2312fc的存储,后来故障原因是存储的控制器有问题。
在其他地方还找到一个类似案例,据说是存储端得问题,需要重建RAID(也就意味着数据要迁出去再迁回来)。
# 20110608 CASE被转到了DELL
又有一套收集日志的方法
【其他】
尝试重新扫描设备 :
find /sys/class/scsi_host/host*/scan | while read line; do echo - - - > $line; done
参考 :