Error messages
kernel: sd 0:2:0:0: SCSI error: return code
kernel: end_request: I/O error, dev sda, sector 2308509
kernel: Buffer I/O error on device sda2, logical block 2
kernel: Buffer I/O error on device sda2, logical block 2
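A bad-sector problem on the hard disk?
Replacing the disk and reinstalling the system fixed it.
A good related article is attached below:
- Analysis of a system failure that looked like a disk fault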
- Posted on June 3, 2011 by Ding Honghui
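- I looked into this problem and concluded that the disk itself is most likely fine.
- Judging from the operating system logs: the kernel's libata layer issued a READ DMA EXT command (op code 25h) [1]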
- Jun 1 13:03:07 dna-616 kernel: [61661.891788] ata1.00: cmd 25/00:20:47:63:f8/00:00:1a:00:00/e0 tag 0 dma 16384 in
- It came back with a media error, specifically an uncorrectable error (UNC: uncorrectable error, often due to bad sectors on the disk) [2]
- Jun 1 13:03:07 dna-616 kernel: [61661.891790] res 51/40:20:47:63:f8/40:00:1a:00:00/e0 Emask 0x9 (media error)
- Jun 1 13:03:07 dna-616 kernel: [61661.891885] ata1.00: status: { DRDY ERR }
- Jun 1 13:03:07 dna-616 kernel: [61661.891910] ata1.00: error: { UNC }
- At the same time, the disk driver reported an unreadable sector and a failed automatic reallocation of that sector
- Jun 1 13:03:15 dna-616 kernel: [61672.300045] sd 0:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
- The unreadable sector reported is sector 452485959 of /dev/sda
- Jun 1 13:03:15 dna-616 kernel: [61672.300098] end_request: I/O error, dev sda, sector 452485959
- To check whether that sector is really bad, I used a method described by an author online [3] and read the sector directly with hdparm
- hdparm --read-sector 452485959 /dev/sda | grep read
- The result was:
- reading sector 452485959: succeeded
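- Timing analysis shows that the errors were tightly clustered: 47 errors spread over 26 distinct sectors within a 12-minute window (given as 13:01-14:13), followed by a system reboot at around 15:00. In the roughly 6 hours before 13:01 there was not a single sector read error, and in the 6 hours after the reboot there was one.
- I tested all 26 of the sectors reported as bad, and every one of them was readable.
- From these observations we can reason as follows:
- First, the chance of one disk developing 26 bad sectors at the same moment is extremely small.
- Second, testing these 26 sectors shows nothing wrong with them.
- So we can infer that the disk itself is very unlikely to be faulty.
- With the disk ruled out, we can guess that the RAID controller or the motherboard may be at fault, possibly because of overheating or similar factors.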
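The clustering argument above rests on counting errors and distinct sectors over a time window. A minimal sketch of such a tally, assuming syslog-formatted lines like the `end_request` line quoted earlier. The function name `summarize_io_errors`, the `year` parameter (syslog timestamps carry no year), and an English locale for month names are assumptions of this sketch, not part of the original analysis.

```python
import re
from datetime import datetime

# Matches e.g.
# "Jun 1 13:03:15 dna-616 kernel: [...] end_request: I/O error, dev sda, sector 452485959"
IO_ERROR_RE = re.compile(
    r"(\w{3} +\d+ \d{2}:\d{2}:\d{2}) .*"
    r"end_request: I/O error, dev (\w+), sector (\d+)"
)


def summarize_io_errors(lines, year=2011):
    """Count I/O errors and distinct sectors, and report the time span."""
    events = []
    for line in lines:
        match = IO_ERROR_RE.search(line)
        if match is None:
            continue  # not an end_request I/O error line
        # Assumption: month abbreviations are English ("Jun"), as in the log.
        stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        events.append((stamp, match.group(2), int(match.group(3))))
    if not events:
        return None
    stamps = [stamp for stamp, _, _ in events]
    return {
        "errors": len(events),
        "distinct_sectors": len({(dev, sector) for _, dev, sector in events}),
        "first": min(stamps),
        "last": max(stamps),
    }


if __name__ == "__main__":
    sample = [
        "Jun 1 13:03:15 dna-616 kernel: [61672.300098] "
        "end_request: I/O error, dev sda, sector 452485959",
    ]
    print(summarize_io_errors(sample))
```

Run over the full system log, a tally like this would give the error count, the number of distinct sectors, and the first and last timestamps that the reasoning depends on.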
- [1] http://www.t13.org/Documents/UploadedDocuments/project/d1410r3b-ATA-ATAPI-6.pdf 8.26节
- [2] https://ata.wiki.kernel.org/index.php/Libata_error_messages#ATA_error_expansion
- [3] https://guust.tuxes.nl/~bas/wordpress/?p=12
- [4] System log excerpt
- Jun 1 13:03:07 dna-616 kernel: [61661.891725] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
- Jun 1 13:03:07 dna-616 kernel: [61661.891759] ata1.00: BMDMA stat 0x65
- Jun 1 13:03:07 dna-616 kernel: [61661.891788] ata1.00: cmd 25/00:20:47:63:f8/00:00:1a:00:00/e0 tag 0 dma 16384 in
- Jun 1 13:03:07 dna-616 kernel: [61661.891790] res 51/40:20:47:63:f8/40:00:1a:00:00/e0 Emask 0x9 (media error)
- Jun 1 13:03:07 dna-616 kernel: [61661.891885] ata1.00: status: { DRDY ERR }
- Jun 1 13:03:07 dna-616 kernel: [61661.891910] ata1.00: error: { UNC }
- Jun 1 13:03:07 dna-616 kernel: [61661.911623] ata1.00: configured for UDMA/133
- Jun 1 13:03:07 dna-616 kernel: [61661.947685] ata1.01: configured for UDMA/100
- Jun 1 13:03:07 dna-616 kernel: [61661.947685] ata1: EH complete
- Jun 1 13:03:15 dna-616 kernel: [61672.221414] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
- Jun 1 13:03:15 dna-616 kernel: [61672.221414] ata1.00: BMDMA stat 0x65
- Jun 1 13:03:15 dna-616 kernel: [61672.221414] ata1.00: cmd 25/00:20:47:63:f8/00:00:1a:00:00/e0 tag 0 dma 16384 in
- Jun 1 13:03:15 dna-616 kernel: [61672.221414] res 51/40:20:47:63:f8/40:00:1a:00:00/e0 Emask 0x9 (media error)
- Jun 1 13:03:15 dna-616 kernel: [61672.221414] ata1.00: status: { DRDY ERR }
- Jun 1 13:03:15 dna-616 kernel: [61672.221414] ata1.00: error: { UNC }
- Jun 1 13:03:15 dna-616 kernel: [61672.270903] ata1.00: configured for UDMA/133
- Jun 1 13:03:15 dna-616 kernel: [61672.299759] ata1.01: configured for UDMA/100
- Jun 1 13:03:15 dna-616 kernel: [61672.299803] sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
- Jun 1 13:03:15 dna-616 kernel: [61672.299853] sd 0:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
- Jun 1 13:03:15 dna-616 kernel: [61672.299896] Descriptor sense data with sense descriptors (in hex):
- Jun 1 13:03:15 dna-616 kernel: [61672.299927] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
- Jun 1 13:03:15 dna-616 kernel: [61672.300006] 1a f8 63 47
- Jun 1 13:03:15 dna-616 kernel: [61672.300045] sd 0:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
- Jun 1 13:03:15 dna-616 kernel: [61672.300098] end_request: I/O error, dev sda, sector 452485959
- Jun 1 13:03:15 dna-616 kernel: [61672.300142] ata1: EH complete
- Jun 1 13:03:15 dna-616 kernel: [61672.300696] sd 0:0:0:0: [sda] Write Protect is off
- Jun 1 13:03:15 dna-616 kernel: [61672.300726] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
- Jun 1 13:03:20 dna-616 kernel: [61678.954058] INFO: task kjournald:1194 blocked for more than 120 seconds.
- Jun 1 13:03:20 dna-616 kernel: [61678.954058] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
- Jun 1 13:03:20 dna-616 kernel: [61678.954058] kjournald D 0000000000000000 0 1194 2
- Jun 1 13:03:20 dna-616 kernel: [61678.954058] ffff81023bce9d30 0000000000000046 0000000000000000 ffffffff8024ac66
- Jun 1 13:03:20 dna-616 kernel: [61678.962067] ffff81023f2e3730 ffff81023f13b4f0 ffff81023f2e39b8 0000000380248d72
- Jun 1 13:03:20 dna-616 kernel: [61678.962128] 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
- Jun 1 13:03:20 dna-616 kernel: [61678.962172] Call Trace:
- Jun 1 13:03:20 dna-616 kernel: [61678.962225] [<ffffffff8024ac66>] getnstimeofday+0x39/0x98
- Jun 1 13:03:20 dna-616 kernel: [61678.962256] [<ffffffff802bb28d>] sync_buffer+0x0/0x3f
- Jun 1 13:03:20 dna-616 kernel: [61678.962284] [<ffffffff80429667>] io_schedule+0x5c/0x9e
Reposted from dongnan's 51CTO blog; original link:
http://blog.51cto.com/dngood/674164