ZPOOL health check and repair use scrub

简介:
zpool健康检查(scrub)主要用于通过checksum来检查zpool数据块的数据是否正常, 如果vdev是mirror或raidz的, 可以自动从其他设备来修复异常的数据块. 由于健康检查是IO开销很大的动作, 所以建议在不繁忙的时候操作(scrub只检查分配出去的数据块, 不会检查空闲的数据块, 所以只和使用率有关, 对于一个很大的zpool, 如果使用率很低的话, scrub也是很快完成的).

用法 : 
# zpool scrub zp1
查看zpool状态, 如下, 正在做scrub
# zpool status zp1
  pool: zp1
 state: ONLINE
  scan: scrub in progress since Tue Jun 17 15:17:01 2014
    19.7G scanned out of 1.23T at 56.7M/s, 6h11m to go
    0 repaired, 1.57% done
config:

        NAME         STATE     READ WRITE CKSUM
        zp1          ONLINE       0     0     0
          vd01vol01  ONLINE       0     0     0
          vd02vol01  ONLINE       0     0     0
          vd03vol01  ONLINE       0     0     0
          vd04vol01  ONLINE       0     0     0

errors: No known data errors

# zpool get all zp1
NAME  PROPERTY               VALUE                  SOURCE
zp1   size                   9.75T                  -
zp1   capacity               12%                    -
zp1   altroot                -                      default
zp1   health                 ONLINE                 -
zp1   guid                   5877722976139588848    default
zp1   version                -                      default
zp1   bootfs                 -                      default
zp1   delegation             on                     default
zp1   autoreplace            off                    default
zp1   cachefile              -                      default
zp1   failmode               wait                   default
zp1   listsnapshots          off                    default
zp1   autoexpand             off                    default
zp1   dedupditto             0                      default
zp1   dedupratio             1.01x                  -
zp1   free                   8.52T                  -
zp1   allocated              1.23T                  -
zp1   readonly               off                    -
zp1   ashift                 0                      default
zp1   comment                -                      default
zp1   expandsize             0                      -
zp1   freeing                0                      default
zp1   feature@async_destroy  enabled                local
zp1   feature@empty_bpobj    active                 local
zp1   feature@lz4_compress   active                 local

在做scrub时, 设备的io几乎消耗殆尽.
Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb               0.00     0.00  160.00    0.00   982.00     0.00     6.14     0.61    3.81   3.80  60.80
sdc               0.00     0.00  164.00    0.00   864.00     0.00     5.27     0.59    3.57   3.57  58.50
sdd               0.00     0.00  168.00    0.00  1100.00     0.00     6.55     0.61    3.63   3.64  61.20
sde               0.00     0.00  172.00    0.00  1116.00     0.00     6.49     0.61    3.52   3.52  60.60
sdf               0.00     0.00  160.00    0.00  1018.00     0.00     6.36     0.61    3.80   3.80  60.80
sdg               0.00     0.00  164.00    0.00   857.00     0.00     5.23     0.57    3.49   3.50  57.40
sdh               0.00     0.00  169.00    0.00   995.00     0.00     5.89     0.61    3.63   3.60  60.90
sdi               0.00     0.00  173.00    0.00  1066.00     0.00     6.16     0.60    3.49   3.46  59.90
dm-0              0.00     0.00  320.00    0.00  2000.00     0.00     6.25     1.99    6.21   3.12 100.00
dm-1              0.00     0.00  328.00    0.00  1721.00     0.00     5.25     1.91    5.83   3.01  98.60
dm-2              0.00     0.00  337.00    0.00  2095.00     0.00     6.22     1.99    5.92   2.97 100.00
dm-3              0.00     0.00  345.00    0.00  2182.00     0.00     6.32     1.99    5.77   2.90 100.00

scrub结束后, 可以通过-xv来查看是否有异常.
# zpool status zp1 -xv
pool 'zp1' is healthy


[参考]
1. man zpool
       zpool scrub [-s] pool ...

           Begins  a  scrub. The scrub examines all data in the specified pools to verify that it checksums correctly.
           For replicated (mirror or raidz) devices, ZFS automatically repairs any damage discovered during the scrub.
           The  "zpool  status" command reports the progress of the scrub and summarizes the results of the scrub upon
           completion.

           Scrubbing and resilvering are very similar operations. The difference is  that  resilvering  only  examines
           data that ZFS knows to be out of date (for example, when attaching a new device to a mirror or replacing an
           existing device), whereas scrubbing examines all data to discover silent errors due to hardware  faults  or
           disk failure.

           Because  scrubbing  and resilvering are I/O-intensive operations, ZFS only allows one at a time. If a scrub
           is already in progress, the "zpool scrub" command terminates it and starts a new scrub. If a resilver is in
           progress, ZFS does not allow a scrub to be started until the resilver completes.

           -s    Stop scrubbing.

       zpool status [-xvD] [-T d | u] [pool] ... [interval [count]]

           Displays  the  detailed health status for the given pools. If no pool is specified, then the status of each
           pool in the system is displayed. For more information on pool and device health, see  the  "Device  Failure
           and Recovery" section.

           If  a  scrub or resilver is in progress, this command reports the percentage done and the estimated time to
           completion. Both of these are only approximate, because the amount of data in the pool and the other  work-
           loads on the system can change.

           -x          Only display status for pools that are exhibiting errors or are otherwise unavailable. Warnings
                       about pools not using the latest on-disk format will not be included.

           -v          Displays verbose data error information, printing out a complete list of all data errors  since
                       the last complete pool scrub.

           -D          Display  a  histogram of deduplication statistics, showing the allocated (physically present on
                       disk) and referenced (logically referenced in the pool) block counts  and  sizes  by  reference
                       count.

           -T d | u    Display a time stamp.

                       Specify  u  for  a  printed representation of the internal representation of time. See time(2).
                       Specify d for standard date format. See date(1).
目录
相关文章
|
1月前
|
SQL
CHECK
【11月更文挑战第15天】
41 5
|
6月前
|
Java
【ERROR】‘<>‘ operator is not allowed for source level below 1.7
【ERROR】‘<>‘ operator is not allowed for source level below 1.7
79 0
|
7月前
|
Oracle 关系型数据库
The opatch minimum version check for patch failed
The opatch minimum version check for patch failed
68 2
|
前端开发 5G
Search space set group switching(一)
根据R17 38.300的描述,UE可以通过PDCCH monitoring adaptation机制实现power saving的目的,这其中就包括PDCCH monitoring skipping和search space set group (SSSG) switching两种机制。PDCCH monitoring skipping是R17才提出的机制,就是UE 可以在PDCCH skipping的时间内不监视 PDCCH的功能;search space set group (SSSG) switching R16提出,R17进行了部分增强。
Optional int parameter ‘id‘ is present but cannot be translated into a null value due to being ……
Optional int parameter ‘id‘ is present but cannot be translated into a null value due to being ……
324 0
‘Client‘ is not allowed to run in parallel.Would you like to stop the running one?
‘Client‘ is not allowed to run in parallel.Would you like to stop the running one?
589 0
‘Client‘ is not allowed to run in parallel.Would you like to stop the running one?
SAP WM Storage Type Capacity Check Method 5 (Usage check based on SUT)
SAP WM Storage Type Capacity Check Method 5 (Usage check based on SUT)
SAP WM Storage Type Capacity Check Method 5 (Usage check based on SUT)
|
网络协议 关系型数据库 PostgreSQL
|
关系型数据库 网络虚拟化
|
.NET 开发框架 数据建模