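hpacucli is HP's official command-line tool for managing the RAID controller in HP servers. It reports whether the RAID is healthy (a faulty RAID controller slows down reads and writes), whether the hard disks are healthy (a failed disk can, in serious cases, mean lost data), and the state of the controller's cache and its backup battery.
hpasmcli is HP's official command-line tool for server health. It shows detailed information about the server's CPUs, memory, fans, and power supplies, including their temperatures and any faults.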
Getting hardware status information:
1) Install hpacucli (download: HP hpacucli tool)
[root@localhost ~]# rpm -ivh hpacucli-9.40-12.0.x86_64.rpm
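2) Check whether the server's RAID and hard disks are healthy.
[root@localhost ~]# hpacucli ctrl all show config           # summary of arrays, logical drives, and physical drives
[root@localhost ~]# hpacucli ctrl all show config detail    # detailed RAID and disk information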
[root@localhost ~]# hpacucli ctrl all show config

Smart Array E200i in Slot 0 (Embedded) (sn: PR97MP2834 )

   array A (SAS, Unused Space: 0 MB)

      logicaldrive 1 (558.7 GB, RAID 5, OK)

      physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SAS, 300 GB, OK)
      physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SAS, 300 GB, OK)
      physicaldrive 1I:1:4 (port 1I:box 1:bay 4, SAS, 300 GB, OK)

[root@localhost ~]# hpacucli ctrl all show config detail

Smart Array E200i in Slot 0 (Embedded)
   Bus Interface: PCI
   Slot: 0
   Serial Number: PR97MP2834
   Cache Serial Number: P75B20C9SRO5BQ
   RAID 6 (ADG) Status: Disabled
   Controller Status: OK
   Hardware Revision: A
   Firmware Version: 1.82
   Rebuild Priority: Medium
   Expand Priority: Medium
   Surface Scan Delay: 15 secs
   Surface Scan Mode: Idle
   Post Prompt Timeout: 0 secs
   Cache Board Present: True
   Cache Status: Temporarily Disabled
   Cache Status Details: Cache disabled; low batteries.
   Cache Ratio: 50% Read / 50% Write
   Drive Write Cache: Disabled
   Total Cache Size: 128 MB
   Total Cache Memory Available: 96 MB
   No-Battery Write Cache: Disabled
   Cache Backup Power Source: Batteries
   Battery/Capacitor Count: 1
   Battery/Capacitor Status: Failed (Replace Batteries)
   SATA NCQ Supported: False

   Array: A
      Interface Type: SAS
      Unused Space: 0 MB
      Status: OK
      Array Type: Data

      Logical Drive: 1
         Size: 558.7 GB
         Fault Tolerance: 5
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 65535
         Strip Size: 64 KB
         Full Stripe Size: 128 KB
         Status: OK
         Caching: Enabled
         Parity Initialization Status: Initialization Completed
         Unique Identifier: 600508B1001038333420202020200004
         Disk Name: /dev/cciss/c0d0
         Mount Points: /boot 200 MB, / 542.5 GB
         OS Status: LOCKED
         Logical Drive Label: A0100C5FPR97MP2834 2E7A
         Drive Type: Data

      physicaldrive 1I:1:2
         Port: 1I
         Box: 1
         Bay: 2
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 300 GB
         Rotational Speed: 10000
         Firmware Revision: HPDE
         Serial Number: 6SE24FXD0000B124L36W
         Model: HP EG0300FAWHV
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown

      physicaldrive 1I:1:3
         Port: 1I
         Box: 1
         Bay: 3
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 300 GB
         Rotational Speed: 10000
         Firmware Revision: HPDE
         Serial Number: 6SE27HWB0000B124KXPQ
         Model: HP EG0300FAWHV
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown

      physicaldrive 1I:1:4
         Port: 1I
         Box: 1
         Bay: 4
         Status: OK
         Drive Type: Data Drive
         Interface Type: SAS
         Size: 300 GB
         Rotational Speed: 10000
         Firmware Revision: HPDD
         Serial Number: 6SE0KNEM0000B1012K5S
         Model: HP EG0300FAWHV
         PHY Count: 2
         PHY Transfer Rate: 3.0Gbps, Unknown
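For routine monitoring, the summary output can be filtered so that only unhealthy drives are printed. The following is a minimal bash sketch, not an HP-provided tool: raid_problems is a hypothetical helper name, and it assumes that every logicaldrive and physicaldrive line ends with its status as the last comma-separated field inside the parentheses, as in the sample output of "hpacucli ctrl all show config".

```bash
#!/usr/bin/env bash
# raid_problems (hypothetical helper): reads the text of
# `hpacucli ctrl all show config` on standard input and prints every
# logicaldrive/physicaldrive line whose final status field is not "OK".
raid_problems() {
    awk '
        /^[ \t]*(logicaldrive|physicaldrive) / {
            line = $0
            sub(/^[ \t]+/, "", line)          # strip leading indentation
            status = line
            sub(/\)[ \t]*$/, "", status)      # drop the closing parenthesis
            sub(/^.*,[ \t]*/, "", status)     # keep only the text after the last comma
            if (status != "OK") print line
        }
    '
}

# Usage (on a server with hpacucli installed):
#   hpacucli ctrl all show config | raid_problems
```

An empty result means every listed drive reported OK. The function only reads standard input, so it can also be run against a saved copy of the output.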
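Getting hardware temperature information:
1) Install hpasmcli (download: HP hpasmcli management tool; the command is provided by the hp-health package)
[root@localhost ~]# rpm -ivh hp-health-9.40-1602.44.rhel6.x86_64.rpm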
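2) Check temperature and health information for the CPUs, memory, power supplies, and other components.
[root@localhost ~]# hpasmcli -s 'show'
A bare 'show' is rejected with "Invalid Arguments", but the reply lists every available subcommand, much like help output. When monitoring, pay particular attention to DIMM (memory), FANS, POWERSUPPLY (power supply modules), SERVER (system, including the CPUs), and TEMP (temperature).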
[root@localhost ~]# hpasmcli -s 'show'
Invalid Arguments
   SHOW ASR
   SHOW BOOT
   SHOW DIMM [ SPD ]
   SHOW F1
   SHOW FANS
   SHOW HT
   SHOW IML
   SHOW IPL
   SHOW NAME
   SHOW PORTMAP
   SHOW POWERMETER
   SHOW POWERSUPPLY
   SHOW PXE
   SHOW SERIAL [ BIOS | EMBEDDED | VIRTUAL ]
   SHOW SERVER
   SHOW TEMP
   SHOW TPM
   SHOW UID
   SHOW WOL
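[root@localhost ~]# hpasmcli -s 'show TEMP'
This shows the temperature of each component. Temp is the current reading and Threshold is the critical limit; any sensor whose reading exceeds its threshold needs attention.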
[root@localhost ~]# hpasmcli -s 'show TEMP'
Sensor   Location           Temp       Threshold
------   --------           ----       ---------
#1       I/O_ZONE           46C/114F   65C/149F
#2       AMBIENT            21C/69F    40C/104F
#3       CPU#1              30C/86F    95C/203F
#4       CPU#1              30C/86F    95C/203F
#5       POWER_SUPPLY_BAY   30C/86F    60C/140F
#6       CPU#2              30C/86F    95C/203F
#7       CPU#2              30C/86F    95C/203F
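The comparison of Temp against Threshold can be automated. The sketch below is one possible approach in bash, under stated assumptions: hot_sensors is a hypothetical helper name, the optional margin argument is an addition for illustration, and the parser assumes the four-column layout shown in the sample 'show TEMP' output (sensor number, a location without spaces, then Celsius/Fahrenheit pairs). Rows whose Temp or Threshold column does not start with a number are skipped.

```bash
#!/usr/bin/env bash
# hot_sensors (hypothetical helper): reads `hpasmcli -s 'show TEMP'` output on
# standard input and prints each sensor whose Celsius reading is at or above
# (threshold - margin). The margin defaults to 0 degrees Celsius.
hot_sensors() {
    local margin="${1:-0}"
    awk -v margin="$margin" '
        $1 ~ /^#[0-9]+$/ && $3 ~ /^[0-9]+C/ && $4 ~ /^[0-9]+C/ {
            temp  = $3 + 0    # numeric conversion keeps the leading digits, e.g. "46C/114F" becomes 46
            limit = $4 + 0
            if (temp >= limit - margin)
                print $2, temp "C", "threshold " limit "C"
        }
    '
}

# Usage (on a server with hp-health installed):
#   hpasmcli -s 'show TEMP' | hot_sensors        # at or over the threshold
#   hpasmcli -s 'show TEMP' | hot_sensors 10     # within 10 C of the threshold
```

A non-zero margin gives an early warning before a sensor actually reaches its limit; with the default of 0, only sensors at or over the threshold are listed.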
[root@localhost ~]# hpasmcli -s 'show dimm'           # memory information
[root@localhost ~]# hpasmcli -s 'show TEMP'           # hardware temperatures
[root@localhost ~]# hpasmcli -s 'show fans'           # fan information
[root@localhost ~]# hpasmcli -s 'show powersupply'    # power supply modules
[root@localhost ~]# hpasmcli -s 'show server'         # machine model, serial number, CPUs, memory size
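To collect all five reports in one pass, the commands can be wrapped in a loop. This is only a sketch: hp_health_report is a hypothetical function name, and the optional first argument (the command to run, defaulting to hpasmcli) is an assumption added so the loop can be tried on a machine without hp-health installed.

```bash
#!/usr/bin/env bash
# hp_health_report (hypothetical helper): runs the five `show` subcommands
# used in this article, printing a header line before each one.
hp_health_report() {
    local cli="${1:-hpasmcli}"   # command to invoke; defaults to the real tool
    local section
    for section in dimm TEMP fans powersupply server; do
        printf '===== show %s =====\n' "$section"
        # Keep going if one subcommand fails, but say so on stderr.
        "$cli" -s "show $section" || printf 'show %s failed\n' "$section" >&2
    done
}

# Usage (on a server with hp-health installed):
#   hp_health_report > /tmp/hp_health.txt
```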
This article is reposted from justin_peng's 51CTO blog; original link: http://blog.51cto.com/ityunwei2017/1895018. To republish it, please contact the original author.