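System environment:
Operating system: AIX 5300-09
Clusterware: CRS 10.2.0.1
Database: Oracle 10.2.0.1
System architecture diagram
Symptom:
After a system reboot, CRS either fails to start on a node, or the CRS services start successfully but the CRS resources cannot come ONLINE.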
[root@aix213 racg] cat /etc/hosts
127.0.0.1       loopback localhost      # loopback (lo0) name/address
192.168.8.214   aix214
192.168.8.106   aix106
192.168.8.213   aix213
192.168.8.115   aix213-vip
10.10.10.213    aix213-priv
192.168.8.113   aix214-vip
10.10.10.214    aix214-priv
Each node also has the other node's VIP bound to it: both VIP addresses are configured on every node!
[oracle@aix214 ~]$ifconfig -a
en0: flags=5e080863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),PSEG,LARGESEND,CHAIN>
        inet 192.168.8.214 netmask 0xffffff00 broadcast 192.168.8.255
        inet 192.168.8.113 netmask 0xffffff00 broadcast 192.168.8.255
        inet 192.168.8.115 netmask 0xffffff00 broadcast 192.168.8.255
         tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
[oracle@aix213 ~]$ifconfig -a
en0: flags=5e080863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),PSEG,LARGESEND,CHAIN>
        inet 192.168.8.213 netmask 0xffffff00 broadcast 192.168.8.255
        inet 192.168.8.113 netmask 0xffffff00 broadcast 192.168.8.255
        inet 192.168.8.115 netmask 0xffffff00 broadcast 192.168.8.255
         tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
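The comparison between /etc/hosts and the ifconfig output can be scripted. The following is only a minimal sketch: the function name stray_vips is hypothetical, and it assumes that VIP host names follow the "<node>-vip" convention seen in the hosts file above and that the ifconfig output has been saved to a file.

```shell
#!/bin/sh
# Sketch: list VIP addresses that are bound locally but belong to another node.
# Assumption (not from the original post): VIP host names are "<node>-vip".

# stray_vips NODE HOSTS_FILE IFCONFIG_FILE
#   NODE           local node name, e.g. aix213
#   HOSTS_FILE     copy of /etc/hosts
#   IFCONFIG_FILE  saved output of "ifconfig -a"
stray_vips() {
    node=$1 hosts=$2 ifcfg=$3
    awk -v node="$node" '
        # First file: remember every VIP that is not owned by this node.
        NR == FNR {
            if ($2 ~ /-vip$/ && $2 != node "-vip") foreign[$1] = $2
            next
        }
        # Second file: report foreign VIPs that show up on an "inet" line.
        $1 == "inet" && ($2 in foreign) { print $2, foreign[$2] }
    ' "$hosts" "$ifcfg"
}
```

For example, after running "ifconfig -a > /tmp/ifconfig.out" on aix213, calling "stray_vips aix213 /etc/hosts /tmp/ifconfig.out" should print any address that belongs to another node's VIP.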
[root@aix214 /]$crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
[root@aix214 /]$crs_stat -t
Name           Type        Target    State     Host
------------------------------------------------------------
ora....13.lsnr application ONLINE    OFFLINE
ora.aix213.gsd application ONLINE    OFFLINE
ora.aix213.ons application ONLINE    OFFLINE
ora.aix213.vip application ONLINE    OFFLINE
ora....14.lsnr application ONLINE    OFFLINE
ora.aix214.gsd application ONLINE    OFFLINE
ora.aix214.ons application ONLINE    OFFLINE
ora.aix214.vip application ONLINE    OFFLINE
ora.prod.db    application ONLINE    OFFLINE
ora....d1.inst application ONLINE    OFFLINE
ora....d2.inst application ONLINE    OFFLINE
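When the resource list is long, it can help to filter the listing down to the resources that should be running but are not. A minimal sketch, assuming the five-column "crs_stat -t" layout shown above (the function name offline_resources is hypothetical):

```shell
#!/bin/sh
# Sketch: read "crs_stat -t" output on stdin and print the names of resources
# whose Target is ONLINE but whose State is anything other than ONLINE.
offline_resources() {
    awk '
        $1 == "Name" || $1 ~ /^-+$/ { next }   # skip header and separator rule
        NF >= 4 && $3 == "ONLINE" && $4 != "ONLINE" { print $1 }
    '
}
```

Typical use would be "crs_stat -t | offline_resources". Resources whose Target is OFFLINE are deliberately ignored, since CRS is not expected to start them.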
Check the logs:
[root@aix213 racg]cd /u01/crs_1/log/aix213/racg
[root@aix213 racg]$more ora.aix213.vip.log
Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.
2014-05-09 17:07:05.624: [ RACG][1] [385112][1][ora.aix213.vip]: Invalid parameters, or failed to bring up VIP (host=aix213)
2014-05-09 17:07:05.624: [ RACG][1] [385112][1][ora.aix213.vip]: clsrcexecut: env ORACLE_CONFIG_HOME=/u01/crs_1
2014-05-09 17:07:05.625: [ RACG][1] [385112][1][ora.aix213.vip]: clsrcexecut: cmd = /u01/crs_1/bin/racgeut -e _USR_ORA_DEBUG=0 54 /u01/crs_1/bin/racgvip start aix213
2014-05-09 17:07:05.625: [ RACG][1] [385112][1][ora.aix213.vip]: clsrcexecut: rc = 1, time = 0.345s
2014-05-09 17:07:06.832: [ RACG][1] [385112][1][ora.aix213.vip]: Invalid parameters, or failed to bring up VIP (host=aix213)
......
Preliminary diagnosis: the VIP configuration on the nodes is wrong.
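To pull the failing action out of a longer racg log, the "clsrcexecut: cmd" and "clsrcexecut: rc" entries can be paired up. This is a rough sketch only: the function name failed_actions is hypothetical, and it assumes each log entry sits on a single line in the file, with the "cmd" entry written before its "rc" entry as in the excerpt above.

```shell
#!/bin/sh
# Sketch: print every action in a racg log that returned a non-zero code.
# failed_actions LOGFILE
failed_actions() {
    awk '
        # Remember the most recent command that was executed.
        /clsrcexecut: cmd = / {
            cmd = $0
            sub(/.*clsrcexecut: cmd = /, "", cmd)
        }
        # When its return code arrives, report it if it is not zero.
        /clsrcexecut: rc = / {
            rc = $0
            sub(/.*clsrcexecut: rc = /, "", rc)
            sub(/,.*/, "", rc)
            if (rc + 0 != 0) print "rc=" rc ": " cmd
        }
    ' "$1"
}
```

Run against ora.aix213.vip.log, this should point at the racgvip start call, which is consistent with a VIP configuration problem.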
Solution 1:
1. Stop nodeapps on all nodes, then redefine each node's VIP
[oracle@aix213 ~]$srvctl stop nodeapps -n aix213
[oracle@aix213 ~]$srvctl stop nodeapps -n aix214
[oracle@aix213 ~]$srvctl modify nodeapps -A 192.168.8.115/255.255.255.0/en0 -n aix213 -o $ORACLE_HOME
[oracle@aix213 ~]$srvctl modify nodeapps -A 192.168.8.113/255.255.255.0/en0 -n aix214 -o $ORACLE_HOME
2. Stop CRS on all nodes
[oracle@aix213 ~]$crsctl stop crs
[oracle@aix214 ~]$crsctl stop crs
3. Restart CRS on all nodes
[oracle@aix213 ~]$crsctl start crs
[oracle@aix214 ~]$crsctl start crs
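The srvctl part of Solution 1 can be generated from a list of node/VIP pairs so that nothing is mistyped. The sketch below only prints the commands for review and does not execute anything. The function name plan_vip_fix and the "node:vip" argument format are hypothetical; the srvctl options are the ones used in the steps above.

```shell
#!/bin/sh
# Sketch: print (not run) the srvctl commands that stop nodeapps and
# redefine the VIP for each node.
# plan_vip_fix NETMASK INTERFACE ORACLE_HOME NODE:VIP [NODE:VIP ...]
plan_vip_fix() {
    netmask=$1 iface=$2 home=$3
    shift 3
    # Stop nodeapps everywhere before touching any VIP definition.
    for pair in "$@"; do
        echo "srvctl stop nodeapps -n ${pair%%:*}"
    done
    # Then redefine each VIP as address/netmask/interface.
    for pair in "$@"; do
        echo "srvctl modify nodeapps -n ${pair%%:*} -A ${pair#*:}/$netmask/$iface -o $home"
    done
}
```

For this cluster the call would be: plan_vip_fix 255.255.255.0 en0 "$ORACLE_HOME" aix213:192.168.8.115 aix214:192.168.8.113. Stopping and starting CRS itself still has to be done on each node separately.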
Solution 2:
1. Update the VIP information in CRS (first confirm the VIP entries in /etc/hosts)
[root@aix213 racg] cat /etc/hosts
127.0.0.1       loopback localhost      # loopback (lo0) name/address
192.168.8.214   aix214
192.168.8.106   aix106
192.168.8.213   aix213
192.168.8.115   aix213-vip
10.10.10.213    aix213-priv
192.168.8.113   aix214-vip
10.10.10.214    aix214-priv
2. Modify the VIP
[root@aix214 /]$srvctl modify nodeapps -n aix213 -o /u01/app/oracle/product/10.2.0/db_1/ -A 192.168.8.115/255.255.255.0/en0
[root@aix214 /]$srvctl modify nodeapps -n aix214 -o /u01/app/oracle/product/10.2.0/db_1/ -A 192.168.8.113/255.255.255.0/en0
3. Run vipca as root
4. Restart the CRS service
[root@aix214 /]$crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
[root@aix214 /]$crs_stat -t
Name           Type        Target    State     Host
------------------------------------------------------------
ora....13.lsnr application OFFLINE   OFFLINE
ora.aix213.gsd application ONLINE    ONLINE    aix213
ora.aix213.ons application ONLINE    ONLINE    aix213
ora.aix213.vip application ONLINE    ONLINE    aix213
ora....14.lsnr application ONLINE    OFFLINE
ora.aix214.gsd application ONLINE    ONLINE    aix214
ora.aix214.ons application ONLINE    ONLINE    aix214
ora.aix214.vip application ONLINE    ONLINE    aix214
ora.prod.db    application ONLINE    OFFLINE
ora....d1.inst application OFFLINE   OFFLINE
ora....d2.inst application ONLINE    OFFLINE
Start the listener services manually:
[root@aix214 /]$crs_stat |grep lsn
NAME=ora.aix213.LISTENER_AIX213.lsnr
NAME=ora.aix214.LISTENER_AIX214.lsnr
[root@aix214 /]$crs_start -f ora.aix214.LISTENER_AIX214.lsnr
Attempting to start `ora.aix214.LISTENER_AIX214.lsnr` on member `aix214`
Start of `ora.aix214.LISTENER_AIX214.lsnr` on member `aix214` succeeded.
[root@aix214 /]$crs_start -f ora.aix213.LISTENER_AIX213.lsnr
Attempting to start `ora.aix213.LISTENER_AIX213.lsnr` on member `aix213`
Start of `ora.aix213.LISTENER_AIX213.lsnr` on member `aix213` succeeded.
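With more than a couple of listeners, the crs_start commands can be derived from the crs_stat listing instead of being typed by hand. A minimal sketch, assuming the "NAME=..." lines shown above (the function name listener_start_cmds is hypothetical); it only prints the commands:

```shell
#!/bin/sh
# Sketch: read "crs_stat" output on stdin and print one "crs_start -f"
# command for every listener resource (names ending in ".lsnr").
listener_start_cmds() {
    awk -F= '$1 == "NAME" && $2 ~ /\.lsnr$/ { print "crs_start -f " $2 }'
}
```

Running "crs_stat | listener_start_cmds" shows the commands for review; piping that output on to "sh" would execute them.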
CRS has now started successfully:
[oracle@aix213 ~]$crs_stat -t
Name           Type        Target    State     Host
------------------------------------------------------------
ora....13.lsnr application ONLINE    ONLINE    aix213
ora.aix213.gsd application ONLINE    ONLINE    aix213
ora.aix213.ons application ONLINE    ONLINE    aix213
ora.aix213.vip application ONLINE    ONLINE    aix213
ora....14.lsnr application ONLINE    ONLINE    aix214
ora.aix214.gsd application ONLINE    ONLINE    aix214
ora.aix214.ons application ONLINE    ONLINE    aix214
ora.aix214.vip application ONLINE    ONLINE    aix214
ora.prod.db    application ONLINE    ONLINE    aix213
ora....d1.inst application ONLINE    ONLINE    aix213
ora....d2.inst application ONLINE    ONLINE    aix214
With that, the problem is essentially resolved.