Oracle 11g RAC Node Addition Failures: CRS Resource Fails to Start

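System environment:

Operating system: RedHat EL5.5

Cluster software: GI 11g

Database software: Oracle 11.2.0.1

Cause of the failure:

   The new node (node3) was cloned from an existing node (node2). When the new node was added to the cluster, its hostname had not been changed, which led to the following failures: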

1. The CRS services on the new node start normally

[root@node3 ~]# crsctl check crs

CRS-4638: Oracle High Availability Services is online

CRS-4537: Cluster Ready Services is online

CRS-4529: Cluster Synchronization Services is online

CRS-4533: Event Manager is online

2. The listener resource fails to start

[root@node3 ~]# crs_stat -t

Name           Type           Target    State     Host        

------------------------------------------------------------

ora.DG1.dg     ora....up.type ONLINE    ONLINE    node1      

ora....ER.lsnr ora....er.type ONLINE    ONLINE    node1      

ora....N1.lsnr ora....er.type ONLINE    ONLINE    node1      

ora....VOTE.dg ora....up.type ONLINE    ONLINE    node1      

ora.RCY.dg     ora....up.type ONLINE    ONLINE    node1      

ora.asm        ora.asm.type   ONLINE    ONLINE    node1      

ora.eons       ora.eons.type  ONLINE    ONLINE    node1      

ora.gsd        ora.gsd.type   ONLINE    ONLINE    node1      

ora....network ora....rk.type ONLINE    ONLINE    node1      

ora....SM1.asm application    ONLINE    ONLINE    node1      

ora....E1.lsnr application    ONLINE    ONLINE    node1      

ora.node1.gsd  application    ONLINE    ONLINE    node1      

ora.node1.ons  application    ONLINE    ONLINE    node1      

ora.node1.vip  ora....t1.type ONLINE    ONLINE    node1      

ora....SM2.asm application    ONLINE    ONLINE    node2      

ora....E2.lsnr application    ONLINE    ONLINE    node2      

ora.node2.gsd  application    ONLINE    ONLINE    node2      

ora.node2.ons  application    ONLINE    ONLINE    node2      

ora.node2.vip  ora....t1.type ONLINE    ONLINE    node2      

ora....SM3.asm application    ONLINE    ONLINE    node3      

ora....E3.lsnr application    OFFLINE   OFFLINE        

ora.node3.gsd  application    ONLINE    ONLINE    node3      

ora.node3.ons  application    ONLINE    ONLINE    node3      

ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    node2      

ora.ons        ora.ons.type   ONLINE    ONLINE    node1      

ora.pmydb.db   ora....se.type OFFLINE   OFFLINE              

ora....taf.svc ora....ce.type OFFLINE   OFFLINE              

ora.prod.db    ora....se.type OFFLINE   OFFLINE              

ora....ry.acfs ora....fs.type ONLINE    ONLINE    node1      

ora.scan1.vip  ora....ip.type ONLINE    ONLINE    node1  

[root@node3 ~]# crs_stat |grep lsn

NAME=ora.LISTENER.lsnr

NAME=ora.LISTENER_SCAN1.lsnr

NAME=ora.node1.LISTENER_NODE1.lsnr

NAME=ora.node2.LISTENER_NODE2.lsnr

NAME=ora.node3.LISTENER_NODE3.lsnr

The resource NAME=ora.node3.LISTENER_NODE3.lsnr failed to start.
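The OFFLINE rows can be picked out of the `crs_stat -t` table mechanically rather than by eye. The following is a minimal sketch, assuming the five-column layout shown above (Name, Type, Target, State, Host); `list_offline` is a hypothetical helper name, not part of Grid Infrastructure.

```shell
#!/bin/sh
# list_offline: read `crs_stat -t` output on stdin and print the names
# (possibly truncated, exactly as crs_stat prints them) of every resource
# whose State column is OFFLINE.
# Assumption: columns are Name, Type, Target, State, Host, separated by spaces.
list_offline() {
    # The header row has "State" in column 4 and the dashed rule has a single
    # field, so neither matches the OFFLINE test.
    awk '$4 == "OFFLINE" { print $1 }'
}
```

On a cluster node this could be used as `crs_stat -t | list_offline`. Because `crs_stat -t` truncates long names, the full name still has to be looked up with plain `crs_stat`, as done above.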

3. Fixing the listener resource failure:

Manual start:

[root@node3 ~]# crs_start  ora.node3.LISTENER_NODE3.lsnr -f

CRS-2527: Unable to start 'ora.LISTENER.lsnr' because it has a 'hard' dependency on 'ora.cluster_vip_net1.type'

CRS-2525: All instances of the resource 'ora.node1.vip' are already running; relocate is not allowed because the force option was not specified

CRS-2525: All instances of the resource 'ora.node2.vip' are already running; relocate is not allowed because the force option was not specified

CRS-0222: Resource 'ora.node3.LISTENER_NODE3.lsnr' has dependency error.

The output above shows that the listener could not start because the VIP it depends on (the node3 VIP service) was missing.

[root@node3 ~]# crs_stat |grep vip

NAME=ora.node1.vip

TYPE=ora.cluster_vip_net1.type

NAME=ora.node2.vip

TYPE=ora.cluster_vip_net1.type

NAME=ora.scan1.vip

TYPE=ora.scan_vip.type

The node VIP service for node3 is missing!
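The comparison made above (every node listener should have a matching node VIP) can be sketched as a small filter over `crs_stat` output. This is only an illustration: `nodes_missing_vip` is a hypothetical helper, and it assumes resource names follow the patterns seen in this cluster, `ora.<node>.LISTENER_<NODE>.lsnr` and `ora.<node>.vip`.

```shell
#!/bin/sh
# nodes_missing_vip: read `crs_stat` output on stdin and print each node that
# has a node-level listener resource but no matching VIP resource.
# Assumed naming (taken from the output above, not guaranteed elsewhere):
#   NAME=ora.<node>.LISTENER_<NODE>.lsnr   node listener
#   NAME=ora.<node>.vip                    node VIP
nodes_missing_vip() {
    awk -F. '
        # Four dot-separated parts ending in "lsnr": a per-node listener.
        /^NAME=ora\./ && NF == 4 && $4 == "lsnr" { lsnr[$2] = 1 }
        # Three parts ending in "vip": a VIP (node VIPs and the SCAN VIP).
        /^NAME=ora\./ && NF == 3 && $3 == "vip"  { vip[$2] = 1 }
        END { for (n in lsnr) if (!(n in vip)) print n }
    ' | sort
}
```

Run as `crs_stat | nodes_missing_vip`. For the state shown above it would report node3, and it should print nothing once the VIP has been added.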

Add the VIP service:

[root@node3 ~]# srvctl add -h

The SRVCTL add command adds the configuration and the Oracle Clusterware application to the OCR for the cluster database, named instances, named services, or for the named nodes.

Usage: srvctl add vip -n <node_name> -k <network_number> -A <name|ip>/<netmask>/[if1[|if2...]] [-v]

[root@node1 ~]# cat /etc/hosts

# Do not remove the following line, or various programs

# that require network functionality will fail.

127.0.0.1      localhost

192.168.8.41   node1

192.168.8.43   node1-vip

10.1.1.1       node1-priv

192.168.8.42   node2

192.168.8.44   node2-vip

10.1.1.2       node2-priv

192.168.8.45   scan_ip

192.168.8.46   node3

192.168.8.47   node3-vip

10.1.1.3       node3-priv

[root@node3 ~]#  srvctl add vip -n node3 -A 192.168.8.47/255.255.255.0/eth0 -k 0

PRCN-2049 : The network attributes specified (network number: 0, subnet: 192.168.8.0, adapters: eth0) conflict with an already registered network (network number: 1, subnet: 192.168.8.0, adapters: eth0)

[root@node3 ~]#  srvctl add vip -n node3 -A 192.168.8.47/255.255.255.0/eth0 -k 1

[root@node3 ~]#

The node3 VIP service was added successfully!

[root@node3 ~]# crs_stat |grep vip

NAME=ora.node1.vip

TYPE=ora.cluster_vip_net1.type

NAME=ora.node2.vip

TYPE=ora.cluster_vip_net1.type

NAME=ora.node3.vip

TYPE=ora.cluster_vip_net1.type

NAME=ora.scan1.vip

TYPE=ora.scan_vip.type

[root@node3 ~]#
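The first `srvctl add vip` attempt failed with PRCN-2049 because `-k 0` did not match the network already registered as number 1. In this cluster the existing VIPs have the type `ora.cluster_vip_net1.type`, and the `net1` suffix appears to correspond to that network number. Under that assumption, the value for `-k` can be read from the existing VIP types before running the command. `vip_network_numbers` is a hypothetical helper name.

```shell
#!/bin/sh
# vip_network_numbers: read `crs_stat` output on stdin and print the distinct
# numbers N found in lines of the form TYPE=ora.cluster_vip_net<N>.type.
# Assumption (consistent with the PRCN-2049 message above, but not verified
# against Oracle documentation): N is the network number expected by
# `srvctl add vip -k`.
vip_network_numbers() {
    sed -n 's/^TYPE=ora\.cluster_vip_net\([0-9][0-9]*\)\.type$/\1/p' | sort -u
}
```

For example, `crs_stat | vip_network_numbers` would print 1 for the output shown here, matching the `-k 1` that succeeded.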

Start the ora.node3.LISTENER_NODE3.lsnr service:

[root@node3 ~]# crs_stat |grep lsn

NAME=ora.LISTENER.lsnr

NAME=ora.LISTENER_SCAN1.lsnr

NAME=ora.node1.LISTENER_NODE1.lsnr

NAME=ora.node2.LISTENER_NODE2.lsnr

NAME=ora.node3.LISTENER_NODE3.lsnr


[root@node3 ~]# crs_start -f ora.node3.LISTENER_NODE3.lsnr

Attempting to start `ora.node3.vip` on member `node3`

Start of `ora.node3.vip` on member `node3` succeeded.

Attempting to start `ora.LISTENER.lsnr` on member `node3`

Start of `ora.LISTENER.lsnr` on member `node3` succeeded.

[root@node3 ~]#

node3 Grid log:

[root@node3 node3]# pwd

/u01/11.2.0/grid/log/node3

[root@node3 node3]# tail -f alertnode3.log

[gpnpd(7901)]CRS-2332:Error pushing GPnP profile to "mdns:service:gpnp._tcp.local.://node1:12802/agent=gpnpd,cname=node-cluster,host=node1,pid=3431/gpnpd h:node1 c:node-cluster".

2014-04-17 14:47:42.399

[ctssd(8044)]CRS-2408:The clock on host node3 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.

2014-04-17 14:48:14.444

[ctssd(8044)]CRS-2408:The clock on host node3 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.

2014-04-17 14:48:18.952

[cssd(7954)]CRS-1601:CSSD Reconfiguration complete. Active nodes are node1 node2 node3 .

[ctssd(8044)]CRS-2408:The clock on host node3 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.

2014-04-17 14:49:34.501

[ctssd(8044)]CRS-2408:The clock on host node3 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.

2014-04-17 14:49:58.515

[ctssd(8044)]CRS-2408:The clock on host node3 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.

2014-04-17 14:50:22.552

[ctssd(8044)]CRS-2408:The clock on host node3 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.

2014-04-17 14:50:46.595

[ctssd(8044)]CRS-2408:The clock on host node3 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.
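Most of this log tail is repeated CRS-2408 clock-synchronization messages, which makes entries such as CRS-2332 and CRS-1601 easy to miss. A minimal sketch of a filter follows, assuming the layout seen above (a timestamp line followed by one or more message lines); `filter_alert_log` is a hypothetical helper name.

```shell
#!/bin/sh
# filter_alert_log: read a Grid alert log on stdin, drop CRS-2408 messages and
# blank lines, and print each remaining message prefixed with the most recent
# timestamp line seen before it ("-" if none has been seen yet).
# Assumption: timestamp lines start with YYYY-MM-DD followed by a space.
filter_alert_log() {
    awk '
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] / { ts = $0; next }
        NF == 0    { next }   # skip blank lines
        /CRS-2408/ { next }   # skip clock-sync noise
        {
            stamp = (ts == "") ? "-" : ts
            print stamp "  " $0
        }
    '
}
```

It could be applied to a saved log with `filter_alert_log < alertnode3.log`.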


4. Adding the Oracle node

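The CRS resources had now started successfully, but adding node3 as an Oracle node from node1 reported the following error:

[Screenshot of the error: wKioL1NPlOKx6eb6AAKzT7Swcho980.jpg]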

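    Restarting the CRS service on the node did not resolve it. In the end, adding node3 as a new Oracle node from node2 succeeded. Why node1 could not add it remains unexplained. One possibility is that the CRS node addition had been run from node1 and hit an error (while running root.sh on node3), so that the node (presumably node1) could no longer recognize node3 correctly as a new node.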


node3 recorded the following log:

[root@node3 node3]# tail -f alertnode3.log

[gpnpd(7901)]CRS-2332:Error pushing GPnP profile to "mdns:service:gpnp._tcp.local.://node1:12802/agent=gpnpd,cname=node-cluster,host=node1,pid=3431/gpnpd h:node1 c:node-cluster".

2014-04-17 14:47:42.399


This article is reposted from 客居天涯's 51CTO blog. Original link: http://blog.51cto.com/tiany/1397335. To republish it, please contact the original author.
