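Linux for Beginners, Note 086: HA multi-node concepts and the software that implements them
Overview
Review of the previous note
Basic concepts and modules of a multi-node cluster
Software that provides the Messaging Layer
Software that provides the CRM
STONITH devices
Review of the previous note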
Messaging Layer
CRM (Cluster Resource Manager)
RA (Resource Agent)
ha-aware application
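Software that can collect the cluster information provided by the Messaging Layer on its own and make decisions based on it
Resource stickiness: how strongly a resource tends to stay on the node where it is running, defined by a score
Resource constraints
location: how strongly a resource prefers a given node
colocation: whether resources depend on each other and must run on the same node
order: the order in which resources are started or stopped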
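As a minimal sketch of how the stickiness and location scores above might interact, the following Python adds each node's location score to the stickiness bonus of the node the resource currently runs on, then picks the highest total. The function name pick_node, the node names, and the simple additive rule are illustrative assumptions; the placement logic of a real CRM is more involved.

```python
def pick_node(location_scores, current_node, stickiness):
    """Return the node with the highest total score.

    location_scores: dict mapping node name -> location preference score
    current_node:    node the resource runs on now (or None)
    stickiness:      score added to the current node only
    """
    totals = {}
    for node, score in location_scores.items():
        bonus = stickiness if node == current_node else 0
        totals[node] = score + bonus
    # Iterating in sorted order makes ties resolve to the first name.
    return max(sorted(totals), key=lambda n: totals[n])


if __name__ == "__main__":
    scores = {"node1": 100, "node2": 50}
    # Without stickiness the resource moves back to the preferred node.
    print(pick_node(scores, "node2", 0))
    # With enough stickiness it stays where it is.
    print(pick_node(scores, "node2", 200))
```

With a stickiness of 0 the resource follows its location preference; a stickiness larger than the score difference keeps it on its current node.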
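Multi-node cluster
Multicast network communication
quorum: the number of votes required for a partition to act as the cluster
The DC (Designated Coordinator) is elected from among the nodes; the PE (Policy Engine) runs on the DC and computes where resources should run
Votes are configurable; a single node may hold more than one vote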
Resource management policy for a partition without quorum: without_quorum_policy (named no-quorum-policy in Pacemaker)
freeze
stop
ignore
(Resource fencing is mandatory in a high-availability cluster)
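The quorum rules above can be sketched roughly as follows. This is an illustration only: it assumes the common "more than half of all votes" majority rule, and the function names and returned action labels are made up for the example rather than taken from any cluster software.

```python
def has_quorum(votes, online):
    """True if the online nodes hold more than half of all votes.

    votes:  dict mapping node name -> number of votes it holds
    online: iterable of node names reachable in this partition
    """
    total = sum(votes.values())
    held = sum(votes[n] for n in online)
    return held * 2 > total


def partition_action(votes, online, policy):
    """Decide what a partition does with its resources (labels are hypothetical)."""
    if has_quorum(votes, online):
        return "run"
    actions = {
        "ignore": "run",     # keep managing resources as if quorate
        "freeze": "keep",    # keep what is running, start nothing new
        "stop": "stop-all",  # stop every resource in this partition
    }
    return actions[policy]


if __name__ == "__main__":
    votes = {"n1": 1, "n2": 1, "n3": 1}
    print(partition_action(votes, ["n1", "n2"], "stop"))
    print(partition_action(votes, ["n3"], "stop"))
```

Giving one node extra votes changes the outcome: with votes of 3, 1 and 1, the node holding 3 votes is quorate on its own.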
failover domain
Failover domain: the set of nodes to which a service is moved by default when it fails on one node
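A failover domain can be pictured as a list of candidate nodes. The sketch below assumes the domain is ordered by priority and returns the first online member other than the failed node; the ordering, the function name failover_target, and the node names are assumptions made for illustration.

```python
def failover_target(domain, failed_node, online_nodes):
    """Return the first online node in the domain other than the failed one.

    domain:       list of node names, highest priority first (assumed ordering)
    failed_node:  node on which the service failed
    online_nodes: collection of nodes currently reachable
    Returns None when no member of the domain can take over.
    """
    for node in domain:
        if node != failed_node and node in online_nodes:
            return node
    return None


if __name__ == "__main__":
    domain = ["node1", "node2", "node3"]
    print(failover_target(domain, "node1", {"node2", "node3"}))
    print(failover_target(domain, "node1", {"node4"}))
```

A node outside the domain, such as node4 here, is never chosen even if it is the only one online.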
Software that provides the Messaging Layer
RHCS: Red Hat Cluster Suite
heartbeat (v1, v2, v3)
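v1: ships with its own resource manager, haresources
v2: ships with two resource managers, haresources and crm
v3: split into heartbeat, pacemaker, and cluster-glue
The crm resource manager became an independent project, pacemaker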
corosync
(the default Messaging Layer in RHEL 6.*)
cman
(the default Messaging Layer in RHEL 5.*)
keepalived
ultramonkey (not very active)
Software that provides the CRM (Cluster Resource Manager)
haresources, crm (the CRMs for heartbeat v1 and v2)
pacemaker (the CRM for heartbeat v3 or corosync)
rgmanager (the CRM for cman)
Resource type
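primitive: a resource that runs on only one node at a time
clone: a resource that runs on several nodes at the same time
group: a set of resources managed as one unit, kept together on the same node
master/slave: a special clone whose instances run in one of two roles, master or slave (typically on two nodes)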
RA (Resource Agent)
RA Classes
legacy heartbeat v1 RA
LSB (/etc/rc.d/init.d/*)
OCF (Open Cluster Framework)
pacemaker
linbit (drbd)
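To make the role of a resource agent more concrete, here is a rough Python sketch of an OCF-style agent that answers start, stop, and monitor with the exit codes the OCF convention defines. Real agents are standalone executables called once per action and keep their state outside the process; the in-memory DummyAgent class and its dispatch method are simplifications assumed for this example.

```python
# Exit codes from the OCF resource agent convention.
OCF_SUCCESS = 0
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7


class DummyAgent:
    """Toy agent that tracks a fake service in memory (illustration only)."""

    def __init__(self):
        self.running = False

    def start(self):
        self.running = True
        return OCF_SUCCESS

    def stop(self):
        # Stopping an already stopped resource still counts as success.
        self.running = False
        return OCF_SUCCESS

    def monitor(self):
        return OCF_SUCCESS if self.running else OCF_NOT_RUNNING

    def dispatch(self, action):
        """Map an action name to its handler, as an agent script would."""
        handlers = {"start": self.start, "stop": self.stop, "monitor": self.monitor}
        handler = handlers.get(action)
        if handler is None:
            return OCF_ERR_UNIMPLEMENTED
        return handler()


if __name__ == "__main__":
    agent = DummyAgent()
    for action in ("monitor", "start", "monitor", "stop", "reload"):
        print(action, agent.dispatch(action))
```

The CRM only ever sees the action name going in and the exit code coming out, which is why agents of different classes can be driven in the same way.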
STONITH (Shoot The Other Node In The Head)
Resource fencing levels
Node level
STONITH
Resource level
FC SAN
STONITH devices
1. Power Distribution Units (PDU)
2. Uninterruptible Power Supplies (UPS)
3. Blade Power Control Devices: power control hardware for blade servers
4. Lights-out Devices: out-of-band remote management hardware
5. Testing Devices
STONITH devices in detail
1. Power Distribution Units (PDU)
Power Distribution Units are an essential element in managing power capacity
and functionality for critical network, server and data center equipment.
They can provide remote load monitoring of connected equipment and individual
outlet power control for remote power recycling.
2. Uninterruptible Power Supplies (UPS)
A stable power supply provides emergency power to connected equipment by
supplying power from a separate source in the event of utility power failure.
3. Blade Power Control Devices
If you are running a cluster on a set of blades, then the power control device
in the blade enclosure is the only candidate for fencing. Of course, this
device must be capable of managing single blade computers.
4. Lights-out Devices
Lights-out devices (IBM RSA, HP iLO, Dell DRAC) are becoming increasingly
popular and may even become standard in off-the-shelf computers. However, they
are inferior to UPS devices, because they share a power supply with their host
(a cluster node). If a node stays without power, the device supposed to control
it would be just as useless. In that case, the CRM would continue its attempts
to fence the node indefinitely while all other resource operations would wait
for the fencing/STONITH operation to complete.
5. Testing Devices
Testing devices are used exclusively for testing purposes. They are usually
more gentle on the hardware. Once the cluster goes into production, they must
be replaced with real fencing devices.
ssh 172.16.100.1 'reboot'
meatware
STONITH implementation:
stonithd
stonithd is a daemon which can be accessed by local processes or over the
network. It accepts the commands which correspond to fencing operations: reset,
power-off, and power-on. It can also check the status of the fencing device.
The stonithd daemon runs on every node in the CRM HA cluster. The stonithd
instance running on the DC node receives a fencing request from the CRM. It is
up to this and other stonithd programs to carry out the desired fencing
operation.
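The commands described above (reset, power-off, power-on, and a status check) can be illustrated with a small Python sketch. It is not how stonithd is implemented: FakePowerSwitch is an in-memory stand-in for a fencing device, and handle_fencing_request is a hypothetical name for the dispatch step.

```python
class FakePowerSwitch:
    """Stand-in for a fencing device; tracks power state per node in memory."""

    def __init__(self, nodes):
        self.powered = {node: True for node in nodes}


def handle_fencing_request(device, command, node):
    """Apply one fencing command and return the node's resulting power state."""
    if node not in device.powered:
        raise KeyError(f"unknown node: {node}")
    if command == "power-off":
        device.powered[node] = False
    elif command == "power-on":
        device.powered[node] = True
    elif command == "reset":
        # A reset is a power cycle: off first, then on again.
        device.powered[node] = False
        device.powered[node] = True
    elif command != "status":
        raise ValueError(f"unsupported command: {command}")
    return device.powered[node]


if __name__ == "__main__":
    switch = FakePowerSwitch(["node1", "node2"])
    print(handle_fencing_request(switch, "power-off", "node2"))
    print(handle_fencing_request(switch, "status", "node2"))
    print(handle_fencing_request(switch, "reset", "node2"))
```

In a real cluster the request would come from the CRM on the DC node and the device would be actual hardware rather than a dictionary.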
STONITH Plug-ins
For every supported fencing device there is a STONITH plug-in which is capable
of controlling said device. A STONITH plug-in is the interface to the fencing
device.
On each node, all STONITH plug-ins reside in /usr/lib/stonith/plugins
(or in /usr/lib64/stonith/plugins for 64-bit architectures). All STONITH
plug-ins look the same to stonithd, but are quite different on the other side
reflecting the nature of the fencing device.
Some plug-ins support more than one device. A typical example is ipmilan
(or external/ipmi) which implements the IPMI protocol and can control any
device which supports this protocol.
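The idea that all plug-ins look the same from one side while differing on the device side can be sketched with a common interface in Python. The class names, the PLUGINS registry, and the fence function are hypothetical; the sketch only builds the command each plug-in would run instead of executing it, and it borrows the ssh reboot and meatware test devices mentioned earlier in these notes.

```python
from abc import ABC, abstractmethod


class StonithPlugin(ABC):
    """Uniform interface seen by the caller (hypothetical, for illustration)."""

    @abstractmethod
    def reset(self, node):
        """Return the command that would reset the given node."""


class SshPlugin(StonithPlugin):
    def reset(self, node):
        # Testing-only approach: ask the node to reboot itself over ssh.
        return ["ssh", node, "reboot"]


class MeatwarePlugin(StonithPlugin):
    def reset(self, node):
        # "Meatware" relies on a human operator to power-cycle the node.
        return ["echo", f"operator: please power-cycle {node}"]


# Registry mapping plug-in names to implementations.
PLUGINS = {"ssh": SshPlugin, "meatware": MeatwarePlugin}


def fence(plugin_name, node):
    """Look up a plug-in by name and build its reset command."""
    plugin = PLUGINS[plugin_name]()
    return plugin.reset(node)


if __name__ == "__main__":
    print(fence("ssh", "172.16.100.1"))
    print(fence("meatware", "node2"))
```

The caller never needs to know which device sits behind a plug-in name, which is the property the text attributes to STONITH plug-ins.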
httpd HA
heartbeat: UDP port 694
This article is reposted from the Winthcloud blog on 51CTO. Original link: http://blog.51cto.com/winthcloud/1893352. To repost it, please contact the original author.