This post builds a web cluster on NFS shared storage. It assumes a working web cluster is already in place; I covered that in another post: blog.csdn.net/qq_45714272…

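How an NFS-backed web cluster works

The principle is simple. Suppose web1 holds some photos but web2 does not. Placing an NFS storage server behind both web servers solves this, because NFS is a service for sharing directories or files. Data that web1 receives is written to the NFS server, and the NFS server in turn makes it available to web2.

1. Install the NFS server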

[root@sto yum.repos.d]# yum install -y nfs-utils
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Package matching 1:nfs-utils-1.3.0-0.21.el7.x86_64 already installed. Checking for update.
Nothing to do
[root@sto yum.repos.d]# systemctl restart rpcbind
[root@sto yum.repos.d]# systemctl enable rpcbind
[root@sto yum.repos.d]# systemctl enable nfs-server
Created symlink from /etc/systemd/system/multi-user.target.wants/nfs-server.service to /usr/lib/systemd/system/nfs-server.service.
[root@sto yum.repos.d]# systemctl start nfs-server
[root@sto yum.repos.d]# systemctl status firewalld
[root@sto http]# systemctl stop firewalld

If the firewall stays enabled, add the nfs, rpc-bind, and mountd services to it, then reload so the change takes effect.
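
As a rough sketch of that firewall step, the function below adds the three services permanently and then reloads firewalld. The function name `open_nfs_firewall` and the `FIREWALL_CMD` override are illustrative assumptions, not part of the original setup; they only exist so the commands can be dry-run.

```shell
# Services that an NFS export needs when firewalld stays enabled.
NFS_FW_SERVICES="nfs rpc-bind mountd"
# Overridable so the function can be dry-run, e.g. FIREWALL_CMD=echo.
FIREWALL_CMD="${FIREWALL_CMD:-firewall-cmd}"

open_nfs_firewall() {
    for svc in $NFS_FW_SERVICES; do
        "$FIREWALL_CMD" --permanent --add-service="$svc" || return 1
    done
    # Permanent rules only become active after a reload.
    "$FIREWALL_CMD" --reload
}

# Usage, as root on the NFS server (sto):
#   open_nfs_firewall
```

With this in place, `systemctl stop firewalld` should no longer be necessary on the storage node.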

2. Prepare the NFS server resources

Create the export directory (the shared directory)
[root@sto yum.repos.d]# mkdir /http
[root@sto yum.repos.d]# chmod a+x /http
[root@sto yum.repos.d]# vi /etc/exports
/http *(rw)
[root@sto yum.repos.d]# systemctl restart nfs-server
Test from the other nodes
[root@rs1 ~]# showmount -e sto
Export list for sto:
/http *
[root@rs1 ~]# mkdir /mnt/nfs
[root@rs1 ~]# mount sto:/http /mnt/nfs
[root@rs1 ~]# cp ./anaconda-ks.cfg /mnt/nfs/test.txt
[root@rs2 ~]# showmount -e sto
Export list for sto:
/http *
[root@rs2 ~]# mkdir /mnt/nfs
[root@rs2 ~]# mount sto:/http /mnt/nfs
[root@rs2 ~]# cp ./anaconda-ks.cfg /mnt/nfs/test2.txt
[root@sto http]# ls
test2.txt  test.txt
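
One detail of /etc/exports is easy to get wrong: the option list must follow the client directly, as in `/http *(rw)`. With whitespace in between, the options no longer belong to that client. The helper below is a minimal sketch that flags such lines; the function names are hypothetical and the check is a plain text match, not a full exports parser.

```shell
# Returns 0 when no option list is detached from its client by whitespace.
check_exports_line() {
    if printf '%s\n' "$1" | grep -Eq '[[:space:]]\('; then
        echo "suspicious: whitespace before '(' in: $1" >&2
        return 1
    fi
    return 0
}

# Checks every non-empty, non-comment line of an exports file.
check_exports_file() {
    rc=0
    while IFS= read -r line; do
        case "$line" in ''|'#'*) continue ;; esac
        check_exports_line "$line" || rc=1
    done < "$1"
    return "$rc"
}

# Usage on the NFS server:
#   check_exports_file /etc/exports && systemctl restart nfs-server
```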

Create the NFS cluster resource

[root@rs1 ~]# pcs resource create WebFS ocf:heartbeat:Filesystem \
> device='sto:/http' directory='/var/www/html' fstype='nfs' \
> op monitor interval=20s timeout=40s \
> op start timeout=60s op stop timeout=60s
[root@rs1 ~]# pcs status
Cluster name: cluster1
Stack: corosync
Current DC: rs1 (version 1.1.20-5.el7-3c4c782f70) - partition with quorum
Last updated: Sat May  9 02:59:31 2020
Last change: Sat May  9 02:59:22 2020 by root via cibadmin on rs1
2 nodes configured
3 resources configured
Online: [ rs1 rs2 ]
Full list of resources:
 VirtualIP  (ocf::heartbeat:IPaddr2): Started rs1
 WebFS  (ocf::heartbeat:Filesystem):  Started rs2
Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
[root@rs1 ~]# echo NFS > /var/www/html/index.html
[root@rs1 ~]# systemctl restart httpd

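Create the web cluster resource

pcs resource create Wbsite ocf:heartbeat:apache \
    httpd="/usr/sbin/httpd" \
    configfile=/etc/httpd/conf/httpd.conf \
    statusurl="http://localhost/server-status" \
    op monitor interval=1min

The problem I ran into earlier was that the web resource would not start. I later found in an article that httpd="/usr/sbin/httpd", the path to the httpd binary, has to be added. With that parameter it works.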
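
The `statusurl` parameter assumes that Apache answers at `/server-status`. This post does not show how that was enabled, so the following is only a sketch of one common way: a small mod_status snippet, written by a helper function. The function name `write_status_conf` and the default target path are assumptions, and `Require local` assumes Apache 2.4 syntax.

```shell
# Writes a minimal mod_status configuration to the given path
# (default: an assumed drop-in file under /etc/httpd/conf.d).
write_status_conf() {
    dest="${1:-/etc/httpd/conf.d/status.conf}"
    cat > "$dest" <<'EOF'
<Location /server-status>
    SetHandler server-status
    Require local
</Location>
EOF
}

# Usage, as root on both web nodes (rs1 and rs2):
#   write_status_conf
```

Restricting the handler to local requests is enough here, because the cluster monitor queries `http://localhost/server-status` from the node itself.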

[root@rs1 html]# pcs status
Cluster name: cluster1
Stack: corosync
Current DC: rs1 (version 1.1.20-5.el7-3c4c782f70) - partition with quorum
Last updated: Sat May  9 03:41:11 2020
Last change: Sat May  9 03:21:45 2020 by root via cibadmin on rs1
2 nodes configured
3 resources configured
Online: [ rs1 rs2 ]
Full list of resources:
 VirtualIP  (ocf::heartbeat:IPaddr2): Started rs1
 WebFS  (ocf::heartbeat:Filesystem):  Started rs1
 Wbsite (ocf::heartbeat:apache):  Started rs1
Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

Configure the group and constraints

Use a group to keep the resources running on the same node

[root@rs1 ~]# pcs resource group add a VirtualIP WebFS Wbsite
[root@rs1 html]# pcs status
Cluster name: cluster1
Stack: corosync
Current DC: rs1 (version 1.1.20-5.el7-3c4c782f70) - partition with quorum
Last updated: Sat May  9 03:42:47 2020
Last change: Sat May  9 03:42:44 2020 by root via cibadmin on rs1
2 nodes configured
3 resources configured
Online: [ rs1 rs2 ]
Full list of resources:
 Resource Group: a
     VirtualIP  (ocf::heartbeat:IPaddr2): Started rs1
     WebFS  (ocf::heartbeat:Filesystem):  Started rs1
     Wbsite (ocf::heartbeat:apache):  Started rs1
Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
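
To confirm that grouping really keeps everything on one node, the `pcs status` text can be checked mechanically. The two helpers below are a sketch that only parses the resource lines in the format shown in this post; the function names are hypothetical.

```shell
# Reads `pcs status` text on stdin and prints each node that hosts
# a started (or starting) resource, one per line, without duplicates.
resource_nodes() {
    awk '/\(ocf::/ && ($(NF-1) == "Started" || $(NF-1) == "Starting") { print $NF }' | sort -u
}

# Succeeds only when all active resources share a single node.
all_on_one_node() {
    n="$(resource_nodes | awk 'END { print NR }')"
    [ "$n" -eq 1 ]
}

# Usage on a cluster node:
#   pcs status | all_on_one_node && echo "all resources colocated"
```

Before grouping, the earlier status output (VirtualIP on rs1, WebFS on rs2) would fail this check; after grouping it passes.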

Use constraints to control the start order of the resources

[root@rs1 ~]# pcs constraint order VirtualIP then Wbsite
Adding VirtualIP Wbsite (kind: Mandatory) (Options: first-action=start then-action=start)
[root@rs1 ~]# pcs constraint order WebFS then Wbsite
Adding WebFS Wbsite (kind: Mandatory) (Options: first-action=start then-action=start)

Accessing the VIP while the resources run on web1

Accessing the VIP while the resources run on web2:

[root@rs1 ~]# pcs status
Cluster name: cluster1
Stack: corosync
Current DC: rs1 (version 1.1.20-5.el7-3c4c782f70) - partition with quorum
Last updated: Sat May  9 03:55:54 2020
Last change: Sat May  9 03:55:40 2020 by root via cibadmin on rs1
2 nodes configured
3 resources configured
Node rs1: standby
Online: [ rs2 ]
Full list of resources:
 Resource Group: a
     VirtualIP  (ocf::heartbeat:IPaddr2): Started rs2
     WebFS  (ocf::heartbeat:Filesystem):  Started rs2
     Wbsite (ocf::heartbeat:apache):  Starting rs2
Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
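
The status above shows rs1 in standby, which is what moves the group to rs2. One way to trigger such a failover for testing is sketched below. It assumes a pcs version that provides `pcs node standby` (some older releases use `pcs cluster standby` instead); `PCS_CMD` and the function name are illustrative and exist only so the sequence can be dry-run.

```shell
# Overridable so the sequence can be dry-run, e.g. PCS_CMD=echo.
PCS_CMD="${PCS_CMD:-pcs}"

# Puts NODE into standby so its resources move to the peer,
# shows the resulting cluster state, then brings NODE back.
failover_test() {
    node="$1"
    "$PCS_CMD" node standby "$node" || return 1
    "$PCS_CMD" status
    "$PCS_CMD" node unstandby "$node"
}

# Usage, as root on a cluster node:
#   failover_test rs1
```

Right after the standby command a resource may still show as Starting, as Wbsite does above; running `pcs status` again a few seconds later should show it Started.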

The result is the same, so the NFS-backed web cluster is complete.

One more important point:

[root@rs1 html]# systemctl status httpd
● httpd.service - The Apache HTTP Server
   Loaded: loaded (/usr/lib/systemd/system/httpd.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Sat 2020-05-09 03:37:45 EDT; 10min ago
     Docs: man:httpd(8)
           man:apachectl(8)
  Process: 27583 ExecStop=/bin/kill -WINCH ${MAINPID} (code=exited, status=1/FAILURE)
  Process: 27582 ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND (code=exited, status=1/FAILURE)
 Main PID: 27582 (code=exited, status=1/FAILURE)
May 09 03:37:45 rs1 httpd[27582]: (98)Address already in use: AH00072: make_sock: co...:80
May 09 03:37:45 rs1 httpd[27582]: (98)Address already in use: AH00072: make_sock: co...:80
May 09 03:37:45 rs1 httpd[27582]: no listening sockets available, shutting down
May 09 03:37:45 rs1 httpd[27582]: AH00015: Unable to open logs
May 09 03:37:45 rs1 systemd[1]: httpd.service: main process exited, code=exited, sta...URE
May 09 03:37:45 rs1 kill[27583]: kill: cannot find process ""
May 09 03:37:45 rs1 systemd[1]: httpd.service: control process exited, code=exited s...s=1
May 09 03:37:45 rs1 systemd[1]: Failed to start The Apache HTTP Server.
May 09 03:37:45 rs1 systemd[1]: Unit httpd.service entered failed state.
May 09 03:37:45 rs1 systemd[1]: httpd.service failed.
Hint: Some lines were ellipsized, use -l to show in full.
[root@rs1 html]# ps aux |grep http
root      18067  0.0  0.3 224068  3476 ?        Ss   03:19   0:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run/httpd.pid
apache    18068  0.0  0.3 224204  3752 ?        S    03:19   0:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run/httpd.pid
apache    18069  0.0  0.3 224204  3752 ?        S    03:19   0:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run/httpd.pid
apache    18070  0.0  0.3 224204  3752 ?        S    03:19   0:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run/httpd.pid
apache    18071  0.0  0.3 224204  3756 ?        S    03:19   0:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run/httpd.pid
apache    18072  0.0  0.3 224204  3764 ?        S    03:19   0:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run/httpd.pid
apache    31002  0.0  0.3 224204  3704 ?        S    03:44   0:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run/httpd.pid
root      35170  0.0  0.0 112712   952 pts/1    R+   03:49   0:00 grep --color=auto http

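systemctl reports the httpd service as failed, yet the process list shows httpd running. The reason is that once the web resource has been added to the cluster, httpd is no longer controlled by systemctl but by the cluster components. The cluster-started instance already holds port 80, which is why the systemctl start attempt fails with "Address already in use" while the processes keep running.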

The web cluster resource created with PCS never starts: how to fix it

Exhausting. It took me two days to solve this.

[root@rs1 ~]# pcs resource create Website ocf:heartbeat:apache \
> configfile=/etc/httpd/conf/httpd.conf \
> statusurl="http://localhost/server-status" \
> op monitor interval=1min
[root@rs1 ~]# pcs status
Cluster name: cluster1
Stack: corosync
Current DC: rs1 (version 1.1.20-5.el7-3c4c782f70) - partition with quorum
Last updated: Sat May  9 02:29:59 2020
Last change: Sat May  9 02:28:36 2020 by root via cibadmin on rs1
2 nodes configured
2 resources configured
Online: [ rs1 rs2 ]
Full list of resources:
 VirtualIP  (ocf::heartbeat:IPaddr2): Started rs1
 Website  (ocf::heartbeat:apache):  Stopped
Failed Resource Actions:
Website_start_0 on rs1 'unknown error' (1): call=19, status=Timed Out, exitreason='', last-rc-change='Sat May 9 02:29:17 2020', queued=0ms, exec=40004ms
Website_start_0 on rs2 'unknown error' (1): call=14, status=Timed Out, exitreason='', last-rc-change='Sat May 9 02:28:36 2020', queued=0ms, exec=40006ms
Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

Here is the log as well:

May  9 02:29:48 b apache(Website)[12391]: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
May  9 02:29:49 b apache(Website)[12391]: INFO: apache not running
May  9 02:29:49 b apache(Website)[12391]: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
May  9 02:29:50 b apache(Website)[12391]: INFO: apache not running
May  9 02:29:50 b apache(Website)[12391]: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
May  9 02:29:51 b apache(Website)[12391]: INFO: apache not running
May  9 02:29:51 b apache(Website)[12391]: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
May  9 02:29:52 b apache(Website)[12391]: INFO: apache not running
May  9 02:29:52 b apache(Website)[12391]: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
May  9 02:29:53 b apache(Website)[12391]: INFO: apache not running
May  9 02:29:53 b apache(Website)[12391]: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
May  9 02:29:54 b apache(Website)[12391]: INFO: apache not running
May  9 02:29:54 b apache(Website)[12391]: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
May  9 02:29:55 b apache(Website)[12391]: INFO: apache not running
May  9 02:29:55 b apache(Website)[12391]: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
May  9 02:29:56 b apache(Website)[12391]: INFO: apache not running
May  9 02:29:56 b apache(Website)[12391]: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
May  9 02:29:57 b lrmd[9228]: warning: Website_start_0 process (PID 12391) timed out
May  9 02:29:57 b lrmd[9228]: warning: Website_start_0:12391 - timed out after 40000ms
May  9 02:29:57 b crmd[9231]:   error: Result of start operation for Website on rs1: Timed Out
May  9 02:29:57 b crmd[9231]: warning: Action 5 (Website_start_0) on rs1 failed (target: 0 vs. rc: 1): Error
May  9 02:29:57 b crmd[9231]:  notice: Transition aborted by operation Website_start_0 'modify' on rs1: Event failed
May  9 02:29:57 b crmd[9231]:  notice: Transition 16 (Complete=2, Pending=0, Fired=0, Skipped=0, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-16.bz2): Complete
May  9 02:29:57 b pengine[9230]: warning: Processing failed start of Website on rs1: unknown error
May  9 02:29:57 b pengine[9230]: warning: Processing failed start of Website on rs1: unknown error
May  9 02:29:57 b pengine[9230]: warning: Processing failed start of Website on rs2: unknown error
May  9 02:29:57 b pengine[9230]: warning: Forcing Website away from rs1 after 1000000 failures (max=1000000)
May  9 02:29:57 b pengine[9230]: warning: Forcing Website away from rs2 after 1000000 failures (max=1000000)
May  9 02:29:57 b pengine[9230]:  notice:  * Stop       Website       (        rs1 )   due to node availability
May  9 02:29:57 b pengine[9230]:  notice: Calculated transition 17, saving inputs in /var/lib/pacemaker/pengine/pe-input-17.bz2
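
In a log this noisy, the decisive lines are the lrmd timeout warnings. The filter below is a small sketch for pulling them out; it matches only the line format shown above, and both the function name and the log path in the usage comment are assumptions.

```shell
# Reads syslog-style lines on stdin and prints "<operation> <timeout>"
# for every lrmd "timed out after" warning.
timed_out_ops() {
    sed -n 's/.* lrmd\[[0-9]*]: warning: \([A-Za-z0-9_-]*\):[0-9]* - timed out after \([0-9]*ms\).*/\1 \2/p'
}

# Usage (assumed log location):
#   timed_out_ops < /var/log/messages
```

For the excerpt above this reports the `Website_start_0` operation with its 40000ms limit, which matches the `exec=40004ms` in the failed actions: the start never completed, it simply ran until the timeout.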

Solution: recreate the web cluster resource

[root@rs1 html]# pcs resource create Wbsite ocf:heartbeat:apache \
httpd="/usr/sbin/httpd" \
configfile=/etc/httpd/conf/httpd.conf  \
statusurl="http://localhost/server-status" \
op monitor interval=1min
The only change from the original command is the added httpd="/usr/sbin/httpd", the path to the httpd binary. Here is the result:
[root@rs1 html]# pcs status
Cluster name: cluster1
Stack: corosync
Current DC: rs1 (version 1.1.20-5.el7-3c4c782f70) - partition with quorum
Last updated: Sat May  9 03:33:01 2020
Last change: Sat May  9 03:21:45 2020 by root via cibadmin on rs1
2 nodes configured
3 resources configured
Online: [ rs1 rs2 ]
Full list of resources:
 VirtualIP  (ocf::heartbeat:IPaddr2): Started rs1
 WebFS  (ocf::heartbeat:Filesystem):  Started rs1
 Wbsite (ocf::heartbeat:apache):  Started rs1
Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

Solved!
