Monitoring multiple hosts with Nagios (NRPE)
Installing NRPE on the monitored host
http://nchc.dl.sourceforge.net/project/nagiosplug/nagiosplug/1.4.15/nagios-plugins-1.4.15.tar.gz
http://nchc.dl.sourceforge.net/project/nagios/nrpe-2.x/nrpe-2.12/nrpe-2.12.tar.gz
The Nagios plugins must be installed before NRPE.
Method 1 (build from source):
# useradd -s /sbin/nologin -M nagios
# apt-get install libssl-dev
# ln -s /usr/lib/x86_64-linux-gnu/libssl.so /usr/lib/
# tar -zxvf nagios-plugins-1.4.15.tar.gz
# cd nagios-plugins-1.4.15
# ./configure --prefix=/usr/local/nagios
# make && make install
# tar zxvf nrpe-2.12.tar.gz
# cd nrpe-2.12
# ./configure
# make all
# make install-plugin (installs the check_nrpe plugin)
# make install-daemon (installs the NRPE daemon)
# make install-daemon-config (installs the sample configuration file)
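Edit the NRPE configuration file
# vi /usr/local/nagios/etc/nrpe.cfg
allowed_hosts=127.0.0.1,192.168.10.8
(The default is allowed_hosts=127.0.0.1; append the IP address of the monitoring host, 192.168.10.8 in this example.)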
:wq
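The same edit can be scripted instead of made in vi. The following is a minimal sketch, assuming GNU sed and a configuration file that already contains an allowed_hosts line; add_allowed_host is a hypothetical helper name, not something shipped with NRPE. The demo runs on a scratch file, so nothing under /usr/local/nagios is touched.

```shell
# Hypothetical helper: append an address to the allowed_hosts line of an
# nrpe.cfg-style file, unless that address is already listed.
add_allowed_host() {
    local cfg="$1" ip="$2" current
    # Current comma-separated value of the first allowed_hosts line.
    current=$(sed -n 's/^allowed_hosts=//p' "$cfg" | head -n 1)
    case ",${current}," in
        *",${ip},"*) return 0 ;;   # already listed, nothing to do
    esac
    # "&" re-inserts the matched line; "|" is used as the delimiter so
    # that an address containing "/" does not break the expression.
    sed -i "s|^allowed_hosts=.*|&,${ip}|" "$cfg"
}

# Demo on a scratch copy rather than the live configuration file.
scratch=$(mktemp)
printf 'allowed_hosts=127.0.0.1\n' > "$scratch"
add_allowed_host "$scratch" 192.168.10.8
cat "$scratch"
rm -f "$scratch"
```

On a real host the first argument would be the nrpe.cfg path used in this article. If the daemon is already running, it has to be restarted before the new address is accepted.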
Start NRPE
# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
Method 2 (Ubuntu packages):
# useradd -s /sbin/nologin -M nagios
# apt-get install nagios-nrpe-server nagios-plugins
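Edit the NRPE configuration file
# vi /etc/nagios/nrpe.cfg
allowed_hosts=127.0.0.1,192.168.10.8
(The default is allowed_hosts=127.0.0.1; append the IP address of the monitoring host, 192.168.10.8 in this example.)
:wq
Start NRPE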
# service nagios-nrpe-server start
Check that NRPE is running
# netstat -nltp |grep nrpe
tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN 5163/nrpe
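The check above can also be turned into a yes/no test for use in a script. This is a sketch only: port_is_listening is a hypothetical name, and the function assumes the netstat -nlt column layout shown above (local address in the fourth column, state in the sixth). It reads netstat output on stdin, so the demo feeds it the sample line from this article instead of calling netstat.

```shell
# Hypothetical helper: succeed if the netstat -nlt style lines on stdin
# contain a LISTEN socket on the given TCP port (default 5666, NRPE's
# default port).
port_is_listening() {
    local port="${1:-5666}"
    awk -v port="$port" '
        $1 ~ /^tcp/ && $6 == "LISTEN" {
            # The port is whatever follows the last ":" of the local address.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit !found }'
}

# Demo with the sample line shown above.
printf '%s\n' 'tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN 5163/nrpe' \
    | port_is_listening 5666 && echo "NRPE is listening on port 5666"
```

On a live system the input would come from netstat instead, for example: netstat -nlt | port_is_listening 5666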
Test that NRPE is working properly (the path below is from the source install in Method 1)
# /usr/local/nagios/libexec/check_nrpe -H localhost
NRPE v2.12
Start NRPE automatically at boot:
# vi /etc/rc.local
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
:wq
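If /etc/rc.local ends with an exit 0 line (some distributions ship it that way), the NRPE command has to go above that line, otherwise it never runs. The sketch below shows one way to add the line idempotently. add_boot_command is a hypothetical helper name, and the demo operates on a scratch file rather than the real /etc/rc.local.

```shell
# Hypothetical helper: add a command to an rc.local-style file once,
# placing it before a bare "exit 0" line when one exists.
add_boot_command() {
    local rc="$1" line="$2" tmp
    # Do nothing if the exact line is already present.
    if grep -Fxq -e "$line" "$rc"; then
        return 0
    fi
    if grep -qx 'exit 0' "$rc"; then
        tmp=$(mktemp)
        # Print the new line just before the first bare "exit 0".
        awk -v line="$line" '
            $0 == "exit 0" && !inserted { print line; inserted = 1 }
            { print }' "$rc" > "$tmp"
        cat "$tmp" > "$rc"   # overwrite in place so the file mode is kept
        rm -f "$tmp"
    else
        printf '%s\n' "$line" >> "$rc"
    fi
}

# Demo on a scratch file.
scratch=$(mktemp)
printf '#!/bin/sh\nexit 0\n' > "$scratch"
add_boot_command "$scratch" '/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d'
cat "$scratch"
rm -f "$scratch"
```

Running the helper a second time with the same arguments leaves the file unchanged.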
Review the check commands defined on the monitored host; the monitoring host refers to them by name
# vi /usr/local/nagios/etc/nrpe.cfg
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20 -c 10 -p /dev/hda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
:wq
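Only the names inside the square brackets matter to the monitoring host. As a small sketch, the names can be pulled out of the file with sed; list_nrpe_commands is a hypothetical helper, and the demo uses a scratch file holding two of the lines shown above.

```shell
# Hypothetical helper: print the name of every command[...] entry in an
# nrpe.cfg-style file, one per line.
list_nrpe_commands() {
    # Capture the text between "command[" and the closing "]".
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Demo on a scratch file with two entries from the listing above.
scratch=$(mktemp)
cat > "$scratch" <<'EOF'
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
EOF
list_nrpe_commands "$scratch"
rm -f "$scratch"
```

Commented-out entries are skipped, because the pattern only matches lines that begin with command[.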
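Note: you can add your own commands, or change the warning (-w) and critical (-c) thresholds at the end of each line.
For example:
To monitor a logical volume:
command[check_mapper]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/mapper/VolGroup00-LogVol00
On some systems the disk is sda rather than hda; adjust the device name to match:
command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20 -c 10 -p /dev/sda1
Each partition of a disk can be monitored separately:
command[check_sda2]=/usr/local/nagios/libexec/check_disk -w 20 -c 10 -p /dev/sda2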
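To make the role of the -w and -c values concrete: a plugin compares its measurement against both thresholds and reports the result through its exit code (0 for OK, 1 for WARNING, 2 for CRITICAL). The sketch below imitates that decision for a "percent free" value such as the one check_disk evaluates with -w 20% -c 10%. It is an illustration only, not the check_disk implementation, and classify_free_percent is a hypothetical name.

```shell
# Illustration of how warning/critical thresholds map to plugin exit codes.
# Arguments: free percentage, warning threshold, critical threshold.
classify_free_percent() {
    local free="$1" warn="$2" crit="$3"
    if [ "$free" -lt "$crit" ]; then
        echo "CRITICAL - ${free}% free"
        return 2
    elif [ "$free" -lt "$warn" ]; then
        echo "WARNING - ${free}% free"
        return 1
    fi
    echo "OK - ${free}% free"
    return 0
}

# Demo with thresholds equivalent to -w 20% -c 10%.
for free in 35 15 5; do
    status=0
    classify_free_percent "$free" 20 10 || status=$?
    echo "exit code: $status"
done
```

Nagios reads the exit code, not the text, to decide the service state; the printed line is what appears as the status information.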
Installing NRPE on the monitoring host
1. Install the check_nrpe plugin
# apt-get install libssl-dev
# ln -s /usr/lib/x86_64-linux-gnu/libssl.so /usr/lib/
# tar -zxvf nrpe-2.12.tar.gz
# cd nrpe-2.12
# ./configure
# make all
# make install-plugin
Only this install step is needed, because the monitoring host only requires the check_nrpe plugin.
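2. Test communication between the monitoring host and the NRPE daemon running on the monitored host.
# /usr/local/nagios/libexec/check_nrpe -H 192.168.1.14
NRPE v2.12
The daemon's NRPE version is returned correctly, which shows that the monitoring host can reach it and is allowed to connect.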
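Beyond the version query, individual checks can be exercised from the monitoring host by passing a command name with -c, the same option the Nagios command definition uses later on. The loop below is a sketch: run_remote_checks and the CHECK_NRPE variable are hypothetical names, and the plugin path is assumed to be the source-install location used in this article.

```shell
# Path to the plugin; assumed to be the source-install location.
CHECK_NRPE="${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}"

# Hypothetical helper: run each named NRPE command against a host and
# print the command name, the plugin exit code and the plugin output.
run_remote_checks() {
    local host="$1" cmd status output
    shift
    for cmd in "$@"; do
        status=0
        output=$("$CHECK_NRPE" -H "$host" -c "$cmd") || status=$?
        printf '%s: exit=%s %s\n' "$cmd" "$status" "$output"
    done
}

# Example invocation (needs a reachable NRPE daemon, so it is left as a comment):
#   run_remote_checks 192.168.1.14 check_users check_load
```

A command name that is not defined in the monitored host's nrpe.cfg produces an error from the daemon instead of a check result, which makes this a quick way to catch typos before editing the Nagios configuration.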
3. Monitor host 192.168.1.14
Add a definition for check_nrpe to commands.cfg
# vi /usr/local/nagios/etc/objects/commands.cfg
# 'check_nrpe' command definition
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
:wq
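The fields mean the following:
command_name check_nrpe (defines the command name check_nrpe; service definitions refer to the command by this name)
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ (defines the plugin program that is actually run; the $ARG1$ argument after -c is the name of the check command that the NRPE daemon is asked to execute)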
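As a sketch of how the macros are filled in: in a check_command value such as check_nrpe!check_load, the text after the "!" becomes $ARG1$, $HOSTADDRESS$ comes from the host's address field, and $USER1$ is the plugin directory. The function below imitates that substitution for this one command_line. expand_check_nrpe is a hypothetical name, and /usr/local/nagios/libexec as the value of $USER1$ is an assumption based on the install prefix used in this article.

```shell
# Illustration only: build the command line Nagios would run for
# "$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$".
# Arguments: value of $USER1$, host address, check_command value.
expand_check_nrpe() {
    local user1="$1" hostaddress="$2" check_command="$3"
    local arg1="${check_command#*!}"   # text after the first "!"
    printf '%s/check_nrpe -H %s -c %s\n' "$user1" "$hostaddress" "$arg1"
}

expand_check_nrpe /usr/local/nagios/libexec 192.168.1.14 'check_nrpe!check_load'
# prints: /usr/local/nagios/libexec/check_nrpe -H 192.168.1.14 -c check_load
```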
# cd /usr/local/nagios/etc/objects
# cp localhost.cfg ming.cfg
# vi ming.cfg
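In the host definition, change host_name to ming and address to 192.168.1.14 (ming is an arbitrary name).
Change hostgroup_name to ming and members to ming as well.
Then define the services that are checked through NRPE. The check_local_* services copied from localhost.cfg run on the monitoring host itself, so remove them or replace them with definitions like these: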
define service {
use generic-service
host_name ming
service_description check_load
check_command check_nrpe!check_load
}
define service {
use generic-service
host_name ming
service_description check_users
check_command check_nrpe!check_users
}
define service {
use generic-service
host_name ming
service_description check_total
check_command check_nrpe!check_total_procs
}
define service {
use generic-service
host_name ming
service_description check_hda1
check_command check_nrpe!check_hda1
}
:wq
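Since the four service blocks differ only in the command name, they can be generated rather than typed. The following sketch prints one block per command name; print_nrpe_services is a hypothetical helper, and it uses the command name as the service_description, which differs slightly from the check_total description used above.

```shell
# Hypothetical helper: print a Nagios service definition for each NRPE
# command name. Arguments: host_name, then one or more command names.
print_nrpe_services() {
    local host="$1" cmd
    shift
    for cmd in "$@"; do
        printf 'define service {\n'
        printf '    use                 generic-service\n'
        printf '    host_name           %s\n' "$host"
        printf '    service_description %s\n' "$cmd"
        printf '    check_command       check_nrpe!%s\n' "$cmd"
        printf '}\n'
    done
}

print_nrpe_services ming check_load check_users check_total_procs check_hda1
```

The output can be reviewed and then appended to ming.cfg by redirecting it with >>.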
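Note: the name after check_nrpe! in check_command must match a command[...] entry in nrpe.cfg on the monitored host; only commands defined there can be used.
# vi /usr/local/nagios/etc/nagios.cfg (add the following line anywhere in the file)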
cfg_file=/usr/local/nagios/etc/objects/ming.cfg
:wq
Restart the Nagios service
# service nagios restart
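A restart with a broken configuration leaves Nagios stopped, so it can help to run the built-in check (nagios -v followed by the main configuration file) first. The wrapper below is a sketch: restart_if_valid is a hypothetical name, and the binary path in the usage comment is assumed from the /usr/local/nagios prefix used in this article.

```shell
# Hypothetical helper: run the restart command only if the configuration
# check passes.
# Arguments: nagios binary, main config file, then the restart command.
restart_if_valid() {
    local nagios_bin="$1" main_cfg="$2"
    shift 2
    if "$nagios_bin" -v "$main_cfg"; then
        "$@"
    else
        echo "Configuration check failed; not restarting." >&2
        return 1
    fi
}

# Example invocation (requires a Nagios installation, so it is left as a comment):
#   restart_if_valid /usr/local/nagios/bin/nagios \
#       /usr/local/nagios/etc/nagios.cfg service nagios restart
```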
This article is reposted from a Linux blog on 51CTO. Original link: http://blog.51cto.com/yangzhiming/839733. To repost it, please contact the original author.