8. Installing Nagios
8.1 Download the required files
nagios-3.4.1.tar.gz
nagios-plugins-1.4.16.tar.gz
ndoutils-1.5.2.tar.gz
npc2.0.4.tar.gz (this one is hard to find)
nrpe-2.13.tar.gz
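Before starting the build it can help to confirm the archives are actually present. The following is a minimal sketch, not part of the original procedure: the function name `missing_tarballs` and its directory argument are illustrative, and the NPC archive is left out because the source gives its file name only loosely.

```shell
#!/bin/sh
# Print the name of every expected tarball that is absent from a directory.
# The directory defaults to the current one.
missing_tarballs() {
    dir=${1:-.}
    for f in nagios-3.4.1.tar.gz nagios-plugins-1.4.16.tar.gz \
             ndoutils-1.5.2.tar.gz nrpe-2.13.tar.gz; do
        [ -f "$dir/$f" ] || echo "$f"
    done
}
```

Calling `missing_tarballs /path/to/downloads` prints nothing when all four archives are in place.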
8.2 Installing Nagios and the Nagios plugins
tar zxvf nagios-3.4.1.tar.gz
cd nagios-3.4.1
./configure --prefix=/var/www/localhost/htdocs/nagios
make all
mkdir /var/www/localhost/htdocs/nagios
useradd nagios
passwd nagios
groupadd nagios
usermod -G nagios nagios
usermod -a -G nagios apache
chown -R nagios:nagios /var/www/localhost/htdocs/nagios
make install
make install-init
make install-commandmode
make install-config
cd ..
tar zxvf nagios-plugins-1.4.16.tar.gz
cd nagios-plugins-1.4.16
./configure --prefix=/var/www/localhost/htdocs/nagios/
make
make install
8.3 Editing httpd.conf
vi /etc/apache2/httpd.conf
Add the following:
#setting for nagios 20120815
ScriptAlias /nagios/cgi-bin /var/www/localhost/htdocs/nagios/sbin
<Directory "/var/www/localhost/htdocs/nagios/sbin">
Options ExecCGI
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /var/www/localhost/htdocs/nagios/etc/htpasswd
# Authentication file that controls access to this directory
Require valid-user
</Directory>
Alias /nagios /var/www/localhost/htdocs/nagios/share
<Directory "/var/www/localhost/htdocs/nagios/share">
Options None
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /var/www/localhost/htdocs/nagios/etc/htpasswd
# Authentication file that controls access to this directory
Require valid-user
</Directory>
8.4 Adding an authenticated user
htpasswd -c /var/www/localhost/htdocs/nagios/etc/htpasswd nagios
View the contents of the authentication file:
less /var/www/localhost/htdocs/nagios/etc/htpasswd
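Instead of reading the file by eye, the presence of a user can be checked from a script. This is a small sketch under the assumption that the file uses the usual `user:hash` line format; the function name `htpasswd_has_user` is made up for illustration.

```shell
#!/bin/sh
# Succeed if USER has an entry in the htpasswd FILE.
htpasswd_has_user() {
    file=$1
    user=$2
    # Each line is "user:hash"; compare the first field exactly.
    awk -F: -v u="$user" '$1 == u { found = 1 } END { exit !found }' "$file"
}
```

For example, `htpasswd_has_user /var/www/localhost/htdocs/nagios/etc/htpasswd nagios && echo present`.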
At this point the Nagios home page is already reachable:
http://192.168.254.123/nagios
After logging in, however, nothing except the home page opens.
8.5 Configuring Nagios
cd /var/www/localhost/htdocs/nagios/etc
vi nagios.cfg
#cfg_file=/var/www/localhost/htdocs/nagios/etc/objects/localhost.cfg
cfg_file=/var/www/localhost/htdocs/nagios/etc/objects/contactgroups.cfg
cfg_file=/var/www/localhost/htdocs/nagios/etc/objects/contacts.cfg
cfg_file=/var/www/localhost/htdocs/nagios/etc/objects/hostgroups.cfg
cfg_file=/var/www/localhost/htdocs/nagios/etc/objects/hosts.cfg
cfg_file=/var/www/localhost/htdocs/nagios/etc/objects/services.cfg
cfg_file=/var/www/localhost/htdocs/nagios/etc/objects/timeperiods.cfg
check_external_commands=1
command_check_interval=10s
#command_check_interval=-1
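A wrong path in a cfg_file line is a common reason for the later pre-flight check to fail. The following sketch lists the referenced files that do not exist; it assumes every active cfg_file line holds an absolute path, and the function name `missing_cfg_files` is illustrative only.

```shell
#!/bin/sh
# Print every cfg_file path in a nagios.cfg that does not exist on disk.
missing_cfg_files() {
    main_cfg=$1
    # Take the active (uncommented) cfg_file= lines and strip the key.
    sed -n 's/^[[:space:]]*cfg_file=//p' "$main_cfg" |
    while IFS= read -r path; do
        [ -f "$path" ] || echo "$path"
    done
}
```

Running `missing_cfg_files nagios.cfg` from the etc directory prints nothing when every referenced file is in place.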
vi cgi.cfg
authorized_for_system_information=nagiosadmin,nagios
authorized_for_configuration_information=nagiosadmin,nagios
authorized_for_system_commands=nagios
authorized_for_all_services=nagiosadmin,nagios
authorized_for_all_hosts=nagiosadmin,test,nagios
authorized_for_all_service_commands=nagiosadmin,nagios
authorized_for_all_host_commands=nagiosadmin,nagios
cd objects
ls
You should see the following configuration files:
commands.cfg services.cfg windows.cfg switch.cfg contacts.cfg localhost.cfg templates.cfg printer.cfg timeperiods.cfg
Back up the stock files, then start editing.
mv contacts.cfg contacts.cfg.backup
mv timeperiods.cfg timeperiods.cfg.backup
vi timeperiods.cfg (optional; the stock timeperiods.cfg template is already very complete)
define timeperiod{
        timeperiod_name         24x7
        alias                   24 Hours A Day, 7 Days A Week
        sunday                  00:00-24:00
        monday                  00:00-24:00
        tuesday                 00:00-24:00
        wednesday               00:00-24:00
        thursday                00:00-24:00
        friday                  00:00-24:00
        saturday                00:00-24:00
        }
vi contacts.cfg
define contact{
        contact_name                    nagios
        alias                           nagios admin
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    w,u,c,r
        host_notification_options       d,u,r
        service_notification_commands   notify-service-by-email
        host_notification_commands      notify-host-by-email
        email                           aaa@abc.com
        pager                           137********
        address1                        CHN
        address2                        SHA
        }
vi contactgroups.cfg
define contactgroup{
        contactgroup_name       nagios
        alias                   nagios Administrators
        members                 nagios
        }
vi hosts.cfg
define host{
        host_name               localhost
        alias                   localhost
        address                 192.168.254.123
        check_command           check-host-alive
        max_check_attempts      5
        check_period            24x7
        contact_groups          nagios
        notification_interval   10
        notification_period     24x7
        notification_options    d,u,r
        }
vi hostgroups.cfg
define hostgroup{
        hostgroup_name          hostgroups
        alias                   hostgroups
        members                 localhost
        }
vi services.cfg
#service definition
define service{
        host_name               localhost
        service_description     check-host-alive
        check_command           check-host-alive
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          nagios
        }
8.6 Test run
cd /var/www/localhost/htdocs/nagios/
bin/nagios -v etc/nagios.cfg
Output of a successful check:
Total Warnings: 0
Total Errors:   0
Things look okay - No serious problems were detected during the pre-flight check
If Total Errors is not 0, fix the configuration as the messages indicate. Once Total Errors is 0, start the daemon with the command below. You can put it in a script and add that script to the boot sequence; remember to use full paths there, because the command shown here is run with paths relative to the current directory.
bin/nagios -d etc/nagios.cfg
Installation is complete; visit http://192.168.254.123/nagios
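The boot script mentioned above could look roughly like the sketch below. It is an assumption-laden outline rather than the author's script: the function name `start_nagios` and the `NAGIOS_HOME` variable are invented here, the default prefix is the one used in this guide, and the sketch relies on `nagios -v` returning a non-zero exit status when the check fails.

```shell
#!/bin/sh
# Run the pre-flight check and start the daemon only if it passes.
# NAGIOS_HOME (hypothetical variable) defaults to this guide's install prefix.
start_nagios() {
    home=${NAGIOS_HOME:-/var/www/localhost/htdocs/nagios}
    if "$home/bin/nagios" -v "$home/etc/nagios.cfg" >/dev/null; then
        # Full paths are used so the script works from any directory.
        "$home/bin/nagios" -d "$home/etc/nagios.cfg"
    else
        echo "pre-flight check failed; not starting Nagios" >&2
        return 1
    fi
}
```

Sourcing this file and calling `start_nagios` from a boot-time script avoids starting the daemon with a broken configuration.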
8.6.1 Problems found during the Nagios test
Contact group 'admins' specified in service 'Hosts' for host 'windows server' is not defined anywhere!
Solution (either of the following):
8.6.1.1 In templates.cfg, change the admins group to nagios, the group defined in contactgroups.cfg.
8.6.1.2 Or, in objects/services.cfg, change contact_groups nagios to admins.
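Errors of this kind can be located before running the pre-flight check by comparing the contact groups that are referenced with those that are defined. The sketch below is a rough helper, not a Nagios tool: the function name is illustrative, and it assumes group names contain no spaces and that comma-separated lists have no spaces after the commas.

```shell
#!/bin/sh
# Print contact groups that are referenced via contact_groups but never
# defined via contactgroup_name in the given object files.
undefined_contact_groups() {
    awk '
        $1 == "contactgroup_name" { defined[$2] = 1 }
        $1 == "contact_groups" {
            # The value may be a comma-separated list.
            n = split($2, groups, ",")
            for (i = 1; i <= n; i++) used[groups[i]] = 1
        }
        END {
            for (g in used) if (!(g in defined)) print g
        }
    ' "$@" | sort
}
```

For example, `undefined_contact_groups objects/*.cfg` run from the etc directory would print `admins` in the situation described above.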
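8.7 Monitoring a Windows server with Nagios
Reference: http://yahoon.blog.51cto.com/13184/41897
Goal: have Nagios monitor services on a Windows server, as shown in the figure below.

8.7.1 Installing NSClient++ on the monitored client
NSClient download: http://nsclient.org/nscp/downloads (the version downloaded here is 0.3.8, x64).
Installation is the usual Windows double-click install. The installer asks for the Nagios server address; enter yours (192.168.254.123 here). The password is optional (left empty here). Select all the other options. The default install path is c:\program files.
8.7.2 Configuring NSClient++ on the monitored client
Open nsc.ini and make the following changes.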
8.7.3 Editing the Nagios configuration files on the monitoring server
cd /var/www/localhost/htdocs/nagios
vi etc/objects/windows.cfg (essentially the stock configuration with minor changes)
define host{
        use                     windows-server
        host_name               windows-server
        alias                   My Windows Server
        address                 192.168.254.1
        }
Host parameters.
define hostgroup{
        hostgroup_name          windows-servers
        alias                   Windows Servers
        }
Host group.
define service{
        use                     generic-service
        host_name               windows-server
        service_description     NSClient++0.3.8;Version
        check_command           check_nt!CLIENTVERSION
        }
NSClient++ client version check.
define service{
        use                     generic-service
        host_name               windows-server
        service_description     Uptime
        check_command           check_nt!UPTIME
        }
Uptime check.
define service{
        use                     generic-service
        host_name               windows-server
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
CPU load check: warning at 80%, critical at 90%.
define service{
        use                     generic-service
        host_name               windows-server
        service_description     Memory Usage
        check_command           check_nt!MEMUSE!-w 80 -c 90
        }
Memory usage check: warning at 80%, critical at 90%.
define service{
        use                     generic-service
        host_name               windows-server
        service_description     C:\Drive Space
        check_command           check_nt!USEDDISKSPACE!-l c -w 90 -c 95
        }
C: drive usage check; -l takes the drive letter; warning at 90%, critical at 95%.
define service{
        use                     generic-service
        host_name               windows-server
        service_description     netlogon
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l netlogon
        }
The stock configuration monitors the W3SVC service (IIS). The test machine is my own PC, which has no IIS, so I switched to the netlogon service. All service checks follow this same format.
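Because every Windows service check uses the same shape, the definitions can be generated instead of typed by hand. The following is only a sketch; the function name `win_service_check` is made up, and the output mirrors the netlogon definition above with the service name substituted.

```shell
#!/bin/sh
# Print a SERVICESTATE service definition for one Windows service.
# Arguments: Nagios host_name, Windows service name.
win_service_check() {
    host=$1
    service=$2
    cat <<EOF
define service{
        use                     generic-service
        host_name               $host
        service_description     $service
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l $service
        }
EOF
}
```

For instance, `win_service_check windows-server W3SVC >> etc/objects/windows.cfg` would append a check for IIS on a host that runs it.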
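Restart the Nagios service.
8.7.4 Problems encountered during testing
8.7.4.1 A CRITICAL error appears
I no longer remember the exact message. After a long investigation it turned out that the McAfee firewall on Windows was blocking the check; switching to the IP address of a virtual machine fixed it. The same problem can also occur across network segments; in that case edit commands.cfg and append -t 30 to the command (the timeout defaults to 10 when omitted).
8.7.4.2 The following messages appear
NSClient - ERROR: Could not get data for 5 perhaps we don't collect data this far back?
NSClient - ERROR: Could not get value
Solution:
Open CMD, change to the NSClient++ install directory, and run:
nsclient++ /test
lodctr /r
nsclient++ /test
Reference:
http://www.nsclient.org/nscp/wiki/FAQ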
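When a Windows check goes CRITICAL because of a firewall or a slow link, running check_nt by hand with a longer timeout helps to tell the two cases apart. The helper below only prints the command line; it is a sketch in which the function name and the `NAGIOS_LIBEXEC` variable are hypothetical, and port 12489 is assumed because it is the port the stock check_nt command definition uses.

```shell
#!/bin/sh
# Print a check_nt command line for one variable with an explicit timeout.
# Arguments: host address, check_nt variable, timeout in seconds (default 30).
check_nt_cmdline() {
    host=$1
    variable=$2
    timeout=${3:-30}
    libexec=${NAGIOS_LIBEXEC:-/var/www/localhost/htdocs/nagios/libexec}
    echo "$libexec/check_nt -H $host -p 12489 -v $variable -t $timeout"
}
```

Running the printed command from the monitoring server, for example the one produced by `check_nt_cmdline 192.168.254.1 CLIENTVERSION`, shows whether the longer timeout is enough or the connection is blocked outright.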
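8.8 Monitoring a remote Linux machine with Nagios
The principle is the same as for monitoring the local machine. The difference is that NRPE and the Nagios plugins must be installed on the monitored machine to run the checks; the monitoring server then fetches the results and displays them.
First, since I tested under VMware, for convenience I simply cloned the monitoring host and named the clone Testclient. The original monitoring host is Test (default IP address 192.168.254.123, with the full cacti + nagios + ntop monitoring stack). Boot Testclient.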
8.8.1 On the monitored machine
vi /etc/conf.d/net to change the IP address:
modules=("ifconfig")
config_eth0=("192.168.254.124 netmask 255.255.255.0 brd 192.168.254.255")
routes_eth0=("default via 192.168.254.2")
Change the host name (optional):
hostname Testclient
Add the nagios user and set its password (can be skipped on a clone):
useradd nagios
passwd nagios
Install the Nagios plugins (can be skipped on a clone):
cd nagios-plugins-1.4.16
./configure --prefix=/var/www/localhost/htdocs/nagios/
make && make install
Install the NRPE monitoring software:
cd nrpe-2.13
./configure --prefix=/var/www/localhost/htdocs/nagios/
make all
Install the check_nrpe plugin:
make install-plugin
As mentioned earlier, check_nrpe is needed on the monitoring host, not on the monitored machine; it is installed here only for testing.
Install the daemon:
make install-daemon
Install the configuration file:
make install-daemon-config
Install the xinetd script:
make install-xinetd
Edit the script:
vi /etc/xinetd.d/nrpe
service nrpe
{
        flags           = REUSE
        socket_type     = stream
        port            = 5666
        wait            = no
        user            = nagios
        group           = nagios
        server          = /var/www/localhost/htdocs/nagios/bin/nrpe
        server_args     = -c /var/www/localhost/htdocs/nagios/etc/nrpe.cfg --inetd
        log_on_failure  += USERID
        disable         = no
        only_from       = 127.0.0.1 192.168.254.123
}
Start the NRPE service:
cd /var/www/localhost/htdocs/nagios/
ln -s etc/nrpe.cfg nrpe.cfg
vi nrpe.cfg
Change allowed_hosts:
allowed_hosts=127.0.0.1,192.168.254.123
bin/nrpe -c etc/nrpe.cfg -d
Check the status:
netstat -at | grep nrpe
tcp        0      0 *:nrpe                  *:*                     LISTEN
netstat -an | grep :5666
tcp        0      0 0.0.0.0:5666            0.0.0.0:*               LISTEN
OK!
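The same status check can be scripted so that it yields an exit status instead of output to read. This is a sketch that assumes the column layout of `netstat -an` shown above (local address in the fourth column, state in the last); the function name `is_listening` is illustrative.

```shell
#!/bin/sh
# Succeed if netstat-style input on stdin shows a TCP listener on PORT.
is_listening() {
    port=$1
    awk -v p=":$port" '
        # Match TCP lines in LISTEN state whose local address ends in :PORT.
        $1 ~ /^tcp/ && $NF == "LISTEN" &&
        substr($4, length($4) - length(p) + 1) == p { found = 1 }
        END { exit !found }
    '
}
```

Typical use would be `netstat -an | is_listening 5666 && echo "NRPE is listening"`.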
Add the monitoring commands (in nrpe.cfg):
command[check_users]=/var/www/localhost/htdocs/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/var/www/localhost/htdocs/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/var/www/localhost/htdocs/nagios/libexec/check_disk -w 20% -c 10% -p /
command[check_zombie_procs]=/var/www/localhost/htdocs/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/var/www/localhost/htdocs/nagios/libexec/check_procs -w 150 -c 200
command[check_free_swap]=/var/www/localhost/htdocs/nagios/libexec/check_swap -w 20% -c 10%
Run /nagios/libexec/check_nrpe -h to see the exact command syntax. Note that the command marked in green is for use on the monitoring host itself.
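The names inside the square brackets are what the monitoring host later passes to check_nrpe, so it is handy to list them. A minimal sketch, assuming one `command[name]=...` entry per line; the function name `nrpe_command_names` is illustrative.

```shell
#!/bin/sh
# Print the names of all active command[...] entries in an nrpe.cfg file.
nrpe_command_names() {
    # Commented-out entries do not start with "command[" and are skipped.
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}
```

For example, `nrpe_command_names etc/nrpe.cfg` run from the Nagios prefix prints one command name per line.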
Restart the NRPE service.
8.8.2 On the monitoring host
This part is simple; it works the same way as monitoring the local machine.
Define the host:
define host{
        host_name               LinuxClient
        alias                   ZhengzhouPC
        address                 192.168.254.124
        check_command           check-host-alive
        max_check_attempts      5
        check_period            24x7
        contact_groups          nagios
        notification_interval   10
        notification_period     24x7
        notification_options    d,u,r
        }
Define the services; one of them is shown here:
define service{
        host_name               LinuxClient
        service_description     checkfreeswap
        check_command           check_nrpe!check_free_swap
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        contact_groups          nagios
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        }
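The remaining NRPE services differ from the one above only in the command name, so they can be generated. The sketch below reuses the same settings; the function name `nrpe_service` is made up, the service_description is simply set to the command name, and a `check_nrpe` command definition is assumed to exist in commands.cfg on the monitoring host.

```shell
#!/bin/sh
# Print a service definition that runs one NRPE command on a remote host.
# Arguments: Nagios host_name, NRPE command name as defined in nrpe.cfg.
nrpe_service() {
    host=$1
    cmd=$2
    cat <<EOF
define service{
        host_name               $host
        service_description     $cmd
        check_command           check_nrpe!$cmd
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        contact_groups          nagios
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        }
EOF
}
```

A loop such as `for c in check_users check_load check_hda1; do nrpe_service LinuxClient "$c"; done` prints one definition per command, ready to be appended to the services file.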
Finally, a screenshot of the finished setup:
