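Nagios & PowerShell: sharing a few simple, practical monitoring scripts

Introduction:

Microsoft's SCOM is certainly a good product, but there is also plenty of free, open-source monitoring software, and Nagios is one of them. For cost and other reasons, we have used Nagios to monitor our internal Windows and Linux servers and built our internal monitoring platform on top of it. Compared with VBS, PowerShell is the newer generation of scripting, so below are a few simple, practical PowerShell scripts that run under NSClient. I hope this gets the ball rolling and adds to the pool of scripts available for IT operations. One for all, all for one.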

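Script 1:

A "generic" check of a server's critical services. The idea: in our experience, whether a server runs AD, Exchange, or any other application, its critical services are configured to start automatically at boot. Given this rule of thumb, there is no need to monitor each application's services individually. Server health can instead be judged from the startup configuration: every service whose startup type is Automatic should be running, and if one is not, the server is unhealthy. The advantage of this script is that it is generic and concise.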

    # Check all services and alert when a service that should start automatically is not running.
    # The .NET optimization services (clr_optimization*) are excluded.
    # To execute from within NSClient++:
    #
    # [NRPE Handlers]
    # check_auto_services=cmd /c echo C:\Scripts\Nagios\AutoServicesHealth.ps1 ; exit($lastexitcode) | PowerShell.exe -Command -
    #
    # On the check_nrpe command include -t 30, since PowerShell can take some time to start.
    # 2014-01-22 chenyitai updated this script.

    $NagiosStatus = 0
    $Problems = @()

    # Services with startup type Automatic, excluding the .NET optimization services
    $services = Get-WmiObject Win32_Service |
        Where-Object { ($_.Name -notmatch "clr_optimization") -and ($_.StartMode -eq "Auto") }

    foreach ($AutoService in $services)
    {
        if ($AutoService.State -ne "Running")
        {
            $NagiosStatus = 2   # CRITICAL
            $Problems += $AutoService.Name + " is " + $AutoService.State
        }
    }

    if ($NagiosStatus -eq 2)
    {
        Write-Host ("CRITICAL: " + ($Problems -join ", "))
    }
    else
    {
        Write-Host "OK: All Auto services are Running."
    }

    exit $NagiosStatus
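
The same rule can also be written as a small function over service objects, which makes the exclusion list explicit and lets the logic be tried without touching a real server. This is only a sketch: the function name `Get-StoppedAutoService` and its `-ExcludePattern` parameter are hypothetical and not part of the original script. It assumes only the `Name`, `StartMode`, and `State` properties of `Win32_Service`.

```powershell
# Sketch only: Get-StoppedAutoService and -ExcludePattern are hypothetical names.
function Get-StoppedAutoService {
    param(
        [object[]]$Services,                               # objects with Name, StartMode, State
        [string[]]$ExcludePattern = @('clr_optimization')  # regex patterns for services to ignore
    )
    foreach ($svc in $Services) {
        if ($svc.StartMode -ne 'Auto') { continue }        # only services that should auto-start
        $excluded = $false
        foreach ($pattern in $ExcludePattern) {
            if ($svc.Name -match $pattern) { $excluded = $true; break }
        }
        if ($excluded) { continue }
        if ($svc.State -ne 'Running') { $svc }             # emit every offending service
    }
}

# Fake service objects standing in for the output of Get-WmiObject Win32_Service
$sample = @(
    (New-Object PSObject -Property @{ Name = 'Spooler';  StartMode = 'Auto';   State = 'Stopped' }),
    (New-Object PSObject -Property @{ Name = 'W32Time';  StartMode = 'Manual'; State = 'Stopped' }),
    (New-Object PSObject -Property @{ Name = 'clr_optimization_v4.0.30319_64'; StartMode = 'Auto'; State = 'Stopped' }),
    (New-Object PSObject -Property @{ Name = 'Dnscache'; StartMode = 'Auto';   State = 'Running' })
)

$stopped = @(Get-StoppedAutoService -Services $sample)
$names   = @($stopped | ForEach-Object { $_.Name })
```

With the sample data only `Spooler` is reported. On a real server the input would be `Get-WmiObject Win32_Service`, and a non-empty result would map to exit code 2.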


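Script 2:

Check whether critical certificates are about to expire. For security reasons, many services and protocols now use SSL or TLS for encryption, and that encryption relies on a certificate in the server's "Local Computer - Personal" store: the various web services over HTTPS, Exchange Outlook Anywhere, Lync, NPS, and so on. In our operations experience, incidents where a critical service stops working because a certificate was not renewed in time are, embarrassingly, not rare. The script below monitors certificate status and alerts us a number of days before a certificate expires.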

    # Check the certificates in Cert:\LocalMachine\My (Local Computer - Personal)
    # and alert when one expires in fewer than $ThresholdDays days.
    # With PowerShell 3.0 or later, certificates expiring within 10 days can also be listed with:
    #   Get-ChildItem -Path Cert:\LocalMachine\My -ExpiringInDays 10
    # List certificates by days until expiration:
    #   Get-ChildItem Cert: -Recurse | Where-Object {$_.NotAfter -gt (Get-Date)} | Select Subject,Thumbprint,@{Name="Expires in (Days)";Expression={($_.NotAfter).Subtract([DateTime]::Now).Days}} | Sort "Expires in (Days)"
    #
    # To execute from within NSClient++:
    #
    # [NRPE Handlers]
    # check_local_cert=cmd /c echo C:\Scripts\Nagios\LocalMachineMyCert.ps1 ; exit($lastexitcode) | PowerShell.exe -Command -
    #
    # On the check_nrpe command include -t 30, since PowerShell can take some time to start.
    # 2014-01-23 chenyitai updated this script.

    $NagiosStatus = 0
    $Expiring = @()
    $ThresholdDays = 10
    $today = Get-Date
    $deadline = $today.AddDays($ThresholdDays)

    $certs = Get-ChildItem -Path Cert:\LocalMachine\My

    foreach ($cert in $certs)
    {
        # Alert when the certificate has fewer than $ThresholdDays days left.
        # Dates are compared directly: comparing a TimeSpan with the integer 10
        # would treat 10 as ticks, not days, and only flag expired certificates.
        if ($cert.NotAfter -lt $deadline)
        {
            $Expiring += $cert.Subject + " " + $cert.SerialNumber + " " + $cert.NotAfter
            $NagiosStatus = 2   # Set the status to failed.
        }
    }

    if ($NagiosStatus -eq 2)
    {
        Write-Host ("CRITICAL: " + ($Expiring -join ", "))
    }
    else
    {
        Write-Host "OK: All Certificates are in validity period."
    }

    exit $NagiosStatus
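
The check above uses a single 10-day threshold. If an earlier heads-up is wanted, the same idea extends to two tiers, WARNING and CRITICAL. The following is a minimal sketch under stated assumptions: `Get-CertExpiryStatus`, its parameters, and the 30-day warning threshold are hypothetical and not part of the original script. The function only needs objects that expose `Subject` and `NotAfter`, so it can be tried with fake data.

```powershell
# Sketch only: Get-CertExpiryStatus and its parameters are hypothetical names.
function Get-CertExpiryStatus {
    param(
        [object[]]$Certificates,          # objects with Subject and NotAfter
        [int]$WarningDays  = 30,          # assumed value, adjust as needed
        [int]$CriticalDays = 10,
        [datetime]$Now = (Get-Date)
    )
    $code = 0                             # 0 = OK, 1 = WARNING, 2 = CRITICAL
    $details = @()
    foreach ($cert in $Certificates) {
        # Whole days left; negative once the certificate has expired
        $daysLeft = [int][math]::Floor(($cert.NotAfter - $Now).TotalDays)
        if ($daysLeft -lt $CriticalDays) {
            $code = 2
        } elseif ($daysLeft -lt $WarningDays) {
            if ($code -lt 1) { $code = 1 }
        } else {
            continue                      # far from expiry, nothing to report
        }
        $details += "$($cert.Subject) expires in $daysLeft days"
    }
    $label = @('OK', 'WARNING', 'CRITICAL')[$code]
    if ($code -eq 0) {
        $text = 'OK: All certificates are within their validity period.'
    } else {
        $text = "${label}: " + ($details -join ', ')
    }
    New-Object PSObject -Property @{ Code = $code; Message = $text }
}

# Fake certificates standing in for Get-ChildItem Cert:\LocalMachine\My
$now  = [datetime]'2014-01-23'
$fake = @(
    (New-Object PSObject -Property @{ Subject = 'CN=mail'; NotAfter = $now.AddDays(5)   }),
    (New-Object PSObject -Property @{ Subject = 'CN=web';  NotAfter = $now.AddDays(20)  }),
    (New-Object PSObject -Property @{ Subject = 'CN=lync'; NotAfter = $now.AddDays(400) })
)
$result = Get-CertExpiryStatus -Certificates $fake -Now $now
```

In a real check the script would end with `Write-Host $result.Message` followed by `exit $result.Code`, with `Get-ChildItem -Path Cert:\LocalMachine\My` as the input.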

Script 3:

Check whether the DAG and the databases on an Exchange 2010 mailbox server are healthy.

    # Test Mailbox Database and Content Index Health
    # Place in the C:\Scripts\ folder and edit nsc.ini to call:
    # check_mb_servername=cmd /c echo C:\Scripts\MailboxDatabaseHealth.ps1 ; exit($lastexitcode) | PowerShell.exe -Command -
    # 2014-01-21 chenyitai updated this script.

    $flag1 = 0   # becomes 1 when a database copy is neither Mounted nor Healthy
    $flag2 = 0   # becomes 2 when a content index is not Healthy
    $NagiosDescription1 = ""
    $NagiosDescription2 = ""

    if ( (Get-PSSnapin -Name Microsoft.Exchange.Management.PowerShell.E2010 -ErrorAction SilentlyContinue) -eq $null )
    {
        Add-PSSnapin Microsoft.Exchange.Management.PowerShell.E2010
    }

    # Get the status of every database copy on this server
    $Status = Get-MailboxDatabaseCopyStatus -Server $env:COMPUTERNAME

    foreach ($State in $Status)
    {
        # Database copy status
        if ($State.Status -notmatch '^(Mounted|Healthy)')
        {
            $NagiosDescription1 += "$($State.Name): $($State.Status) , "   # += appends to the string
            $flag1 = 1
        }

        # Content index status
        if ($State.ContentIndexState -notmatch '^Healthy')
        {
            $NagiosDescription2 += "$($State.Name) Index: $($State.ContentIndexState) , "
            $flag2 = 2
        }
    }

    # 0 = all healthy, 1 = database problem, 2 = index problem, 3 = both
    $flag = $flag1 + $flag2

    if ($flag -eq 0) {
        Write-Host "OK: All Databases and Indexes Are Healthy"
        exit 0
    } elseif ($flag -eq 1) {
        Write-Host ("CRITICAL: " + $NagiosDescription1)
        exit 2
    } elseif ($flag -eq 2) {
        Write-Host ("WARNING: " + $NagiosDescription2)
        exit 1
    } elseif ($flag -eq 3) {
        Write-Host ("CRITICAL: " + $NagiosDescription1 + " " + $NagiosDescription2)
        exit 2
    }
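
The flag arithmetic in this script (1 for a database problem, 2 for an index problem, then the sum) can also be read as "report the worst severity seen". The sketch below shows that reading. `Get-CopySeverity` is a hypothetical helper, not part of the original script, and the status strings in the sample data are only example values.

```powershell
# Sketch only: Get-CopySeverity is a hypothetical helper, not part of the original script.
function Get-CopySeverity {
    param(
        [string]$Status,             # database copy status, e.g. Mounted, Healthy, Failed
        [string]$ContentIndexState   # content index state, e.g. Healthy, Failed
    )
    if ($Status -notmatch '^(Mounted|Healthy)') { return 2 }    # database problem: CRITICAL
    if ($ContentIndexState -notmatch '^Healthy') { return 1 }   # index problem only: WARNING
    return 0                                                    # everything healthy: OK
}

# The overall result is the worst severity across all database copies.
$copies = @(
    @{ Status = 'Mounted'; ContentIndexState = 'Healthy' },
    @{ Status = 'Healthy'; ContentIndexState = 'Failed'  }
)
$worst = 0
foreach ($copy in $copies) {
    $severity = Get-CopySeverity -Status $copy.Status -ContentIndexState $copy.ContentIndexState
    if ($severity -gt $worst) { $worst = $severity }
}
```

This yields the same exit codes as the flag sum: a database problem always ends in CRITICAL (2), an index-only problem in WARNING (1), and it extends more easily if further checks are added later.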

Script 4:

Check whether the queues on an Exchange 2010 hub server have grown too large.


    # Test Queue Health
    # To execute from within NSClient++:
    #
    # [NRPE Handlers]
    # check_exchange_mailqueue=cmd /c echo C:\Scripts\Nagios\ExchangeQueueHealth.ps1 ; exit($lastexitcode) | PowerShell.exe -Command -
    #
    # On the check_nrpe command include -t 30, since it takes some time to load the Exchange cmdlets.
    # 2014-01-21 chenyitai updated this script.

    $NagiosStatus = 0
    $Problems = @()

    if ( (Get-PSSnapin -Name Microsoft.Exchange.Management.PowerShell.E2010 -ErrorAction SilentlyContinue) -eq $null )
    {
        Add-PSSnapin Microsoft.Exchange.Management.PowerShell.E2010
    }

    # Get the status of every queue on this server
    $Status = Get-Queue -Server $env:COMPUTERNAME

    foreach ($Queue in $Status)
    {
        # Alert when a queue holds more than 50 messages
        if ($Queue.MessageCount -gt 50)
        {
            # Identity is embedded through string interpolation. Writing
            # $Queue.Identity + " queue has " fails with: method invocation failed because
            # [Microsoft.Exchange.Data.QueueViewer.QueueIdentity] does not contain a method named "op_Addition".
            $Problems += "$($Queue.Identity) queue has $($Queue.MessageCount) messages to $($Queue.NextHopDomain)"
            $NagiosStatus = 2   # Set the status to failed.
        }
    }

    if ($NagiosStatus -eq 2)
    {
        Write-Host ("CRITICAL: " + ($Problems -join ", "))
    }
    else
    {
        Write-Host "OK: All mail queues within limits."
    }

    exit $NagiosStatus

Because combining PowerShell with Nagios is still fairly new, please refer to the following page for how PowerShell scripts are run under Nagios:

http://nsclient.org/nscp/wiki/guides/nagios/external_scripts
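
As a rough illustration of that mechanism, the handler lines quoted in the script comments could be collected in nsc.ini along the following lines. This is a sketch, not a verified configuration: it assumes the older nsc.ini layout that the script comments refer to, the handler aliases other than `check_exchange_mailqueue` and `check_mb_servername` are hypothetical, and the timeout value is an assumption. The `; exit($lastexitcode)` part passes the script's exit code back to NSClient++, so that Nagios sees the state the script reported.

```ini
; nsc.ini fragment (sketch); aliases and the timeout value are illustrative
[modules]
NRPEListener.dll

[NRPE]
; allow time for PowerShell and the Exchange snap-in to load
command_timeout=60

[NRPE Handlers]
check_auto_services=cmd /c echo C:\Scripts\Nagios\AutoServicesHealth.ps1 ; exit($lastexitcode) | PowerShell.exe -Command -
check_local_cert=cmd /c echo C:\Scripts\Nagios\LocalMachineMyCert.ps1 ; exit($lastexitcode) | PowerShell.exe -Command -
check_mb_servername=cmd /c echo C:\Scripts\MailboxDatabaseHealth.ps1 ; exit($lastexitcode) | PowerShell.exe -Command -
check_exchange_mailqueue=cmd /c echo C:\Scripts\Nagios\ExchangeQueueHealth.ps1 ; exit($lastexitcode) | PowerShell.exe -Command -
```

On the Nagios server, each handler would then be called through check_nrpe with the longer timeout mentioned in the script comments, for example `check_nrpe -H <host> -t 30 -c check_auto_services`.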










This article is reposted from tigerkillu's 51CTO blog. Original link: http://blog.51cto.com/chenyitai/1354097. To republish it, please contact the original author.