<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html><head><meta http-equiv="Cont

简介: 我们以前使用过的对hbase和hdfs进行健康检查,及剩余hdfs容量告警,简单易用1.针对hadoop2的脚本:#/bin/bashbin=`dirname $0`bin=`cd $bin;pwd`STATE_OK=...

我们以前使用过的对hbase和hdfs进行健康检查,及剩余hdfs容量告警,简单易用

1.针对hadoop2的脚本:

#/bin/bash


bin=`dirname $0`
bin=`cd $bin;pwd`


STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3
STATE_DEPENDENT=4


source /etc/profile


DFS_REMAINING_WARNING=15
DFS_REMAINING_CRITICAL=5
ABNORMAL_QUERY="INCONSISTENT|CORRUPT|FAILED|Exception"


HADOOP_WEB_INTERFACE=h001.hadoop
HBASE_WEB_INTERFACE=h008.hadoop
# hbck and fsck report
output=/var/log/cluster-status
hbase hbck >> $output
hadoop fsck /apps/hbase >> $output


# check report
count=`egrep -c "$ABNORMAL_QUERY" $output`
if [ $count -eq 0 ]; then
echo "[OK] Cluster is healthy." >> $output
else
echo "[ABNORMAL] Cluster is abnormal!" >> $output


# Get the last matching entry in the report file
last_entry=`egrep "$ABNORMAL_QUERY" $output | tail -1`
echo "($count) $last_entry"


exit $STATE_CRITICAL
fi



# HDFS usage
dfs_remaining=`curl -s http://${HADOOP_WEB_INTERFACE}:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo |egrep -o "PercentRemaining.*" | egrep -o "[0-9]*\.[0-9]*"`
dfs_remaining_word="DFS Remaining%: ${dfs_remaining}%"


echo "$dfs_remaining_word" >> $output


# check HDFS usage
dfs_remaining=`echo $dfs_remaining | awk -F '.' '{print $1}'`


if [ $dfs_remaining -lt $DFS_REMAINING_CRITICAL ]; then
echo "Low DFS space. $dfs_remaining_word"
exit_status=$STATE_CRITICAL
elif [ $dfs_remaining -lt $DFS_REMAINING_WARNING ]; then
echo "Low DFS space. $dfs_remaining_word"
exit_status=$STATE_WARNING
else
echo "HBase check OK - DFS and HBase healthy. 
$dfs_remaining_word"
exit_status=$STATE_OK
fi
exit $exit_status







2.针对hadoop1的脚本:


#/bin/bash


bin=`dirname $0`
bin=`cd $bin;pwd`


STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3
STATE_DEPENDENT=4

source /etc/profile

DFS_REMAINING_WARNING=15
DFS_REMAINING_CRITICAL=5
ABNORMAL_QUERY="INCONSISTENT|CORRUPT|FAILED|Exception"

HADOOP_WEB_INTERFACE= hadoop的Namenode对外接口ip

# hbck and fsck report
output=/data/logs/cluster-status
$HBASE_HOME/bin/hbase hbck >> $output
$HADOOP_HOME/bin/hadoop fsck /hbase >> $output


# check report
count=`egrep -c "$ABNORMAL_QUERY" $output`
if [ $count -eq 0 ]; then
echo "[OK] Cluster is healthy." >> $output
else
echo "[ABNORMAL] Cluster is abnormal!" >> $output


# Get the last matching entry in the report file
last_entry=`egrep "$ABNORMAL_QUERY" $output | tail -1`
echo "($count) $last_entry"

exit $STATE_CRITICAL
fi


# Check RegionServer Status
dead_region_servers=`curl -s http://${HADOOP_WEB_INTERFACE}:60010/master-status | grep "Dead Region Servers" -A 500 | grep "Regions in Transition" -B 500 | egrep -o 'target="_blank">.*</a>' | awk -F">" '{print $2}' | awk -F"<" '{print $1}'`
if [ -z $dead_region_servers ];then
echo "[OK] All RegionServers is healthy." 
echo "[OK] All RegionServers is healthy." >> $output
else
echo "[ABNORMAL] the dead regionserver list:" >> $output
echo $dead_region_servers >> $output
exit $STATE_CRITICAL
fi


# HDFS usage
dfs_remaining=`curl -s http://${HADOOP_WEB_INTERFACE}:50070/dfshealth.jsp |egrep -o "DFS Remaining%.*%" | egrep -o "[0-9]*\.[0-9]*"`
dfs_remaining_word="DFS Remaining%: ${dfs_remaining}%"


echo "$dfs_remaining_word" >> $output


# check HDFS usage
dfs_remaining=`echo $dfs_remaining | awk -F '.' '{print $1}'`


if [ $dfs_remaining -lt $DFS_REMAINING_CRITICAL ]; then
echo "Low DFS space. $dfs_remaining_word"
exit_status=$STATE_CRITICAL
elif [ $dfs_remaining -lt $DFS_REMAINING_WARNING ]; then
echo "Low DFS space. $dfs_remaining_word"
exit_status=$STATE_WARNING
else
echo "HBase check OK - DFS and HBase healthy. 
$dfs_remaining_word"
exit_status=$STATE_OK
fi
exit $exit_status
目录
相关文章
|
Web App开发 前端开发 Android开发
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html><head><meta http-equiv="Cont
使用MAT分析内存泄露 对于大型服务端应用程序来说,有些内存泄露问题很难在测试阶段发现,此时就需要分析JVM Heap Dump文件来找出问题。
959 0
|
Web App开发 监控 前端开发
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html><head><meta http-equiv="Cont
系统的升级涉及各个架构组件,细节很多。常年累月的修修补补使老系统积累了很多问题。 系统升级则意味着需要repair之前埋下的雷,那为何还要升级,可以考虑以下几个方面 成熟老系统常见问题: 1. 缺乏文档(这应该是大小公司都存在的问题。
741 0
|
Web App开发 前端开发
|
Web App开发 前端开发
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html><head><meta http-equiv="Cont
PipeMapRed.waitOutputThreads(): subprocess failed with code X ,这里code X对应的信息如下:error code 1: Operation not perm...
1119 0
|
Web App开发 前端开发 算法
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html><head><meta http-equiv="Cont
基于大数据的精准营销与应用场景 2015年08月11日 大数据 大数据营销时代来临营销学领域过去半个多世纪的发展让我们见证了从“以产品为中心”到“以客户为中心”的转变。
1099 0
|
Web App开发 前端开发
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html><head><meta http-equiv="Cont
service cloudera-scm-agent stop service cloudera-scm-agent stop umount /var/run/cloudera-scm-agent/process umo...
901 0
|
SQL Web App开发 前端开发
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html><head><meta http-equiv="Cont
     如果在INSERT语句末尾指定了ON DUPLICATE KEY UPDATE,并且插入行后会导致在一个UNIQUE索引或PRIMARY KEY中出现重复值,则在出现重复值的行执行UPDATE;如果不会导致唯一值列重复的问题,则插入新行。
907 0
|
Web App开发 监控 前端开发
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html><head><meta http-equiv="Cont
使用hive分析日志作业很多的时候,需要修改mysql的默认连接数 修改方法   打开/etc/my.cnf文件 在[mysqld]  中添加 max_connections=1000 重启mysql服务  service mysqld restart mysql>show variables like '%max_connections%'; 查看当前mysql的连接数方法 mysqladmin -uroot -p status 其中,Uptime:mysqld运行时间,单位秒。
778 0
|
Web App开发 前端开发
|
Web App开发 前端开发
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html><head><meta http-equiv="Cont
1.使用lsmod查看ipv6的模块是否被加载。 lsmod | grep ipv6 [root@dmhadoop011 ~]# lsmod | grep ipv6 ipv6                  317340  127 bonding 如果加载了,则进行如下操作: 2.
918 0

热门文章

最新文章