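In a statistics project, the hardest part to implement is collecting the log data. The logs are spread across data centers all over the country and the volume is fairly large, so an approach like rsync+inotify clearly cannot meet the need for fast log synchronization. You can of course use fluentd or flume to collect log data, but it is also possible to write a simple system yourself.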
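The log analysis system I wrote works as follows:
- 1 The client collects the data and sends it to the server with a redis PUBLISH.
- 2 The server is a redis subscriber (SUBSCRIBE); it either stores all the data in a single file or filters it on the spot.
Collecting newly appended log data on the client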
- #!/bin/bash
- # Poll a log file, print the lines added since the last check and the resulting QPS.
- LOGFILE=$1
- if [ ! -f "${LOGFILE}" ]; then
- echo "LOG file not given or it's not a file" >&2
- exit 1
- fi
- sleep_time=2
- count_init=$(wc -l < "${LOGFILE}")
- while true
- do
- DATE_NEW=$(date +%s)
- count_new=$(wc -l < "${LOGFILE}")
- add_count=$((count_new - count_init))
- # A negative difference means the file was truncated or rotated; treat the whole file as new
- if [ "${add_count}" -lt 0 ]
- then
- add_count=${count_new}
- fi
- count_init=${count_new}
- QPS=$((add_count / sleep_time))
- info=$(tail -n "${add_count}" "${LOGFILE}")
- echo "${info}"
- # The value of info can be passed on from here
- echo " The QPS at $(date -d "1970-01-01 UTC ${DATE_NEW} seconds" +"%Y-%m-%d %H:%M:%S") is ${QPS}"
- sleep "${sleep_time}"
- done
The script also prints the newly added log lines in real time.
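Counting lines with `wc -l` and then calling `tail -n` leaves a small window in which lines written between the two calls can be reported twice or missed. A minimal sketch of an alternative that tracks a byte offset instead is shown here; the function names `current_size`, `read_new_data`, and `follow_log` are hypothetical and not part of the original script.

```shell
#!/bin/bash
# Sketch: return only the bytes appended to a log since the last poll.
# All function names here are hypothetical helpers, not from the article.

# Usage: current_size FILE
# Prints the size of FILE in bytes.
current_size() {
    wc -c < "$1" | tr -d ' '
}

# Usage: read_new_data FILE OFFSET
# Prints everything in FILE after byte OFFSET.
read_new_data() {
    local file=$1 offset=$2
    # tail -c +N starts output at byte N (1-based), so skip the first OFFSET bytes.
    tail -c "+$((offset + 1))" "$file"
}

# Usage: follow_log FILE [INTERVAL_SECONDS]
# Polls FILE forever and prints each newly appended chunk exactly once.
follow_log() {
    local file=$1 interval=${2:-2} offset size
    offset=$(current_size "$file")
    while true; do
        sleep "$interval"
        size=$(current_size "$file")
        # A smaller size means the file was truncated or rotated: start over.
        [ "$size" -lt "$offset" ] && offset=0
        if [ "$size" -gt "$offset" ]; then
            # head -c caps the output at the size measured above, so data
            # that arrives during this read is left for the next poll.
            read_new_data "$file" "$offset" | head -c "$((size - offset))"
            offset=$size
        fi
    done
}
```

Calling `follow_log /path/to/access.log 2` would behave much like the polling loop above; the path is only a placeholder.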
To send the data to the server, we only need to add the following command to the script:
/root/redis-bash-cli -h 10.10.10.61 PUBLISH rui "$info"
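The command above publishes a whole batch of lines as one message. Because the subscriber script in this article splits each reply on newlines, a multi-line payload appears to end up spread over several array elements. One way around this is to publish every log line as its own message. The following is a sketch under that assumption; `publish_lines` and `PUBLISH_CMD` are hypothetical names, and the default command simply mirrors the call shown above.

```shell
#!/bin/bash
# Sketch: publish each log line as a separate pub/sub message.
# publish_lines and PUBLISH_CMD are hypothetical names; the default value
# mirrors the redis-bash-cli call used in the article.
PUBLISH_CMD=${PUBLISH_CMD:-"/root/redis-bash-cli -h 10.10.10.61 PUBLISH"}

# Usage: some_command | publish_lines CHANNEL
publish_lines() {
    local channel=$1 line
    # "|| [ -n "$line" ]" also handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # Skip empty lines so that no empty messages are published.
        [ -z "$line" ] && continue
        # PUBLISH_CMD is intentionally unquoted so it splits into the command
        # and its options; stdin is redirected so the command cannot consume
        # the lines that the loop is still reading.
        ${PUBLISH_CMD} "$channel" "$line" > /dev/null < /dev/null
    done
}
```

Inside the polling loop this could be used as `printf '%s\n' "$info" | publish_lines rui`.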
redis-bash-cli is the client-side script that publishes the data:
- #!/bin/bash
- source /usr/share/redis-bash/redis-bash-lib 2> /dev/null
- if [ $? -ne 0 ]; then
- LIBFOLDER=${0%/${0##*/}}
- source ${LIBFOLDER}/redis-bash-lib 2> /dev/null
- if [ $? -ne 0 ]; then
- echo "can't find redis-bash-lib in /usr/share/redis-bash or ${LIBFOLDER}"
- exit 127
- fi
- fi
- REDISHOST=localhost
- REDISPORT=6379
- REPEAT=1
- DELAY=0
- while getopts ":h:n:p:r:a:i:" opt
- do
- case ${opt} in
- h) REDISHOST=${OPTARG};;
- n) REDISDB=${OPTARG};;
- p) REDISPORT=${OPTARG};;
- r) REPEAT=${OPTARG};;
- a) AUTH=${OPTARG};;
- i) DELAY=${OPTARG};;
- esac
- done
- shift $((${OPTIND} - 1))
- if [ "${REDISHOST}" != "" ] && [ "${REDISPORT}" != "" ]
- then
- exec 6<>/dev/tcp/${REDISHOST}/${REDISPORT} # open fd
- if [ $? -ne 0 ]; then
- exit 1
- fi
- else
- echo "Wrong arguments"
- exit 255
- fi
- [ "${AUTH}" != "" ] && redis-client 6 AUTH ${AUTH} > /dev/null
- [ "${REDISDB}" != "" ] && redis-client 6 SELECT ${REDISDB} > /dev/null
- for ((z=1;z<=${REPEAT};z++))
- do
- redis-client 6 "${@}"
- if [ $? -ne 0 ]; then
- exit 1
- fi
- [ ${DELAY} -gt 0 ] && sleep ${DELAY}
- done
- exec 6>&- #close fd
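The script relies on `redis-client` from redis-bash-lib, which is not shown in this article. As an illustration of what has to travel over the `/dev/tcp` socket, here is a minimal sketch that encodes a command in the Redis serialization protocol (RESP) as an array of bulk strings. `resp_encode` is a hypothetical helper, not the library's actual function.

```shell
#!/bin/bash
# Sketch: encode a Redis command as a RESP array of bulk strings.
# resp_encode is a hypothetical helper, not redis-client from redis-bash-lib.

# Usage: resp_encode COMMAND [ARG...]
resp_encode() {
    local arg
    # RESP needs byte lengths, so force the C locale for ${#arg}.
    local LC_ALL=C
    # Array header: "*" followed by the number of elements.
    printf '*%d\r\n' "$#"
    for arg in "$@"; do
        # Each element: "$" followed by its byte length, then the data itself.
        printf '$%d\r\n%s\r\n' "${#arg}" "$arg"
    done
}

# Example (not executed here): send a PUBLISH over a socket opened the same
# way as in the script above.
#   exec 6<>/dev/tcp/10.10.10.61/6379
#   resp_encode PUBLISH rui "some log line" >&6
#   exec 6>&-
```

Reading and decoding the server's reply is left out of this sketch.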
Log server
redis-publish-test is the log server script; it subscribes to the channel and receives the published data:
- #!/bin/bash
- source /usr/share/redis-bash/redis-bash-lib 2> /dev/null
- if [ $? -ne 0 ]; then
- LIBFOLDER=${0%/${0##*/}}
- echo $LIBFOLDER
- source ${LIBFOLDER}/redis-bash-lib 2> /dev/null
- if [ $? -ne 0 ]; then
- echo "can't find redis-bash-lib in /usr/share/redis-bash or ${LIBFOLDER}"
- exit 127
- fi
- fi
- REDISHOST=localhost
- REDISPORT=6379
- while getopts ":h:p:" opt
- do
- case ${opt} in
- h) REDISHOST=${OPTARG};;
- p) REDISPORT=${OPTARG};;
- esac
- done
- shift $((${OPTIND} - 1))
- while true
- do
- exec 5>&-
- if [ "${REDISHOST}" != "" ] && [ "${REDISPORT}" != "" ]
- then
- exec 5<>/dev/tcp/${REDISHOST}/${REDISPORT} # open fd
- else
- echo "Wrong arguments"
- exit 255
- fi
- redis-client 5 SUBSCRIBE ${1} > /dev/null # subscribe to the pubsub channel in fd 5
- while true
- do
- unset ARGV
- OFS=${IFS};IFS=$'\n' # split the return correctly
- ARGV=($(redis-client 5))
- IFS=${OFS}
- if [ "${ARGV[0]}" = "message" ] && [ "${ARGV[1]}" = "${1}" ]
- then
- # Print the payload only; never execute text received from the network
- printf '%s\n' "${ARGV[2]}"
- echo "Message from pubsub channel: ${ARGV[2]}"
- elif [ -z "${ARGV[0]}" ]
- then
- sleep 1
- break
- fi
- done
- done
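The workflow described at the start says the server either stores everything in one file or filters the data on the spot. A minimal sketch of such a handler follows. `handle_message`, `ALL_LOG`, `ERROR_LOG`, and `DEFAULT_PATTERN` are hypothetical names, and the default pattern assumes an access-log format in which an HTTP status code appears surrounded by spaces.

```shell
#!/bin/bash
# Sketch: store every received message in one file and keep a filtered copy.
# All names below are hypothetical; the paths are placeholders.
ALL_LOG=${ALL_LOG:-/tmp/all.log}
ERROR_LOG=${ERROR_LOG:-/tmp/errors.log}
# Assumption: lines contain an HTTP status such as " 404 " or " 502 ".
DEFAULT_PATTERN=' (4|5)[0-9][0-9] '

# Usage: handle_message PAYLOAD [PATTERN]
handle_message() {
    local payload=$1
    local pattern=${2:-$DEFAULT_PATTERN}
    # Always append the raw line to the aggregate file.
    printf '%s\n' "$payload" >> "$ALL_LOG"
    # [[ =~ ]] matches an extended regular expression; the payload is only
    # compared as text and never executed.
    if [[ $payload =~ $pattern ]]; then
        printf '%s\n' "$payload" >> "$ERROR_LOG"
    fi
}
```

In the subscriber loop, the lines that print the message could then be replaced by a call such as `handle_message "${ARGV[2]}"`.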
This article is reposted from rfyiamcool's 51CTO blog. Original link: http://blog.51cto.com/rfyiamcool/1191926. To repost it, please contact the original author.