Logstash log collection and analysis system: Elasticsearch & Kibana

Logstash log collection and analysis system

This article covers fairly old versions; see http://bbotte.com/ for newer material.

Logstash provides a powerful pipeline for storing, querying, and analyzing your logs. When using Elasticsearch as a backend data store and Kibana as a frontend reporting tool, Logstash acts as the workhorse. It includes an arsenal of built-in inputs, filters, codecs, and outputs, enabling you to harness some powerful functionality with a small amount of effort.

http://semicomplete.com/files/logstash/  Logstash collects the logs; it needs a Java runtime

logstash-1.4.2.tar.tar jdk-7u67-linux-x64.rpm
http://www.elasticsearch.org/overview/elkdownloads  Elasticsearch search engine; this page also links to the documentation
http://www.elasticsearch.org/overview/kibana/installation/    Kibana provides the web interface
http://redis.io/download                              Redis: redis-2.8.19.tar.gz
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-plugins.html  Elasticsearch plugins

https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns  Logstash grok patterns (regular expressions)

Documentation
http://www.elasticsearch.org/guide/
http://logstash.net/docs/1.4.2/
https://github.com/elasticsearch/kibana/blob/master/README.md

http://www.elasticsearch.cn/

http://logstash.es/

http://www.elastic.co/

http://kibana.logstash.es/content/

http://shgy.gitbooks.io/mastering-elasticsearch/content/

System: CentOS 6.5, 64-bit

Packages installed:

jdk-7u67-linux-x64.rpm

redis-2.8.19.tar.gz

logstash-1.4.2.tar.tar

elasticsearch-1.4.2.zip  # install the newer 1.4.4 release instead (it fixes a vulnerability); ideally the Logstash and Elasticsearch versions should match

kibana-3.1.2.zip

#Install Java and Redis

 
    # rpm -ivh jdk-7u67-linux-x64.rpm
    # /usr/java/jdk1.7.0_67/bin/java -version
    # vim ~/.bashrc
    export JAVA_HOME=/usr/java/jdk1.7.0_67
    export JRE_HOME=${JAVA_HOME}/jre
    export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
    export PATH=${JAVA_HOME}/bin:$PATH
    # . ~/.bashrc
    # java -version                          # verify the Java installation
    java version "1.7.0_67"
    Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
    Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
    # tar -xzf redis-2.8.19.tar.gz
    # cd redis-2.8.19
    # make
    # make install
    # ./utils/install_server.sh
    Port           : 6379
    Config file    : /etc/redis/6379.conf
    Log file       : /var/log/redis_6379.log
    Data dir       : /var/lib/redis/6379
    Executable     : /usr/local/bin/redis-server
    Cli Executable : /usr/local/bin/redis-cli
    # service redis_6379 restart             # start Redis
    # redis-cli ping

#Install Logstash and Elasticsearch

 
    # mkdir /var/www/logstash
    # unzip elasticsearch-1.4.2.zip -d /var/www/logstash
    # cd /var/www/logstash
    # ln -s elasticsearch-1.4.2/ elasticsearch
    # cd elasticsearch
    # ./bin/elasticsearch -f                 # start Elasticsearch with the default config file; 1.x already runs in the foreground, so -f is rejected (see the getopt message)
    getopt: invalid option -- 'f'
    [2015-02-09 16:15:24,502][INFO ][node                     ] [Amergin] version[1.4.2], pid[4718], build[927caff/2014-12-16T14:11:12Z]
    [2015-02-09 16:15:24,502][INFO ][node                     ] [Amergin] initializing ...
    [2015-02-09 16:15:24,518][INFO ][plugins                  ] [Amergin] loaded [], sites []
    [2015-02-09 16:15:27,945][INFO ][node                     ] [Amergin] initialized
    [2015-02-09 16:15:27,945][INFO ][node                     ] [Amergin] starting ...
    [2015-02-09 16:15:28,232][INFO ][transport                ] [Amergin] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.10.1:9300]}
    [2015-02-09 16:15:28,300][INFO ][discovery                ] [Amergin] elasticsearch/mvrxUfixSPKQKzb3s_nFug
    [2015-02-09 16:15:32,091][INFO ][cluster.service          ] [Amergin] new_master [Amergin][mvrxUfixSPKQKzb3s_nFug][manager][inet[/192.168.10.1:9300]], reason: zen-disco-join (elected_as_master)
    [2015-02-09 16:15:32,143][INFO ][http                     ] [Amergin] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.10.1:9200]}
    [2015-02-09 16:15:32,143][INFO ][node                     ] [Amergin] started
    [2015-02-09 16:15:32,162][INFO ][gateway                  ] [Amergin] recovered [0] indices into cluster_state
    # curl -X GET http://localhost:9200      # or open http://192.168.10.1:9200/ in a browser
    {
      "status" : 200,
      "name" : "Amergin",
      "cluster_name" : "elasticsearch",
      "version" : {
        "number" : "1.4.2",
        "build_hash" : "927caff6f05403e936c20bf4529f144f0c89fd8c",
        "build_timestamp" : "2014-12-16T14:11:12Z",
        "build_snapshot" : false,
        "lucene_version" : "4.10.2"
      },
      "tagline" : "You Know, for Search"
    }
    # tar -xzf logstash-1.4.2.tar.tar
    # cd logstash-1.4.2
    # ./bin/logstash -h                      # show the help text

#The tests below show how Logstash works

 
    # echo "`date` hello world"
    Mon Feb  9 16:36:15 CST 2015 hello world
    # Test Logstash with stdin as input and stdout as output:
    # bin/logstash -e 'input { stdin { } } output { stdout {} }'
    Mon Feb  9 16:36:15 CST 2015 hello world                                          # enter this line (paste it, do not type it by hand)
    2015-02-09T08:36:23.190+0000 manager Mon Feb  9 16:36:15 CST 2015 hello world     # the event as processed by Logstash
    # Test stdin as input with Elasticsearch as output, then inspect what was stored:
    # /var/www/logstash/elasticsearch/bin/elasticsearch             # Elasticsearch must be running as well
    # bin/logstash -e 'input { stdin { } } output { elasticsearch { host => localhost } }'
    you know, for logs                                              # enter this line
    # curl 'http://localhost:9200/_search?pretty'                   # show the data as stored by Elasticsearch
    {
      "took" : 64,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 1,
        "max_score" : 1.0,
        "hits" : [ {
          "_index" : "logstash-2015.02.09",
          "_type" : "logs",
          "_id" : "IFmPqi0dQjSNZR5-94NuHg",
          "_score" : 1.0,
          "_source":{"message":"you know, for logs","@version":"1","@timestamp":"2015-02-09T08:48:48.747Z","host":"manager"}
        } ]
      }
    }
    #You’ve successfully stashed logs in Elasticsearch via Logstash
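The `_search` response above shows the shape of a stored event: `message`, `@version`, `@timestamp` and `host`, filed under a per-day index. As a rough illustration only (this is not Logstash's own code), the Python sketch below rebuilds that document and derives the index name, assuming the elasticsearch output keeps its default daily `logstash-YYYY.MM.dd` naming based on the event's UTC timestamp. The helper names `make_event` and `daily_index` are made up for this example.

```python
from datetime import datetime, timezone


def make_event(message: str, host: str, when: datetime) -> dict:
    """Build a dict shaped like the _source document in the search result."""
    # The timestamp is stored in UTC with millisecond precision.
    stamp = when.astimezone(timezone.utc)
    millis = stamp.microsecond // 1000
    return {
        "message": message,
        "@version": "1",
        "@timestamp": stamp.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z",
        "host": host,
    }


def daily_index(event: dict, prefix: str = "logstash") -> str:
    """Derive the per-day index name from the event's UTC date (assumed default)."""
    day = event["@timestamp"][:10]  # e.g. "2015-02-09"
    return f"{prefix}-{day.replace('-', '.')}"


event = make_event(
    "you know, for logs",
    "manager",
    datetime(2015, 2, 9, 8, 48, 48, 747000, tzinfo=timezone.utc),
)
print(event["@timestamp"])   # 2015-02-09T08:48:48.747Z
print(daily_index(event))    # logstash-2015.02.09
```

Because the index name follows the UTC date, an event logged shortly after midnight local time (CST here) still lands in the previous day's index.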

#Install an Elasticsearch plugin and test it

 
    # cd /var/www/logstash/elasticsearch/bin/                     # install the kopf plugin
    # ./plugin -install lmenezes/elasticsearch-kopf
    # Now test the kopf plugin:
    # /var/www/logstash/elasticsearch/bin/elasticsearch
    # bin/logstash -e 'input { stdin { } } output { elasticsearch { host => localhost } stdout { } }'
    hello world
    2015-02-09T09:07:35.590+0000 manager hellhello world
    hello logstash
    2015-02-09T09:09:26.981+0000 manager hello logstash
    # curl 'http://localhost:9200/_search?pretty'                 # shows the events just indexed, in structured form
    # curl 'http://localhost:9200/_plugin/kopf/'                  # fetches the plugin page, although curl shows nothing useful
    # Opening 192.168.10.1:9200/_plugin/kopf/ in a browser brings up the kopf interface, where you can browse the data stored in Elasticsearch along with its settings and mappings

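There are many other good plugins that can be installed as well:

es_head: mainly provides health-status queries; its tabs also offer a simple form for submitting API requests. es_head can now be installed directly with elasticsearch/bin/plugin -install mobz/elasticsearch-head; then enter http://$eshost:9200/_plugin/head/ in a browser to see the state of the cluster, nodes, indices and shards.

bigdesk: mainly provides real-time monitoring of node status, covering the JVM, Linux and Elasticsearch itself. It is very useful when tracking down performance problems, and it can now also be installed directly with elasticsearch/bin/plugin -install lukas-vlcek/bigdesk; then enter http://$eshost:9200/_plugin/bigdesk/ in a browser. Note that if you index with bulk requests and choose too long a refresh interval, the indexing-per-second figure will be inaccurate.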

#Processing logs with Elasticsearch

 
    # /var/www/logstash/elasticsearch/bin/elasticsearch -d -p /var/run/elasticsearch.pid   # start Elasticsearch as a daemon
    # Have Logstash process the Apache error log:
    # vi logstash-apache.conf
    input {
      file {
        path => "/var/log/httpd/error_log"
        start_position => beginning
      }
    }
    filter {
      if [path] =~ "error" {
        mutate { replace => { "type" => "apache_error" } }
        grok {
          # COMBINEDAPACHELOG describes the access-log format; error_log lines
          # that do not match it are tagged _grokparsefailure
          match => { "message" => "%{COMBINEDAPACHELOG}" }
        }
      }
      date {
        match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
      }
    }
    output {
      elasticsearch {
        host => localhost
      }
      stdout { codec => rubydebug }
    }
    # bin/logstash -f logstash-apache.conf
    # Wait about twenty seconds. If nothing is printed, open the log in vim, copy the last line and paste it at the end to simulate a new log entry
    # Logstash now reads the Apache error log and prints the events in the terminal; they also show up at http://192.168.10.1:9200/_search?pretty in a browser

Further testing

 
    # Have Logstash process all the Apache logs:
    # vi logstash-apache.conf
    input {
      file {
        path => "/var/log/httpd/*_log"
      }
    }
    filter {
      if [path] =~ "access" {
        mutate { replace => { type => "apache_access" } }
        grok {
          match => { "message" => "%{COMBINEDAPACHELOG}" }
        }
        date {
          match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
        }
      } else if [path] =~ "error" {
        mutate { replace => { type => "apache_error" } }
      } else {
        mutate { replace => { type => "random_logs" } }
      }
    }
    output {
      elasticsearch { host => localhost }
      stdout { codec => rubydebug }
    }
    # bin/logstash -f logstash-apache.conf
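To make the conditional chain in this config easier to follow, here is a small Python sketch of the same decisions: the event type is chosen from the file path, and the access-log timestamp is parsed with a `strptime` format that roughly corresponds to `dd/MMM/yyyy:HH:mm:ss Z`. The function names and sample paths are illustrative assumptions, not part of Logstash.

```python
import re
from datetime import datetime, timezone


def classify(path: str) -> str:
    """Pick a type the way the if / else if / else chain does."""
    # In a Logstash conditional, =~ is an unanchored regex match.
    if re.search("access", path):
        return "apache_access"
    if re.search("error", path):
        return "apache_error"
    return "random_logs"


def parse_apache_time(value: str) -> datetime:
    """Rough strptime equivalent of the pattern dd/MMM/yyyy:HH:mm:ss Z."""
    # %b relies on English month names (the default C locale).
    local = datetime.strptime(value, "%d/%b/%Y:%H:%M:%S %z")
    return local.astimezone(timezone.utc)


for path in (
    "/var/log/httpd/access_log",
    "/var/log/httpd/error_log",
    "/var/log/httpd/ssl_request_log",
):
    print(path, "->", classify(path))

print(parse_apache_time("09/Feb/2015:16:36:15 +0800").isoformat())
```

Since the input glob is `/var/log/httpd/*_log`, any file whose name contains neither "access" nor "error" falls through to `random_logs`.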

 
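Notes:

The life of an event

Inputs, Outputs, Codecs and Filters are the core configuration items of Logstash. By building an event-processing pipeline, Logstash extracts data from your logs and stores it in Elasticsearch, which provides the basis for efficient querying.

Inputs

Inputs are the mechanism for passing log data into Logstash. Commonly used inputs include:

file: reads from a file on the filesystem, much like the UNIX command "tail -0F"

syslog: listens on port 514 and parses log data according to RFC 3164

redis: reads from a Redis server, supporting both channel (publish/subscribe) and list modes. In a centralized Logstash setup Redis usually plays the "broker" role, holding a queue of events for Logstash to consume.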
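The "broker" role in list mode amounts to a queue: shippers append JSON events to the tail of a Redis list and the indexer pops them from the head. The sketch below imitates that flow with an in-memory stand-in so that it runs without a Redis server; `ListBroker` is a hypothetical helper, and the `type` value is borrowed from the `logstash.conf` later in this article purely for illustration.

```python
import json
from collections import deque


class ListBroker:
    """In-memory stand-in for a single Redis list (hypothetical helper)."""

    def __init__(self) -> None:
        self._items = deque()

    def rpush(self, value: str) -> int:
        # Append to the tail, as a shipper would.
        self._items.append(value)
        return len(self._items)

    def lpop(self):
        # Pop from the head, as the consuming side would; None when empty.
        return self._items.popleft() if self._items else None


broker = ListBroker()

# Shipper side: serialize each event to JSON and queue it.
for line in ["first line", "second line"]:
    broker.rpush(json.dumps({"message": line, "type": "redis-input"}))

# Indexer side: drain the queue in arrival order.
consumed = []
raw = broker.lpop()
while raw is not None:
    consumed.append(json.loads(raw)["message"])
    raw = broker.lpop()

print(consumed)
```

With a real Redis server the consumer would normally use a blocking pop so that it waits for new events instead of polling.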
        
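Filters

Filters are the intermediate processing components in the Logstash chain. They are often combined to perform specific actions on events that match particular rules. Common filters include:

grok: parses unstructured text into a structured format. Grok is currently the best way to turn unstructured data into structured, queryable data. With more than 120 built-in patterns, you will likely find one that meets your needs.

mutate: lets you change the incoming document; you can rename, remove, move or modify fields while an event is being processed.

drop: discards some events without processing them, for example debug events.

clone: makes a copy of an event, optionally adding or removing fields.

geoip: adds geographical information (used by the charts in the Kibana frontend)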
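To give a feel for what grok does, the Python sketch below uses one regular expression with named groups to turn an Apache combined-format access line into fields. It is a simplified approximation of `%{COMBINEDAPACHELOG}`, not the real pattern: the field names follow that pattern, but the expression itself and the sample line are assumptions made for this example.

```python
import re

# Simplified stand-in for %{COMBINEDAPACHELOG}; the real grok pattern is
# more permissive (for instance, it also copes with malformed requests).
COMBINED = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def grok(event: dict) -> dict:
    """Add named captures to the event, or tag it when the line does not match."""
    match = COMBINED.match(event["message"])
    if match is None:
        event.setdefault("tags", []).append("_grokparsefailure")
    else:
        event.update(match.groupdict())
    return event


line = ('127.0.0.1 - - [09/Feb/2015:16:36:15 +0800] '
        '"GET /index.html HTTP/1.1" 200 1024 "-" "curl/7.19.7"')
parsed = grok({"message": line})
print(parsed["clientip"], parsed["verb"], parsed["request"], parsed["response"])

failed = grok({"message": "not an access log line"})
print(failed["tags"])
```

The extracted `timestamp` field is what a date filter with the pattern `dd/MMM/yyyy:HH:mm:ss Z` would then parse.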
        
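Outputs

Outputs are the final stage of the Logstash pipeline. An event may pass through multiple outputs during processing, but once all outputs have finished, the event has completed its life cycle. Commonly used outputs include:

elasticsearch: the one to use if you plan to store your data efficiently and query it simply and conveniently

file: writes event data to a file

graphite: sends event data to Graphite, a popular open-source tool for storing and graphing metrics. http://graphite.wikidot.com/

statsd: sends event data to statsd, a statistics service that receives figures such as counters and timers over UDP and aggregates them to one or more backend services. If you already use statsd, this option should be useful to you.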
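For readers who have not seen statsd traffic, a counter or timer is a short text datagram of the form `name:value|type`. The sketch below only formats such datagrams; the metric names are invented for illustration, and `send` assumes a statsd daemon on its conventional UDP port 8125 (it is defined but not called here).

```python
import socket


def statsd_packet(name: str, value: int, kind: str) -> bytes:
    """Format one statsd datagram: 'c' marks a counter, 'ms' a timer."""
    suffix = {"counter": "c", "timer": "ms"}[kind]
    return f"{name}:{value}|{suffix}".encode("ascii")


def send(packet: bytes, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget UDP send; the host and port are assumptions."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (host, port))


# Hypothetical metric names derived from a parsed access-log event.
print(statsd_packet("apache.response.200", 1, "counter"))
print(statsd_packet("apache.request_time", 320, "timer"))
```

UDP keeps the sender cheap: if the statsd daemon is down, the datagrams are simply lost and log processing is not blocked.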
        
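Codecs

Codecs are stream-based filters that can be configured as part of an input or an output. They make it easy to separate the transport of your messages from their serialization. Popular codecs include json, msgpack and plain (text).

json: encodes or decodes data in JSON format

multiline: merges multi-line text events into a single event, for example a Java exception and its stack trace

For the complete configuration details, see the "plugin configuration" section of the Logstash documentation.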
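The multiline idea can be sketched in a few lines of Python: any line that starts with whitespace is treated as a continuation of the previous event, which roughly corresponds to configuring the codec with `pattern => "^\s"` and `what => "previous"`. The sample log lines are invented, and the rule is deliberately simplistic (a "Caused by:" line, for instance, would start a new event).

```python
def join_multiline(lines):
    """Fold continuation lines (leading whitespace) into the previous event."""
    events = []
    for line in lines:
        if events and line[:1].isspace():
            # Continuation: glue onto the event that is being built.
            events[-1] += "\n" + line
        else:
            # Any other line starts a new event.
            events.append(line)
    return events


lines = [
    'Exception in thread "main" java.lang.NullPointerException',
    "    at com.example.App.run(App.java:14)",
    "    at com.example.App.main(App.java:7)",
    "INFO  next request handled",
]
events = join_multiline(lines)
print(len(events))
print(events[0])
```

The three stack-trace lines become one event, so the exception and its trace can be searched and displayed together.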

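#The notes above explain how Logstash works; next, view the results in a web page with Kibana
#Kibana: Logstash already bundles Kibana in its vendor/kibana/ directory; you can also download kibana-3.1.2.zip and unpack it yourself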

 
    # unzip kibana-3.1.2.zip -d /var/www/logstash/kibana
    # ln -s /var/www/logstash/kibana/kibana-3.1.2 /var/www/logstash/kibana/kibana
    # vim /var/www/logstash/kibana/kibana/config.js
    /*    elasticsearch: "http://"+window.location.hostname+":9200",
    */
        elasticsearch: "http://192.168.10.1:9200",
        
    # vim /etc/httpd/conf.d/kibana.conf
    <VirtualHost *:80>
      DocumentRoot /var/www/logstash/kibana/kibana
      ServerName 192.168.10.1
      <Directory "/var/www/logstash/kibana/kibana">
        Options FollowSymLinks
        AllowOverride None
        Order allow,deny
        Allow from all
        php_value max_execution_time 300
        php_value memory_limit 128M
        php_value post_max_size 16M
        php_value upload_max_filesize 2M
        php_value max_input_time 300
        php_value date.timezone Asia/Shanghai
      </Directory>
    </VirtualHost>
        
    # vim logstash.conf
    input {
      file {
        type => "syslog"
        #    path => [ "/var/log/*.log", "/var/log/messages", "/var/log/syslog" ]
        path => [ "/var/log/messages", "/var/log/syslog" ]
        sincedb_path => "/var/sincedb"
      }
      redis {
        host => "192.168.10.1"
        type => "redis-input"
        data_type => "list"
        key => "logstash"
      }
      syslog {
        type => "syslog"
        port => "5544"
      }
    }
    filter {
      grok {
        type => "syslog"
        match => [ "message", "%{SYSLOGBASE2}" ]
        add_tag => [ "syslog", "grokked" ]
      }
    }
    output {
      elasticsearch { host => "192.168.10.1" }
    }
        
    # service httpd restart
    # vim /etc/redis/6379.conf
    bind 192.168.10.1
    # service redis_6379 restart
    # ps aux|grep redis|grep -v grep
    root      8340  0.1  0.7  40536  7448 ?        Ssl  07:23   0:00 /usr/local/bin/redis-server 192.168.10.1:6379
    # vim /var/www/logstash/elasticsearch/config/elasticsearch.yml
    http.cors.enabled: true                                         # add this line
    # see https://github.com/elastic/kibana/issues/1637
    # /var/www/logstash/elasticsearch/bin/elasticsearch -d -p /var/run/elasticsearch.pid  # restart the service as well
    # ./bin/logstash --configtest -f logstash.conf                  # test the config file
    Configuration OK
    # ./bin/logstash -v -f logstash.conf &
    # Once all the services are running, open http://192.168.10.1 in a browser to see Kibana's default page

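When new log entries are written, the data shown at http://192.168.10.1/index.html#/dashboard/file/guided.json changes accordingly. The next step is to study Elasticsearch search.

Directory where Elasticsearch stores the indexed log data: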

/var/www/logstash/elasticsearch/data/elasticsearch/nodes/0/indices

This article is reposted from bbotte's 51CTO blog. Original link: http://blog.51cto.com/bbotte/1613571. To republish it, please contact the original author.
