ganglia metric extended by gmetric command line tool

简介:
上一篇文章简单的介绍了一下metric的相关知识 : 
ganglia gmond实际上是将metric collection模块化来处理的, 包括自带的metric, 也是模块化来加载的, 所以非常适合扩展.
例如 : 
[root@db-172-16-3-221 cpu]# cd /opt/ganglia-core-3.6.0/
[root@db-172-16-3-221 ganglia-core-3.6.0]# ll
total 24
drwxr-xr-x 2 root root 4096 Sep  9 11:03 bin
drwxr-xr-x 3 root root 4096 Sep 15 09:31 etc
drwxr-xr-x 2 root root 4096 Sep  9 11:03 include
drwxr-xr-x 3 root root 4096 Sep  9 11:03 lib64
drwxr-xr-x 2 root root 4096 Sep  9 11:03 sbin
drwxr-xr-x 3 root root 4096 Sep  9 11:03 share
[root@db-172-16-3-221 ganglia-core-3.6.0]# cd lib64/
[root@db-172-16-3-221 lib64]# ll
total 660
drwxr-xr-x 2 root root   4096 Sep  9 11:03 ganglia
lrwxrwxrwx 1 root root     25 Sep  9 11:03 libganglia-3.6.0.so.0 -> libganglia-3.6.0.so.0.0.0
-rwxr-xr-x 1 root root 257554 Sep  9 11:03 libganglia-3.6.0.so.0.0.0
-rw-r--r-- 1 root root 408276 Sep  9 11:03 libganglia.a
-rwxr-xr-x 1 root root   1270 Sep  9 11:03 libganglia.la
lrwxrwxrwx 1 root root     25 Sep  9 11:03 libganglia.so -> libganglia-3.6.0.so.0.0.0
[root@db-172-16-3-221 lib64]# cd ganglia/
[root@db-172-16-3-221 ganglia]# ll
total 700
-rwxr-xr-x 1 root root 86896 Sep  9 11:03 modcpu.so
-rwxr-xr-x 1 root root 84118 Sep  9 11:03 moddisk.so
-rwxr-xr-x 1 root root 17896 Sep  9 11:03 modgstatus.so
-rwxr-xr-x 1 root root 84078 Sep  9 11:03 modload.so
-rwxr-xr-x 1 root root 85832 Sep  9 11:03 modmem.so
-rwxr-xr-x 1 root root 31655 Sep  9 11:03 modmulticpu.so
-rwxr-xr-x 1 root root 84480 Sep  9 11:03 modnet.so
-rwxr-xr-x 1 root root 83862 Sep  9 11:03 modproc.so
-rwxr-xr-x 1 root root 53954 Sep  9 11:03 modpython.so
-rwxr-xr-x 1 root root 85136 Sep  9 11:03 modsys.so

在gmond的配置文件中加载模块 : 
# less /opt/ganglia-core-3.6.0/etc/gmond.conf
/* Each metrics module that is referenced by gmond must be specified and
   loaded. If the module has been statically linked with gmond, it does
   not require a load path. However all dynamically loadable modules must
   include a load path. */
modules {
  module {
    name = "core_metrics"
  }
  module {
    name = "cpu_module"
    path = "modcpu.so"
  }
  module {
    name = "disk_module"
    path = "moddisk.so"
  }
  module {
    name = "load_module"
    path = "modload.so"
  }
  module {
    name = "mem_module"
    path = "modmem.so"
  }
  module {
    name = "net_module"
    path = "modnet.so"
  }
  module {
    name = "proc_module"
    path = "modproc.so"
  }
  module {
    name = "sys_module"
    path = "modsys.so"
  }
}

当gmond自带的metric模块不能满足你的监控需求时, 你需要扩展metric collection.
扩展方法比较多, 例如直接修改gmond, 添加模块并重新编译gmond.  然后在gmond.conf中添加新增的模块, 并配置metric采集项.
或者使用c/c++, python , php, perl遵循gmond的标准来写扩展模块.  然后在gmond.conf中添加新增的模块, 并配置metric采集项.
(编译时需要开启php, perl的功能, 具体见源码里的 configure --help)
最后, 还可以使用gmetric命令直接向集群中的多播地址或单播地址发送metric信息.
显然使用gmetric来得最快, 但是使用gmetric有一个弊端, 必须用户自己调度gmetric的执行, 
而如果使用模块的话, 可以统一在gmond.conf中配置, gmond handler会根据gmond的配置来调用模块来完成metric的采集, 不需要用户去调度.
接下来要使用gmetric来测试一下自定义metric收集.

[root@db-172-16-3-221 ganglia-3.6.0]# gmetric --help
gmetric 3.6.0

The Ganglia Metric Client (gmetric) announces a metric
on the list of defined send channels defined in a configuration file

Usage: gmetric [OPTIONS]...

  -h, --help            Print help and exit
  -V, --version         Print version and exit
  -c, --conf=STRING     The configuration file to use for finding send channels 
                           (default=`/opt/ganglia-core-3.6.0/etc/gmond.conf')
  -n, --name=STRING     Name of the metric
  -v, --value=STRING    Value of the metric
  -t, --type=STRING     Either 
                          string|int8|uint8|int16|uint16|int32|uint32|float|double
  -u, --units=STRING    Unit of measure for the value e.g. Kilobytes, Celcius  
                          (default=`')
  -s, --slope=STRING    Either zero|positive|negative|both  (default=`both')
  -x, --tmax=INT        The maximum time in seconds between gmetric calls  
                          (default=`60')
  -d, --dmax=INT        The lifetime in seconds of this metric  (default=`0')
  -g, --group=STRING    Group(s) of the metric (comma-separated)
  -C, --cluster=STRING  Cluster of the metric
  -D, --desc=STRING     Description of the metric
  -T, --title=STRING    Title of the metric
  -S, --spoof=STRING    IP address and name of host/device (colon separated) we 
                          are spoofing  (default=`')
  -H, --heartbeat       spoof a heartbeat message (use with spoof option)

首先在集群中要有gmond监听来自网络中的metric消息.
例如我们网络中只有一个gmond服务, 并在多播地址上起了一个监听来接收网络中发送过来的metric信息.

[root@db-172-16-3-221 ganglia-3.6.0]# ps -ewf|grep gmond
nobody    4271     1  0 15:51 ?        00:00:00 gmond
root      4930 26259  0 16:54 pts/2    00:00:00 grep gmond
[root@db-172-16-3-221 ganglia-3.6.0]# netstat -anp|grep gmon
tcp        0      0 0.0.0.0:8649                0.0.0.0:*                   LISTEN      4271/gmond          
udp        0      0 172.16.3.221:63894          239.2.11.71:8649            ESTABLISHED 4271/gmond          
udp        0      0 239.2.11.71:8649            0.0.0.0:*                               4271/gmond 

接下来我在多播域中的另一台机器上配置一下gmond.conf
主要配置如下, 务必配置正确的udp发送通道.

[root@150 etc]# vi /opt/ganglia-core-3.6.0/etc/gmond.conf
/* This configuration is as close to 2.5.x default behavior as possible
   The values closely match ./gmond/metric.h definitions in 2.5.x */
globals {
  daemonize = yes
  setuid = yes
  user = nobody
  debug_level = 0
  max_udp_msg_len = 1472
  mute = no
  deaf = no
  allow_extra_data = yes
  host_dmax = 86400 /*secs. Expires (removes from web interface) hosts in 1 day */
  host_tmax = 20 /*secs */
  cleanup_threshold = 300 /*secs */
  gexec = no
  # By default gmond will use reverse DNS resolution when displaying your hostname
  # Uncommeting following value will override that value.
  # override_hostname = "mywebserver.domain.com"
  # If you are not using multicast this value should be set to something other than 0.
  # Otherwise if you restart aggregator gmond you will get empty graphs. 60 seconds is reasonable
  send_metadata_interval = 0 /*secs */

}

/*
 * The cluster attributes specified will be used as part of the <CLUSTER>
 * tag that will wrap all hosts collected by this instance.
 */
cluster {
  name = "test"
  owner = "digoal"
  latlong = "111 122"
  url = "http://dba.sky-mobi.com"
}

/* The host section describes attributes of the host, like the location */
host {
  location = "1,2,4"
}

/* Feel free to specify as many udp_send_channels as you like.  Gmond
   used to only support having a single channel */
udp_send_channel {
  #bind_hostname = yes # Highly recommended, soon to be default.
                       # This option tells gmond to use a source address
                       # that resolves to the machine's hostname.  Without
                       # this, the metrics may appear to come from any
                       # interface and the DNS names associated with
                       # those IPs will be used to create the RRDs.
  mcast_join = 239.2.11.71
  port = 8649
  ttl = 10
}

主机名和IP如下 : 
[root@150 etc]# hostname
150.sky-mobi.com
[root@150 etc]# ifconfig
em1       Link encap:Ethernet  HWaddr 00:22:19:60:77:8F  
          inet6 addr: fe80::222:19ff:fe60:778f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:599238829 errors:0 dropped:0 overruns:0 frame:0
          TX packets:67014382 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:882536084904 (821.9 GiB)  TX bytes:8841108757 (8.2 GiB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:81579301 errors:0 dropped:0 overruns:0 frame:0
          TX packets:81579301 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:15164677095 (14.1 GiB)  TX bytes:15164677095 (14.1 GiB)

ovirtmgmt Link encap:Ethernet  HWaddr 00:22:19:60:77:8F  
          inet addr:172.16.3.150  Bcast:172.16.3.255  Mask:255.255.255.0
          inet6 addr: fe80::222:19ff:fe60:778f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:93718326 errors:0 dropped:0 overruns:0 frame:0
          TX packets:86213477 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:1098952054073 (1023.4 GiB)  TX bytes:11330242206 (10.5 GiB)

ovirtmgmt:1 Link encap:Ethernet  HWaddr 00:22:19:60:77:8F  
          inet addr:172.16.13.150  Bcast:172.16.13.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

首先要启动gmond, (当然, 不启动也可以, 启动gmond除了可以发心跳包, 还可以将location等基础信息发过去)
Usage: gmetric [OPTIONS]...

  -h, --help            Print help and exit
  -V, --version         Print version and exit
  -c, --conf=STRING     The configuration file to use for finding send channels 
                           (default=`/opt/ganglia-core-3.6.0/etc/gmond.conf')
  -n, --name=STRING     Name of the metric
  -v, --value=STRING    Value of the metric
  -t, --type=STRING     Either 
                          string|int8|uint8|int16|uint16|int32|uint32|float|double
  -u, --units=STRING    Unit of measure for the value e.g. Kilobytes, Celcius  
                          (default=`')
  -s, --slope=STRING    Either zero|positive|negative|both  (default=`both')
  -x, --tmax=INT        The maximum time in seconds between gmetric calls  
                          (default=`60')
  -d, --dmax=INT        The lifetime in seconds of this metric  (default=`0')
  -g, --group=STRING    Group(s) of the metric (comma-separated)
  -C, --cluster=STRING  Cluster of the metric
  -D, --desc=STRING     Description of the metric
  -T, --title=STRING    Title of the metric
  -S, --spoof=STRING    IP address and name of host/device (colon separated) we 
                          are spoofing  (default=`')
  -H, --heartbeat       spoof a heartbeat message (use with spoof option)
[root@150 etc]# gmetric -c /opt/ganglia-core-3.6.0/etc/gmond.conf -n load_one -v 1.0 -t float -s both -x 20 -g load -C test -D "load per min" -T "load one"

例如在172.16.3.150模拟发送172.16.3.151的metric信息 : 
[root@150 etc]# gmetric -c /opt/ganglia-core-3.6.0/etc/gmond.conf -n load_one -v 1.0 -t float -s both -x 20 -g load -C test -D "load per min" -T "load one" -S "172.16.3.151:digoal_host"

不指定 -S "172.16.3.151:digoal_host"的话, 就是以当前主机的metric来发送.

如果不启动gmond的话, 那么需要发心跳包过去.

[root@150 etc]# gmetric -c /opt/ganglia-core-3.6.0/etc/gmond.conf -n load_one -v 1.0 -t float -s both -x 20 -g load -C test -D "load per min" -T "load one" -S "172.16.3.151:digoal_host" -H

发完心跳包后 : 
ganglia metric extended by gmetric command line tool - 德哥@Digoal - PostgreSQL research

[注意]
1. gmond 负责监听的节点, 必须开放防火墙的访问, 例如 : 

[root@db-172-16-3-221 kernel-doc-2.6.32]# netstat -anp|grep gmond
tcp        0      0 0.0.0.0:8649                0.0.0.0:*                   LISTEN      4271/gmond          
udp        0      0 172.16.3.221:63894          239.2.11.71:8649            ESTABLISHED 4271/gmond          
udp        0      0 239.2.11.71:8649            0.0.0.0:*                               4271/gmond 

务必允许其他主机访问这个多播地址.

[参考]
目录
相关文章
|
索引
问题复盘:Kibana did not load properly. Check the server output for more information
问题复盘:Kibana did not load properly. Check the server output for more information
430 0
问题复盘:Kibana did not load properly. Check the server output for more information
Distributed Port Scanning: Creating an Nmap Cluster Using DNmap
http://raidersec.blogspot.tw/2013/01/distributed-port-scanning-creating-nmap.
874 0
|
Shell Linux
15 Examples To Master Linux Command Line History
When you are using Linux command line frequently, using the history effectively can be a major productivity boost.
546 0