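Introduction
Alibaba has officially open-sourced iLogtail, its observability data collector. As the infrastructure for observability data collection inside Alibaba, iLogtail collects logs, metrics, traces, events, and other observability data for Alibaba Group and Ant Group. This article describes how iLogtail collects Prometheus exporter data.
Collection configuration
iLogtail's collection configuration is fully compatible with the Prometheus configuration file (the description below applies to version 1.0.30 and later).
The following is a simple Prometheus collection configuration.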
{
  "inputs": [
    {
      "detail": {
        "Yaml": "global:\n  scrape_interval: 15s\n  evaluation_interval: 15s\nscrape_configs:\n  - job_name: \"prometheus\"\n    static_configs:\n      - targets: [\"exporter:18080\"]"
      },
      "type": "service_prometheus"
    }
  ]
}
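The Yaml field holds an entire Prometheus configuration file as a single JSON string, so line breaks and double quotes inside it have to be escaped. The sketch below is one way to produce that escaping automatically instead of by hand. It assumes only Python's standard json module; the helper name build_plugin_config is hypothetical and not part of iLogtail.

```python
import json

# Prometheus scrape configuration, written as ordinary multi-line YAML.
PROMETHEUS_YAML = """\
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["exporter:18080"]
"""


def build_plugin_config(prometheus_yaml: str) -> str:
    """Wrap a Prometheus YAML document in a service_prometheus input.

    Hypothetical helper: json.dumps escapes the newlines and quotes
    of the YAML text so the result is valid JSON.
    """
    config = {
        "inputs": [
            {
                "detail": {"Yaml": prometheus_yaml},
                "type": "service_prometheus",
            }
        ]
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(build_plugin_config(PROMETHEUS_YAML))
```

Keeping the YAML as a readable multi-line string and letting the JSON encoder do the escaping avoids the most common mistake in this kind of configuration, an unescaped quote or a raw line break inside the string.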
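Collected data format
Metrics collected by the iLogtail Prometheus plugin follow the same iLogtail transport-layer protocol as logs. The fields currently transmitted are as follows. Judging from the sample output later in this article, each sample carries four fields: __name__ (the metric name), __labels__ (the labels, written as key#$#value pairs joined by |), __time_nano__ (the sample timestamp) and __value__ (the sample value).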
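In the sample output shown elsewhere in this article, the __labels__ field looks like instance#$#exporter:18080|job#$#prometheus. The following is a minimal sketch of how a consumer might split such a string back into label pairs. The function name parse_labels is hypothetical, and the sketch assumes that label values contain neither | nor #$#, which the article does not state.

```python
from typing import Dict


def parse_labels(raw: str) -> Dict[str, str]:
    """Split a __labels__ string such as 'a#$#1|b#$#2' into a dict.

    Assumption: '|' separates pairs and '#$#' separates key from value,
    as seen in the sample output; values containing these separators
    are not handled.
    """
    labels: Dict[str, str] = {}
    if not raw:
        return labels
    for pair in raw.split("|"):
        key, sep, value = pair.partition("#$#")
        if not sep:
            raise ValueError(f"malformed label pair: {pair!r}")
        labels[key] = value
    return labels


if __name__ == "__main__":
    print(parse_labels("code#$#200|instance#$#exporter:18080|job#$#prometheus"))
```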
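E2E quick start
iLogtail now includes an E2E test for Prometheus, so you can quickly try it out from the root of the iLogtail repository.
Test command: TEST_SCOPE=input_prometheus TEST_DEBUG=true make e2e (enabling the DEBUG option lets you see the details of the transmitted data)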
TEST_DEBUG=true TEST_PROFILE=false ./scripts/e2e.sh behavior input_prometheus
=========================================
input_prometheus testing case
=========================================
load log config /home/liujiapeng.ljp/data/ilogtail/behavior-test/plugin_logger.xml
2022-01-20 10:46:46 [INF] [load.go:75] [load] load config from: /home/liujiapeng.ljp/data/ilogtail/test/case/behavior/input_prometheus/ilogtail-e2e.yaml
2022-01-20 10:46:46 [INF] [controller.go:129] [WithCancelChain] httpcase controller is initializing....:
2022-01-20 10:46:46 [INF] [controller.go:129] [WithCancelChain] ilogtail controller is initializing....:
2022-01-20 10:46:46 [INF] [validator_control.go:53] [Init] validator controller is initializing....:
2022-01-20 10:46:46 [DBG] [validator_control.go:57] [Init] stage:add rule:fields-check
2022-01-20 10:46:46 [DBG] [validator_control.go:65] [Init] stage:add rule:counter-check
2022-01-20 10:46:46 [INF] [controller.go:129] [WithCancelChain] subscriber controller is initializing....:
2022-01-20 10:46:46 [INF] [controller.go:129] [WithCancelChain] boot controller is initializing....:
2022-01-20 10:46:46 [INF] [boot_control.go:37] [Start] boot controller is starting....:
Creating network "ilogtail-e2e_default" with the default driver
Building exporter
Step 1/8 : FROM golang:1.16
---> 71f1b47263fc
Step 2/8 : WORKDIR /src
---> Using cache
---> d76e92450cbb
Step 3/8 : COPY exporter/* ./
---> Using cache
---> 55c76b7af4a1
Step 4/8 : RUN go env -w GO111MODULE=on
---> Using cache
---> 0bbd054e5ca3
Step 5/8 : RUN go env -w GOPROXY=https://goproxy.cn,direct
---> Using cache
---> 907a360df1d5
Step 6/8 : RUN go build
---> Using cache
---> 6ef458eccfc2
Step 7/8 : EXPOSE 18080
---> Using cache
---> ab9ad470b110
Step 8/8 : CMD ["/src/exporter"]
---> Using cache
---> 6eafc56b0059
Successfully built 6eafc56b0059
Successfully tagged ilogtail-e2e_exporter:latest
Creating ilogtail-e2e_goc_1 ... done
Creating ilogtail-e2e_exporter_1 ... done
Creating ilogtail-e2e_ilogtail_1 ... done
2022-01-20 10:46:50 [INF] [subscriber_control.go:51] [Start] subscriber controller is starting....:
2022-01-20 10:46:50 [INF] [validator_control.go:81] [Start] validator controller is starting....:
2022-01-20 10:46:50 [INF] [logtailplugin_control.go:63] [Start] ilogtail controller is starting....:
2022-01-20 10:46:50 [INF] [logtailplugin_control.go:70] [Start] the 1 times load config operation is starting ...
2022-01-20 10:46:50 [INF] [grpc.go:69] [func1] the grpc server would start in 0s
2022-01-20 10:46:50 [INF] [httpcase_control.go:67] [Start] httpcase controller is starting....:
2022-01-20 10:46:50 [INF] [controller.go:129] [WithCancelChain] testing has started and will last 15s
2022-01-20 10:46:54 [DBG] [validator_control.go:107] [func2] Time:1642646811 Contents:<Key:"__name__" Value:"promhttp_metric_handler_requests_in_flight" > Contents:<Key:"__labels__" Value:"instance#$#exporter:18080|job#$#prometheus" > Contents:<Key:"__time_nano__" Value:"1642646811274" > Contents:<Key:"__value__" Value:"1" >
2022-01-20 10:46:54 [DBG] [validator_control.go:107] [func2] Time:1642646811 Contents:<Key:"__name__" Value:"promhttp_metric_handler_requests_total" > Contents:<Key:"__labels__" Value:"code#$#200|instance#$#exporter:18080|job#$#prometheus" > Contents:<Key:"__time_nano__" Value:"1642646811274" > Contents:<Key:"__value__" Value:"0" >
2022-01-20 10:46:54 [DBG] [validator_control.go:107] [func2] Time:1642646811 Contents:<Key:"__name__" Value:"promhttp_metric_handler_requests_total" > Contents:<Key:"__labels__" Value:"code#$#500|instance#$#exporter:18080|job#$#prometheus" > Contents:<Key:"__time_nano__" Value:"1642646811274" > Contents:<Key:"__value__" Value:"0" >
2022-01-20 10:46:54 [DBG] [validator_control.go:107] [func2] Time:1642646811 Contents:<Key:"__name__" Value:"promhttp_metric_handler_requests_total" > Contents:<Key:"__labels__" Value:"code#$#503|instance#$#exporter:18080|job#$#prometheus" > Contents:<Key:"__time_nano__" Value:"1642646811274" > Contents:<Key:"__value__" Value:"0" >
2022-01-20 10:46:54 [DBG] [validator_control.go:107] [func2] Time:1642646811 Contents:<Key:"__name__" Value:"test_counter" > Contents:<Key:"__labels__" Value:"instance#$#exporter:18080|job#$#prometheus" > Contents:<Key:"__time_nano__" Value:"1642646811274" > Contents:<Key:"__value__" Value:"0" >
2022-01-20 10:46:54 [DBG] [validator_control.go:107] [func2] Time:1642646811 Contents:<Key:"__name__" Value:"up" > Contents:<Key:"__labels__" Value:"instance#$#exporter:18080|job#$#prometheus" > Contents:<Key:"__time_nano__" Value:"1642646811274" > Contents:<Key:"__value__" Value:"1" >
2022-01-20 10:46:54 [DBG] [validator_control.go:107] [func2] Time:1642646811 Contents:<Key:"__name__" Value:"scrape_duration_seconds" > Contents:<Key:"__labels__" Value:"instance#$#exporter:18080|job#$#prometheus" > Contents:<Key:"__time_nano__" Value:"1642646811274" > Contents:<Key:"__value__" Value:"0.002" >
2022-01-20 10:46:54 [DBG] [validator_control.go:107] [func2] Time:1642646811 Contents:<Key:"__name__" Value:"scrape_samples_scraped" > Contents:<Key:"__labels__" Value:"instance#$#exporter:18080|job#$#prometheus" > Contents:<Key:"__time_nano__" Value:"1642646811274" > Contents:<Key:"__value__" Value:"5" >
2022-01-20 10:46:54 [DBG] [validator_control.go:107] [func2] Time:1642646811 Contents:<Key:"__name__" Value:"scrape_samples_post_metric_relabeling" > Contents:<Key:"__labels__" Value:"instance#$#exporter:18080|job#$#prometheus" > Contents:<Key:"__time_nano__" Value:"1642646811274" > Contents:<Key:"__value__" Value:"5" >
2022-01-20 10:46:54 [DBG] [validator_control.go:107] [func2] Time:1642646811 Contents:<Key:"__name__" Value:"scrape_series_added" > Contents:<Key:"__labels__" Value:"instance#$#exporter:18080|job#$#prometheus" > Contents:<Key:"__time_nano__" Value:"1642646811274" > Contents:<Key:"__value__" Value:"5" >
2022-01-20 10:47:05 [INF] [httpcase_control.go:107] [func2] httpcase controller is closing....:
2022-01-20 10:47:05 [INF] [httpcase_control.go:108] [func2] httpcase controller is cleaning....:
2022-01-20 10:47:05 [INF] [logtailplugin_control.go:101] [func1] ilogtail controller is closing....:
2022-01-20 10:47:05 [INF] [logtailplugin_control.go:102] [func1] ilogtail controller is cleaning....:
2022-01-20 10:47:05 [INF] [logtailplugin_control.go:112] [func1] ilogtail controller would wait 5s to deal with the logs on the way
2022-01-20 10:47:10 [INF] [validator_control.go:89] [func1] validator controller is closing....:
2022-01-20 10:47:10 [INF] [validator_control.go:90] [func1] validator controller is cleaning....:
2022-01-20 10:47:10 [INF] [subscriber_control.go:54] [func1] subscriber controller is closing....:
2022-01-20 10:47:10 [INF] [subscriber_control.go:74] [Clean] subscriber controller is cleaning....:
2022-01-20 10:47:10 [INF] [boot_control.go:40] [func1] boot controller is stoping....:
2022-01-20 10:47:10 [INF] [boot_control.go:48] [Clean] boot controller is cleaning....:
Stopping ilogtail-e2e_ilogtail_1 ... done
Stopping ilogtail-e2e_exporter_1 ... done
Stopping ilogtail-e2e_goc_1 ... done
Removing ilogtail-e2e_ilogtail_1 ... done
Removing ilogtail-e2e_exporter_1 ... done
Removing ilogtail-e2e_goc_1 ... done
Removing network ilogtail-e2e_default
2022-01-20 10:47:11 [INF] [controller.go:112] [Start] Testing is completed:
2022-01-20 10:47:11 [INF] [controller.go:122] [Start] the E2E testing is passed:
v1.3.8
=========================================
All testing cases are passed
========================================
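The DBG lines in the output above print each sample as a series of Contents:<Key:"..." Value:"..." > pairs. When checking a test run by hand, it can help to turn such a line into a dictionary. The following is a rough sketch using a regular expression; parse_debug_line is a hypothetical name, and the pattern assumes that keys and values contain no double quotes.

```python
import re
from typing import Dict

# Matches one Key/Value pair as printed in the validator DEBUG output.
PAIR_RE = re.compile(r'Key:"([^"]*)"\s+Value:"([^"]*)"')


def parse_debug_line(line: str) -> Dict[str, str]:
    """Collect every Key/Value pair found in one DEBUG log line."""
    return dict(PAIR_RE.findall(line))


if __name__ == "__main__":
    line = (
        '2022-01-20 10:46:54 [DBG] [validator_control.go:107] [func2] '
        'Time:1642646811 Contents:<Key:"__name__" Value:"up" > '
        'Contents:<Key:"__labels__" '
        'Value:"instance#$#exporter:18080|job#$#prometheus" > '
        'Contents:<Key:"__time_nano__" Value:"1642646811274" > '
        'Contents:<Key:"__value__" Value:"1" >'
    )
    print(parse_debug_line(line))
```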
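Hands-on: collecting from a local Node Exporter
- Prepare a Linux environment.
- Download Node Exporter from https://prometheus.io/download/#node_exporter and start it. Once it is running, you can view its metrics with curl 127.0.0.1:9100/metrics.
- Download the latest iLogtail release and install it.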
# Extract the tar package
$ tar zxvf logtail-linux64.tar.gz
# View the directory structure
$ ll logtail-linux64
drwxr-xr-x 3 500 500 4096 bin
drwxr-xr-x 184 500 500 12288 conf
-rw-r--r-- 1 500 500 597 README
drwxr-xr-x 2 500 500 4096 resources
# Enter the bin directory
$ cd logtail-linux64/bin
$ ll
-rwxr-xr-x 1 500 500 10052072 ilogtail_1.0.28 # ilogtail executable
-rwxr-xr-x 1 500 500 4191 ilogtaild
-rwxr-xr-x 1 500 500 5976 libPluginAdapter.so
-rw-r--r-- 1 500 500 89560656 libPluginBase.so
-rwxr-xr-x 1 500 500 2333024 LogtailInsight
- Create the collection configuration directory.
# 1. Create sys_conf_dir
$ mkdir sys_conf_dir
# 2. Create ilogtail_config.json and fill in the configuration.
##### logtail_sys_conf_dir is set to $pwd/sys_conf_dir/
##### config_server_address is a fixed value; leave it unchanged.
$ pwd
/root/bin/logtail-linux64/bin
$ cat ilogtail_config.json
{
"logtail_sys_conf_dir": "/root/bin/logtail-linux64/bin/sys_conf_dir/",
"config_server_address" : "http://logtail.cn-zhangjiakou.log.aliyuncs.com"
}
# 3. Directory structure at this point
$ ll
-rwxr-xr-x 1 500 500 ilogtail_1.0.28
-rw-r--r-- 1 root root ilogtail_config.json
-rwxr-xr-x 1 500 500 ilogtaild
-rwxr-xr-x 1 500 500 libPluginAdapter.so
-rw-r--r-- 1 500 500 libPluginBase.so
-rwxr-xr-x 1 500 500 LogtailInsight
drwxr-xr-x 2 root root sys_conf_dir
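- Set up the collection configuration file by writing the following content to sys_conf_dir/user_local_config.json. The core of this configuration is the plugin section: it enables the Prometheus collection plugin, scrapes port 9100, and saves the collected data to the file node_exporter.log.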
{
"metrics":{
"##1.0##k8s-log-custom-test-project-helm-0":{
"aliuid":"1654218965343050",
"category":"container_stdout_logstore",
"create_time":1640692891,
"defaultEndpoint":"cn-beijing-b-intranet.log.aliyuncs.com",
"delay_alarm_bytes":0,
"enable":true,
"enable_tag":false,
"filter_keys":[
],
"filter_regs":[
],
"group_topic":"",
"local_storage":true,
"log_type":"plugin",
"log_tz":"",
"max_send_rate":-1,
"merge_type":"topic",
"plugin":{
"inputs":[
{
"detail":{
"Yaml":"global:\n  scrape_interval: 15s\n  evaluation_interval: 15s\nscrape_configs:\n  - job_name: \"prometheus\"\n    static_configs:\n      - targets: [\"localhost:9100\"]"
},
"type":"service_prometheus"
}
],
"flushers":[
{
"detail":{
"FileName":"./node_exporter.log"
},
"type":"flusher_stdout"
}
]
},
"priority":0,
"project_name":"k8s-log-custom-test-project-helm",
"raw_log":false,
"region":"cn-beijing-b",
"send_rate_expire":0,
"sensitive_keys":[
],
"shard_hash_key":[
],
"tz_adjust":false,
"version":1
}
}
}
- Start iLogtail and view the collected data.
# Start collection
$ ./ilogtail_1.0.28
$ ps -ef|grep logtail
root 48453 1 ./ilogtail_1.0.28
root 48454 48453 ./ilogtail_1.0.28
# View the collected data
$ tail -f node_exporter.log
2022-01-20 11:38:52 {"__name__":"promhttp_metric_handler_errors_total","__labels__":"cause#$#gathering|instance#$#localhost:9100|job#$#prometheus","__time_nano__":"1642649932033","__value__":"0","__time__":"1642649932"}
2022-01-20 11:38:52 {"__name__":"promhttp_metric_handler_requests_in_flight","__labels__":"instance#$#localhost:9100|job#$#prometheus","__time_nano__":"1642649932033","__value__":"1","__time__":"1642649932"}
2022-01-20 11:38:52 {"__name__":"promhttp_metric_handler_requests_total","__labels__":"code#$#200|instance#$#localhost:9100|job#$#prometheus","__time_nano__":"1642649932033","__value__":"1393","__time__":"1642649932"}
2022-01-20 11:38:52 {"__name__":"promhttp_metric_handler_requests_total","__labels__":"code#$#500|instance#$#localhost:9100|job#$#prometheus","__time_nano__":"1642649932033","__value__":"0","__time__":"1642649932"}
2022-01-20 11:38:52 {"__name__":"promhttp_metric_handler_requests_total","__labels__":"code#$#503|instance#$#localhost:9100|job#$#prometheus","__time_nano__":"1642649932033","__value__":"0","__time__":"1642649932"}
2022-01-20 11:38:52 {"__name__":"up","__labels__":"instance#$#localhost:9100|job#$#prometheus","__time_nano__":"1642649932033","__value__":"1","__time__":"1642649932"}
2022-01-20 11:38:52 {"__name__":"scrape_duration_seconds","__labels__":"instance#$#localhost:9100|job#$#prometheus","__time_nano__":"1642649932033","__value__":"0.011","__time__":"1642649932"}
2022-01-20 11:38:52 {"__name__":"scrape_samples_scraped","__labels__":"instance#$#localhost:9100|job#$#prometheus","__time_nano__":"1642649932033","__value__":"1189","__time__":"1642649932"}
2022-01-20 11:38:52 {"__name__":"scrape_samples_post_metric_relabeling","__labels__":"instance#$#localhost:9100|job#$#prometheus","__time_nano__":"1642649932033","__value__":"1189","__time__":"1642649932"}
2022-01-20 11:38:52 {"__name__":"scrape_series_added","__labels__":"instance#$#localhost:9100|job#$#prometheus","__time_nano__":"1642649932033","__value__":"0","__time__":"1642649932"}
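Each line written by the flusher in the output above is a timestamp followed by one JSON object per sample. As a minimal sketch of how such a file could be post-processed, the code below extracts the JSON part of each line and prints the metric name with its numeric value. The function name iter_samples is hypothetical, and the sketch assumes that the first { on a line starts the JSON object, which holds for the sample lines shown here.

```python
import json
from typing import Dict, Iterable, Iterator


def iter_samples(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per output line, skipping lines without a JSON object."""
    for line in lines:
        start = line.find("{")  # the JSON object follows the timestamp prefix
        if start == -1:
            continue
        yield json.loads(line[start:])


# Two lines copied from the sample output above.
SAMPLE = [
    '2022-01-20 11:38:52 {"__name__":"up","__labels__":"instance#$#localhost:9100|job#$#prometheus","__time_nano__":"1642649932033","__value__":"1","__time__":"1642649932"}',
    '2022-01-20 11:38:52 {"__name__":"scrape_samples_scraped","__labels__":"instance#$#localhost:9100|job#$#prometheus","__time_nano__":"1642649932033","__value__":"1189","__time__":"1642649932"}',
]

if __name__ == "__main__":
    for sample in iter_samples(SAMPLE):
        # All field values are strings in the output; convert as needed.
        print(sample["__name__"], float(sample["__value__"]))
```

To process a real run, the same function can be given an open file object for the output file named in the flusher configuration instead of the SAMPLE list.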
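Hands-on: collecting Node Exporter metrics with Log Service
Collecting Prometheus data with iLogtail
- On the host, download Node Exporter from https://prometheus.io/download/#node_exporter and start it. Once it is running, you can view its metrics with curl 127.0.0.1:9100/metrics.
- Create a Log Service MetricStore.
- Create a Prometheus collection configuration.
- View the collected data.
As shown in the figure below, the Node Exporter metrics collected by iLogtail are displayed as charts. The Log Service metrics query language is fully compatible with PromQL; for more on visualization, see https://help.aliyun.com/document_detail/252810.html.
For example, to compute the system load over 5 minutes:
Connecting Grafana to a Log Service MetricStore
- Import a dashboard.
- View the metric data. As shown in the figure below, Grafana displays the metric data collected by iLogtail.
Summary
iLogtail provides complete Prometheus metric collection: Prometheus metrics can be collected without modifying the exporters. With Log Service MetricStore, users can also use it as an alternative to Prometheus and quickly build their own monitoring dashboards from the rich set of dashboard templates in the Grafana store.
References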