Setting Up Server Monitoring with Prometheus and Node Exporter and Displaying It in Grafana - Developer Community - Alibaba Cloud

Complete Grafana Tutorial

2025-07-12

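Summary: This article walks through installing and configuring Grafana and Prometheus: package source setup, ports, server and client installation, Node Exporter deployment, and start on boot. It also covers monitoring multiple servers and recommends some dashboards.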

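Before installing Grafana

Adjust the package source
The default port is 3000
Install Grafana on the server (see the official site)

Client

Install the client
 Alibaba Cloud ARMS monitoring service documentation
 Setting up Grafana: YouTuber blog - jhooq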

Download Prometheus:

wget https://github.com/prometheus/prometheus/releases/download/v2.47.0-rc.0/prometheus-2.47.0-rc.0.linux-amd64.tar.gz

Download Node Exporter:

wget https://github.com/prometheus/node_exporter/releases/download/v1.6.1/node_exporter-1.6.1.linux-amd64.tar.gz
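
Both download URLs above follow the same pattern, so the version can be kept in one place. The following is a minimal sketch, assuming other releases keep the same naming scheme; `release_url` is a hypothetical helper name and not part of either project.

```shell
# release_url PROJECT VERSION [PLATFORM]
# Prints the GitHub release tarball URL, following the pattern of the
# two wget commands above. PLATFORM defaults to linux-amd64.
release_url() {
  local project="$1"
  local version="$2"
  local platform="${3:-linux-amd64}"
  printf 'https://github.com/prometheus/%s/releases/download/v%s/%s-%s.%s.tar.gz\n' \
    "$project" "$version" "$project" "$version" "$platform"
}

# Example (downloads from the network):
#   wget "$(release_url prometheus 2.47.0-rc.0)"
#   wget "$(release_url node_exporter 1.6.1)"
```

Note that 2.47.0-rc.0 is a release candidate; passing a stable version number to the helper is probably preferable on a production machine.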

Extract Node Exporter

tar xvfz node_exporter-*.tar.gz
cd node_exporter-*.linux-amd64

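We usually move the extracted directory under /usr/local/bin (for convenience, /usr/local or /opt works as well). The systemd unit later in this article expects the binary at /usr/local/bin/node_exporter/node_exporter.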

Run and test Node Exporter

./node_exporter
# Last line of output
>ts=2023-09-03T08:27:20.532Z caller=tls_config.go:277 level=info msg="TLS is disabled." http2=false address=[::]:9100

Open localhost:9100/metrics and check that it shows something like the following (curl works too)

# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0
go_gc_duration_seconds{quantile="0.25"} 0
go_gc_duration_seconds{quantile="0.5"} 0
go_gc_duration_seconds{quantile="0.75"} 0
...
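
The same check can be scripted rather than done by eye. The following is a rough sketch, assuming the output is in the Prometheus text format shown above; `has_metric` is a hypothetical helper name.

```shell
# has_metric NAME
# Reads Prometheus text-format metrics on stdin and succeeds if at least
# one sample line exists for NAME. Lines starting with '#' are HELP/TYPE
# comments and are ignored.
has_metric() {
  local name="$1"
  # A sample line starts with the metric name followed by '{' (labels)
  # or a space (no labels).
  grep -v '^#' | grep -Eq "^${name}(\{| )"
}

# Example (needs a running Node Exporter on port 9100):
#   curl -s http://localhost:9100/metrics | has_metric node_cpu_seconds_total && echo "metrics OK"
```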

Extract Prometheus

tar xvfz prometheus-*.tar.gz
cd prometheus-*.linux-amd64

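We usually move the extracted directory under /usr/local/bin (for convenience, /usr/local or /opt works as well). The systemd unit later in this article expects the binary at /usr/local/bin/prometheus/prometheus.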

Run and test Prometheus

# Run with the default config, or pass --config.file=xxx.yml to choose a config file (usually kept under /etc/prometheus)
./prometheus

# The last line of output varies, for example
>ts=2023-09-03T08:54:49.797Z caller=manager.go:1009 level=info component="rule manager" msg="Starting rule manager..."

The default port is 9090; open localhost:9090 in a browser.

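Add the Node Exporter job to the Prometheus configuration

Prometheus is now up and running, but Node Exporter still has to be added under scrape_configs in the yml configuration file:

  - job_name: 'node_exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9100']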

The complete file then looks like this:

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "node_exporter"
    scrape_interval: 5s
    static_configs:
        - targets: ["localhost:9100"]
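
Indentation mistakes in this file are easy to make, so it may help to validate it before restarting. The sketch below relies on promtool, which ships in the same tarball as prometheus, and its `check config` subcommand. The `check_config` wrapper and the `PROMTOOL` variable are hypothetical names used only for this example.

```shell
# check_config [CONFIG]
# Runs "promtool check config" on CONFIG (default: prometheus.yml).
# PROMTOOL may point at the binary; the default ./promtool assumes the
# current directory is the extracted Prometheus directory.
check_config() {
  local config="${1:-prometheus.yml}"
  "${PROMTOOL:-./promtool}" check config "$config"
}

# Example:
#   check_config prometheus.yml && echo "config looks valid"
```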

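Start on boot

Prometheus:

Create the systemd service /etc/systemd/system/prometheus.service

By default the TSDB is stored in the data directory inside the Prometheus folder; that is left unchanged here.
The user and group must be created first if they do not exist (create a nologin user).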
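
One possible way to create that user is sketched below. It assumes a Linux system with `useradd`, a nologin shell at /usr/sbin/nologin, and the install directory used in the unit file. `create_service_user` and the `RUN` variable are hypothetical names. Ownership of the directory is handed to the new user because the TSDB data directory lives inside it.

```shell
# create_service_user USER DIR
# Creates a system user (with a group of the same name) that cannot log
# in, then gives it ownership of DIR. Set RUN=echo to print the commands
# instead of running them through sudo.
create_service_user() {
  local user="$1"
  local dir="$2"
  local run="${RUN:-sudo}"
  $run useradd --system --user-group --no-create-home \
    --shell /usr/sbin/nologin "$user"
  $run chown -R "$user:$user" "$dir"
}

# Example:
#   RUN=echo create_service_user prometheus /usr/local/bin/prometheus   # dry run
#   create_service_user prometheus /usr/local/bin/prometheus            # real run
```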

[Unit]
  Description=Prometheus Monitoring
  Documentation=https://prometheus.io/docs/introduction/overview/
  Wants=network-online.target
  After=network-online.target

[Service]
  User=prometheus
  Group=prometheus
  Type=simple
  ExecStart=/usr/local/bin/prometheus/prometheus \
  --config.file=/usr/local/bin/prometheus/prometheus.yml \
  --storage.tsdb.path=/usr/local/bin/prometheus/data/ \
  --web.console.templates=/etc/prometheus/consoles \
  --web.console.libraries=/etc/prometheus/console_libraries
  ExecReload=/bin/kill -HUP $MAINPID

[Install]
  WantedBy=multi-user.target

Reload systemd, then start and enable the service

sudo systemctl daemon-reload
sudo systemctl start prometheus.service
sudo systemctl enable prometheus.service
sudo systemctl status prometheus.service
# view log
sudo journalctl -u prometheus.service -f

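Node Exporter:

Create a systemd service for it in the same way, for example /etc/systemd/system/node_exporter.service. The nodeexporter user must exist as well.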

[Unit]
  Description=Node Exporter
  Wants=network-online.target
  After=network-online.target

[Service]
  User=nodeexporter
  Type=simple
  ExecStart=/usr/local/bin/node_exporter/node_exporter 

[Install]
  WantedBy=multi-user.target
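
The same systemctl steps shown for Prometheus apply to this unit. A small sketch, assuming the unit file was saved as node_exporter.service (the article does not name the file); `start_and_enable` and `RUN` are hypothetical names.

```shell
# start_and_enable UNIT
# Reloads systemd, enables UNIT at boot, starts it immediately, and
# prints its status. Set RUN=echo to print the commands instead of
# running them through sudo.
start_and_enable() {
  local unit="$1"
  local run="${RUN:-sudo}"
  $run systemctl daemon-reload
  $run systemctl enable --now "$unit"
  $run systemctl --no-pager status "$unit"
}

# Example:
#   start_and_enable node_exporter.service
```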

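Monitoring another server

Just append the new host to the targets list in prometheus.yml, separated by a comma. Node Exporter must be running on that server and port 9100 must be reachable from the Prometheus machine.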

    static_configs:
      - targets: ["localhost:9090", "192.168.200.102:9090"]

  - job_name: "node_exporter"
    scrape_interval: 5s
    static_configs:
        - targets: ["localhost:9100","192.168.200.102:9100"]
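
With more than a handful of servers, typing the list by hand gets error-prone. The following sketch generates the targets line from a list of hosts; `targets_line` is a hypothetical helper name, and the output still has to be pasted into prometheus.yml at the right indentation.

```shell
# targets_line PORT HOST...
# Prints a static_configs "targets" entry listing HOST:PORT for each host.
targets_line() {
  local port="$1"
  shift
  local out=""
  local host
  for host in "$@"; do
    # Append a ", " separator before every entry except the first.
    out="${out}${out:+, }\"${host}:${port}\""
  done
  printf '%s\n' "- targets: [${out}]"
}

# Example:
#   targets_line 9100 localhost 192.168.200.102
#   prints: - targets: ["localhost:9100", "192.168.200.102:9100"]
```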

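Result

Because the first monitored server is addressed as localhost, its IP address does not show up; it is better to set aside a dedicated machine for monitoring.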

Grafana Dashboard

Grafana dashboard catalog: https://grafana.com/grafana/dashboards/

Useful dashboards:
Author: StarsL.cn
https://grafana.com/grafana/dashboards/16098
https://grafana.com/grafana/dashboards/8919

From an international author:
https://grafana.com/grafana/dashboards/1860
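
The dashboards above need a Prometheus data source in Grafana. It can be added in the Grafana web UI, or provisioned from a file. The sketch below takes the file route. It assumes a package install of Grafana, where provisioning files are typically read from /etc/grafana/provisioning/datasources, and Prometheus on its default port. `write_prometheus_datasource` and the file name prometheus.yaml are hypothetical.

```shell
# write_prometheus_datasource DIR [URL]
# Writes a Grafana data source provisioning file into DIR that points
# Grafana at a Prometheus server (default: http://localhost:9090).
write_prometheus_datasource() {
  local dir="$1"
  local url="${2:-http://localhost:9090}"
  cat > "$dir/prometheus.yaml" <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: $url
    isDefault: true
EOF
}

# Example (the target directory usually needs root, so write to a
# temporary directory first and copy the file over with sudo):
#   write_prometheus_datasource /tmp
#   sudo cp /tmp/prometheus.yaml /etc/grafana/provisioning/datasources/
#   sudo systemctl restart grafana-server
```

Grafana reads provisioning files at startup, hence the restart in the example.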
