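0. Preface
When the Mid-Autumn Festival falls on Teachers' Day, the blessings and the joy are doubled. 键客小盒子 wishes everyone a happy Mid-Autumn Festival and a happy Teachers' Day!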
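1. Overview
Prometheus is an open-source system monitoring and alerting system. It has joined the CNCF, becoming the second project hosted there after Kubernetes (k8s). Kubernetes clusters are commonly monitored with Prometheus, which collects data through a wide range of exporters and also accepts reported metrics through Pushgateway. Prometheus performs well enough to support clusters of more than ten thousand machines.
Grafana is a cross-platform, open-source metrics analysis and visualization tool. It visualizes the collected data and promptly notifies alert recipients. Its main features include: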
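- Visualization: fast, flexible client-side graphs; panel plugins offer many ways to visualize metrics and logs, and the official library has a rich set of dashboard plugins such as heatmaps, line charts, and graphs;
- Data sources: Graphite, InfluxDB, OpenTSDB, Prometheus, Elasticsearch, CloudWatch, KairosDB, and others;
- Alerting: define alert rules for the most important metrics visually; Grafana evaluates them continuously and sends notifications through Slack, PagerDuty, and similar channels when the data reaches a threshold;
- Mixed data sources: mix different data sources in the same graph; the data source can be specified per query, and custom data sources are supported;
- Annotations: annotate graphs with rich events from different data sources; hovering over an event shows its full metadata and tags.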
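node-exporter collects monitoring metrics from machines (physical machines, virtual machines, cloud hosts, and so on), including CPU, memory, disk, network, and file counts.
cAdvisor monitors the resources and containers on a node in real time and collects performance data, including CPU usage, memory usage, network throughput, and filesystem usage. In Kubernetes, cAdvisor is integrated into the kubelet and starts automatically with it, so each cAdvisor instance monitors exactly one node.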
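2. Monitoring architecture diagram
3. Deployment notes
Ways to install Prometheus
- Binary installation
- Container installation (Docker/k8s)
- Helm installation
- Prometheus Operator
- kube-prometheus-stack
Preparing the images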
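Image that monitors basic host metrics:
docker pull prom/node-exporter:v1.3.1
Image that monitors the containers on the host:
docker pull zcube/cadvisor:v0.39.3
Image that collects the metrics:
docker pull prom/prometheus:v2.33.5
Image that displays the metrics:
docker pull grafana/grafana:8.4.3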
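Saving the images
The image references must match the names that were pulled, including the repository prefix:
docker save -o /docker_images/node-exporter.tar prom/node-exporter:v1.3.1
docker save -o /docker_images/cadvisor.tar zcube/cadvisor:v0.39.3
docker save -o /docker_images/prometheus.tar prom/prometheus:v2.33.5
docker save -o /docker_images/grafana.tar grafana/grafana:8.4.3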
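The save step can also be driven from a single image list, so the archive names stay in step with the pull commands. The following is only a sketch, not part of the original scripts: the names `IMAGES`, `image_to_tar`, and `print_save_commands` are hypothetical, and the function prints the `docker save` commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper list: the four images pulled above, one per line.
IMAGES="prom/node-exporter:v1.3.1
zcube/cadvisor:v0.39.3
prom/prometheus:v2.33.5
grafana/grafana:8.4.3"

# Map an image reference to an archive name,
# e.g. prom/node-exporter:v1.3.1 -> node-exporter.tar
image_to_tar() {
  name="${1##*/}"      # drop the repository prefix (prom/, zcube/, ...)
  name="${name%%:*}"   # drop the tag
  printf '%s.tar\n' "$name"
}

# Print (do not run) one docker save command per image.
print_save_commands() {
  out_dir="$1"
  for image in $IMAGES; do
    printf 'docker save -o %s/%s %s\n' "$out_dir" "$(image_to_tar "$image")" "$image"
  done
}

print_save_commands /docker_images
```

Piping the printed lines to `sh` would execute them; reviewing them first makes it easy to spot a missing repository prefix.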
4. Preparing the files and scripts
4.1 Creating the docker-compose-monitoring.yml file
version: "3.7"
services:
  node-exporter:
    image: prom/node-exporter:v1.3.1
    container_name: bdyh-node-exporter
    restart: on-failure
    privileged: true
    deploy:
      resources:
        limits:
          memory: 1024M
        reservations:
          memory: 300M
    environment:
      TZ: Asia/Shanghai
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    ports:
      - "9100:9100"
    networks:
      - pk_net
  cadvisor:
    image: zcube/cadvisor:v0.39.3
    container_name: bdyh-cadvisor
    restart: on-failure
    privileged: true
    deploy:
      resources:
        limits:
          memory: 1024M
        reservations:
          memory: 300M
    environment:
      TZ: Asia/Shanghai
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
      - /cgroup:/cgroup:ro
    ports:
      - "9080:8080"
    networks:
      - pk_net
  prometheus:
    image: prom/prometheus:v2.33.5
    container_name: bdyh-prometheus
    restart: on-failure
    privileged: true
    deploy:
      resources:
        limits:
          memory: 1024M
        reservations:
          memory: 300M
    environment:
      TZ: Asia/Shanghai
    volumes:
      - /data/monitoring/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
    depends_on:
      - node-exporter
      - cadvisor
    networks:
      - pk_net
  grafana:
    image: grafana/grafana:8.4.3
    container_name: bdyh-grafana
    restart: on-failure
    privileged: true
    deploy:
      resources:
        limits:
          memory: 1024M
        reservations:
          memory: 300M
    environment:
      TZ: Asia/Shanghai
    volumes:
      - /data/monitoring/grafana/grafana-storage:/var/lib/grafana
      # Grafana provisioning: preset the data source and dashboards through configuration files
      - /data/monitoring/grafana/provisioning:/etc/grafana/provisioning
      - /data/monitoring/grafana/json:/tmp/dashboards
    ports:
      - "3000:3000"
    depends_on:
      - prometheus
    networks:
      - pk_net
networks:
  pk_net:
    external: true
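Note: the compose file uses Grafana's provisioning mechanism to add the data source and dashboards through configuration files, so both are preset when Grafana starts.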
4.2 Configuration files and JSON files
The prometheus.yml configuration file reads as follows:
global:
  scrape_interval: 60s
  evaluation_interval: 60s
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['bdyh-prometheus:9090']
        labels:
          instance: prometheus
  - job_name: linux
    static_configs:
      - targets: ['bdyh-node-exporter:9100']
  - job_name: docker
    static_configs:
      - targets: ['bdyh-cadvisor:8080']
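The targets are container names rather than IP addresses, which works because all four containers join the same user-defined network pk_net. The ports are container ports, so cAdvisor is scraped on 8080 even though the host publishes it on 9080. As a quick review aid, the sketch below lists every target in a prometheus.yml; the function name `list_scrape_targets` is hypothetical, and it assumes the single-quoted, one-line `targets:` style used above.

```shell
#!/bin/sh
# Print every single-quoted host:port that appears on a "targets:" line.
list_scrape_targets() {
  grep -E '^[[:space:]]*(-[[:space:]]*)?targets:' "$1" \
    | grep -oE "'[^']+'" \
    | tr -d "'"
}

# Usage (host path taken from the compose file's volume mapping):
#   list_scrape_targets /data/monitoring/prometheus/prometheus.yml
```

Each printed name should correspond to a `container_name` in the compose file.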
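Docker主机监控.json and Liunx主机监控.json (the Docker and Linux host monitoring dashboards) can be downloaded from the official website. Change the data source uid inside them to a name of your choice; it must match the uid in datasources.yaml. For example: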
...
"datasource": {
  "type": "prometheus",
  "uid": "bdyh-prometheus-9090"
}
...
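A mismatched uid usually shows up only after Grafana starts, as panels that cannot find their data source, so a text-level check before deployment can help. This is a rough sketch rather than a JSON parser: the function name `dashboard_references_uid` is hypothetical, and it only confirms that the expected uid string occurs somewhere in the dashboard file.

```shell
#!/bin/sh
# Succeeds when the dashboard JSON contains "uid": "<expected uid>".
dashboard_references_uid() {
  file="$1"
  uid="$2"
  grep -Eq "\"uid\":[[:space:]]*\"${uid}\"" "$file"
}

# Usage:
#   dashboard_references_uid "Docker主机监控.json" bdyh-prometheus-9090 \
#     || echo "uid not found in dashboard"
```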
The datasources directory holds the data source configuration files (more than one can be configured), such as the following datasources.yaml:
# config file version
apiVersion: 1

# list of datasources that should be deleted from the database
deleteDatasources:
  - name: Prometheus
    orgId: 1

# list of datasources to insert/update depending
# on what's available in the database
datasources:
  # <string, required> name of the datasource. Required
  - name: Prometheus
    # <string, required> datasource type. Required
    type: prometheus
    # <string, required> access mode. direct or proxy. Required
    access: proxy
    # <int> org id. will default to orgId 1 if not specified
    orgId: 1
    # <string> custom UID that other parts of the configuration can use to
    # reference this datasource; generated automatically if not specified
    uid: bdyh-prometheus-9090
    # <string> url
    url: http://bdyh-prometheus:9090
    # <string> database password, if used
    password:
    # <string> database user, if used
    user:
    # <string> database name, if used
    database:
    # <bool> enable/disable basic auth
    basicAuth: false
    # <string> basic auth username
    basicAuthUser: ''
    # <string> basic auth password
    basicAuthPassword: ''
    # <bool> enable/disable with credentials headers
    withCredentials: false
    # <bool> mark as default datasource. Max one per org
    isDefault: false
    # <map> fields that will be converted to json and stored in json_data
    jsonData:
      graphiteVersion: "1.1"
      tlsAuth: false
      tlsAuthWithCACert: false
      httpHeaderName1: "Authorization"
    # <string> json object of data that will be encrypted.
    secureJsonData:
      tlsCACert: "..."
      tlsClientCert: "..."
      tlsClientKey: "..."
      # <openshift\kubernetes token example>
      httpHeaderValue1: "Bearer xf5yhfkpsnmgo"
    version: 1
    # <bool> allow users to edit datasources from the UI.
    editable: true
The dashboards.yaml file is as follows:
apiVersion: 1
providers:
  - name: 'default'
    orgId: 1
    folder: ''
    type: file
    updateIntervalSeconds: 10
    options:
      path: /tmp/dashboards
Note: /tmp/dashboards holds the dashboard JSON files, for example Docker主机监控.json and Liunx主机监控.json.
4.3 One-click deployment scripts
Script that checks whether the custom Docker network already exists: install-network.sh
#!/bin/bash
###############################################################
# Author: 键客小盒子
# Script: install-network.sh
# Date: 2022-09-10
# Description: check whether the custom Docker network already exists
###############################################################
echo -e '\n\n-----------------------install-network start-----------------------'
echo ""
# Name of the custom network
network_name="pk_net"
filterName=`docker network ls | grep "$network_name" | awk '{ print $2 }'`
if [ "$filterName" == "" ]; then
    echo "Network $network_name does not exist; creating it:"
    echo ""
    # Create the custom network; the 10.139 prefix can be changed as long as it does not conflict
    sudo docker network create --driver bridge --subnet 10.139.0.0/16 --gateway 10.139.0.1 "$network_name"
else
    echo "Network $network_name already exists"
fi
echo -e '\n\n-----------------------install-network end-----------------------'
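One caveat: filtering `docker network ls` with a plain `grep` also matches networks whose names merely contain pk_net, in which case the script would skip creating the network. A possible variation, sketched below, compares whole names instead. The function names `name_in_list` and `ensure_network` are hypothetical; `docker network ls --format '{{.Name}}'` prints one network name per line.

```shell
#!/bin/sh
# Succeeds only if one input line is exactly equal to the given name.
name_in_list() {
  grep -Fxq -- "$1"
}

# Create the network only when no network with exactly that name exists.
ensure_network() {
  name="$1"
  if docker network ls --format '{{.Name}}' | name_in_list "$name"; then
    echo "network $name already exists"
  else
    docker network create --driver bridge \
      --subnet 10.139.0.0/16 --gateway 10.139.0.1 "$name"
  fi
}

# Usage: ensure_network pk_net
```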
One-click deployment of host and container monitoring: install-monitoring.sh
#!/bin/bash
###############################################################
# Author: 键客小盒子
# Script: install-monitoring.sh
# Date: 2022-09-10
# Description: one-click deployment of host and container monitoring (Prometheus+Grafana+cAdvisor+node-exporter)
###############################################################
echo -e '\n\n-----------------------Docker install monitoring start-----------------------'
cd `dirname $0`
SH_PATH=`pwd`
BASE_PATH=${SH_PATH%/*}
echo ""
echo ""
echo "#########################################################"
echo "#            Load monitoring images -- start            #"
echo "#########################################################"
sudo docker load -i ./docker_images/node-exporter.tar
sudo docker load -i ./docker_images/cadvisor.tar
sudo docker load -i ./docker_images/prometheus.tar
sudo docker load -i ./docker_images/grafana.tar
echo "#########################################################"
echo "#             Load monitoring images -- end             #"
echo "#########################################################"
echo ""
echo ""
echo "#########################################################"
echo "#         Create custom Docker network -- start         #"
echo "#########################################################"
# Create the custom network pk_net in advance; the 10.139 prefix can be changed as long as it does not conflict
sudo chmod u+x *.sh
sudo bash ./install-network.sh
sleep 2s
echo "#########################################################"
echo "#          Create custom Docker network -- end          #"
echo "#########################################################"
echo ""
echo ""
echo "#########################################################"
echo "#        Create monitoring directories -- start         #"
echo "#########################################################"
sudo mkdir -p $BASE_PATH/monitoring/prometheus
sudo mkdir -p $BASE_PATH/monitoring/grafana/grafana-storage
sudo mkdir -p $BASE_PATH/monitoring/grafana/provisioning
sudo mkdir -p $BASE_PATH/monitoring/grafana/json
sudo mkdir -p $BASE_PATH/docker-compose-file
sudo cp ./monitoring/prometheus/prometheus.yml $BASE_PATH/monitoring/prometheus/prometheus.yml
sudo cp -r ./monitoring/grafana/provisioning/* $BASE_PATH/monitoring/grafana/provisioning/
sudo cp -r ./monitoring/grafana/json/* $BASE_PATH/monitoring/grafana/json/
sudo cp -r $BASE_PATH/pkulaw/docker/docker-compose-monitoring.yml $BASE_PATH/docker-compose-file
sudo chmod -R 777 $BASE_PATH/monitoring
sudo chmod -R 777 $BASE_PATH/monitoring/grafana/grafana-storage
echo "#########################################################"
echo "#         Create monitoring directories -- end          #"
echo "#########################################################"
echo ""
echo ""
echo "#########################################################"
echo "#      Adjust docker-compose mount paths -- start       #"
echo "#########################################################"
# Replace the default /data/monitoring prefixes with the actual installation path
BASE_COMPOSE_URL1=$BASE_PATH/monitoring/prometheus
BASE_COMPOSE_URL2=$BASE_PATH/monitoring/grafana
sudo sed -i "s#/data/monitoring/prometheus#$BASE_COMPOSE_URL1#" $BASE_PATH/docker-compose-file/docker-compose-monitoring.yml
sudo sed -i "s#/data/monitoring/grafana#$BASE_COMPOSE_URL2#" $BASE_PATH/docker-compose-file/docker-compose-monitoring.yml
echo "#########################################################"
echo "#       Adjust docker-compose mount paths -- end        #"
echo "#########################################################"
echo ""
echo ""
echo "#########################################################"
echo "#              Install monitoring -- start              #"
echo "#########################################################"
# --compatibility maps the deploy.resources limits onto plain (non-swarm) containers
docker-compose -p monitoring --compatibility -f $BASE_PATH/docker-compose-file/docker-compose-monitoring.yml up -d
echo "#########################################################"
echo "#               Install monitoring -- end               #"
echo "#########################################################"
echo ""
echo ""
echo "#########################################################"
echo "#            Open monitoring ports -- start             #"
echo "#########################################################"
sudo firewall-cmd --permanent --add-port=9100/tcp
sudo firewall-cmd --permanent --add-port=9080/tcp
sudo firewall-cmd --permanent --add-port=9090/tcp
sudo firewall-cmd --permanent --add-port=3000/tcp
sudo firewall-cmd --reload
echo "#########################################################"
echo "#             Open monitoring ports -- end              #"
echo "#########################################################"
echo -e '\n\n-----------------------Docker install monitoring end-----------------------'
The complete directory structure is as follows:
Note: the grafana-storage directory above holds Grafana's existing data; mount it unchanged.
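Two small mechanisms in install-monitoring.sh carry most of the path handling: `${SH_PATH%/*}` strips the last path component to obtain the parent directory, and `sed` with `#` as its delimiter rewrites the `/data/monitoring/...` prefixes in the compose file. The sketch below isolates both so they can be tried without touching any files. The function names and the example path /opt/deploy/scripts are hypothetical.

```shell
#!/bin/sh
# Parent directory of a path, using the same expansion as BASE_PATH=${SH_PATH%/*}.
parent_dir() {
  printf '%s\n' "${1%/*}"
}

# Replace the first occurrence of a path prefix on each input line.
# '#' is the sed delimiter, so the paths may contain '/' but not '#'.
rewrite_mount_prefix() {
  sed "s#$1#$2#"
}

base="$(parent_dir /opt/deploy/scripts)"
echo "- /data/monitoring/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml" \
  | rewrite_mount_prefix /data/monitoring/prometheus "$base/monitoring/prometheus"
```

With these inputs the printed line is `- /opt/deploy/monitoring/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml`: only the host side of the mapping changes, because `sed` without the `g` flag replaces the first match on each line.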
4.4 One-click deployment
$ chmod +x install-monitoring.sh
$ ./install-monitoring.sh
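Once the script finishes, a quick probe of each service confirms that the stack is up. The sketch below is one possible check, not part of the original deployment. It assumes the host ports published in the compose file (9100, 9080, 9090, 3000) and uses each service's HTTP endpoint: /metrics for node-exporter, /healthz for cAdvisor, /-/healthy for Prometheus, and /api/health for Grafana. The function names `health_urls` and `check_all` are hypothetical.

```shell
#!/bin/sh
# Print one "name url" pair per service, using the host ports from the compose file.
health_urls() {
  host="$1"
  printf 'node-exporter http://%s:9100/metrics\n' "$host"
  printf 'cadvisor http://%s:9080/healthz\n' "$host"
  printf 'prometheus http://%s:9090/-/healthy\n' "$host"
  printf 'grafana http://%s:3000/api/health\n' "$host"
}

# Probe every endpoint with curl and report OK or FAIL per service.
check_all() {
  health_urls "$1" | while read -r name url; do
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      echo "OK   $name"
    else
      echo "FAIL $name ($url)"
    fi
  done
}

# Usage: check_all localhost
```

A FAIL line right after startup may only mean that the container is still initializing, so retrying after a few seconds is reasonable before inspecting the container logs.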