First, make sure the service is running.
vim node_rules.yml
Note: do not use the Tab key when writing this file; indent with spaces only.
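Prometheus only loads rule files that are listed under rule_files in its main configuration. A minimal fragment, assuming the main config is prometheus.yml and node_rules.yml sits next to it (both the file name and the path are assumptions about this setup):

```yaml
# prometheus.yml (fragment); the path below is an assumption, adjust it to your layout
rule_files:
  - "node_rules.yml"
```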
Visit localhost:9090/rules
If the rules have not taken effect after a reload, restart the service:
netstat -lntp | grep prom
kill -9 <PID>
./prometheus &
Then visit the page again.
CPU usage > 80%
100 - (avg(irate(node_cpu_seconds_total{mode='idle'}[5m])) by (instance) * 100) > 80
Memory
100 - (node_memory_MemFree_bytes + node_memory_Cached_bytes + node_memory_Buffers_bytes) / node_memory_MemTotal_bytes * 100
Disk
(node_filesystem_size_bytes{fstype=~"xfs|ext4"} - node_filesystem_free_bytes{fstype=~"xfs|ext4"}) / node_filesystem_size_bytes{fstype=~"xfs|ext4"} * 100
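Node status
The up metric
Another useful metric for watching the state of a particular node is up: it is set to 1 when the instance is healthy and to 0 when the scrape fails.
Use it to monitor whether a node is healthy. A value of 1 means healthy. Otherwise the node exporter service on that server may have stopped, or the node itself may be down, and it needs to be checked immediately.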
- alert: NodeDown
  expr: up == 0
  for: 0m
  labels:
    severity: serious
  annotations:
    summary: "NodeDown"
The following alerts can all be configured from the same template:
MysqlDown
RedisDown
NginxDown
JavaDown
groups:
- name: Hoststate-alert()
  rules:
  # NOTE: up == 0 without a label matcher matches every scrape target,
  # so the five *Down rules below fire together unless each is scoped to its own job.
  - alert: RedisDown
    expr: up == 0
    for: 0m
    labels:
      status: critical
    annotations:
      summary: "Redisdown"
      description: "Redis instance is down"
  - alert: MysqlDown
    expr: up == 0
    for: 0m
    labels:
      status: critical
    annotations:
      summary: "Mysqldown"
      description: "Mysql instance is down"
  - alert: NginxDown
    expr: up == 0
    for: 0m
    labels:
      status: critical
    annotations:
      summary: "Nginxdown"
      description: "Nginx instance is down"
  - alert: NodeDown
    expr: up == 0
    for: 0m
    labels:
      status: critical
    annotations:
      summary: "Nodedown"
      description: "Node instance is down"
  - alert: JavaDown
    expr: up == 0
    for: 0m
    labels:
      status: critical
    annotations:
      summary: "Javadown"
      description: "Java instance is down"
  # CPU usage = 100 - idle percentage, averaged per instance
  - alert: CPUusage
    expr: 100 - (avg(irate(node_cpu_seconds_total{mode='idle'}[5m])) by (instance) * 100) > 80
    for: 5m
    labels:
      status: critical
    annotations:
      summary: "{{$labels.instance}} CPU usage high"
      description: "{{$labels.instance}} CPU usage above 80% (current usage: {{$value}})"
  # Memory usage = 100 - (free + cached + buffers) percentage
  - alert: Memoryusage
    expr: 100 - (node_memory_MemFree_bytes + node_memory_Cached_bytes + node_memory_Buffers_bytes) / node_memory_MemTotal_bytes * 100 > 80
    for: 5m
    labels:
      status: critical
    annotations:
      summary: "Memory usage high"
      description: "Memory usage above 80% (current usage: {{$value}})"
  # Disk usage = (size - free) / size; the original "100 - ..." form gave the free percentage
  - alert: Diskusage
    expr: (node_filesystem_size_bytes{fstype=~"xfs|ext4"} - node_filesystem_free_bytes{fstype=~"xfs|ext4"}) / node_filesystem_size_bytes{fstype=~"xfs|ext4"} * 100 > 80
    for: 5m
    labels:
      status: critical
    annotations:
      summary: "Disk usage high"
      description: "Disk usage above 80% (current usage: {{$value}})"
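Every scrape target exposes up, so a bare up == 0 cannot tell Redis from MySQL or Nginx. One common way to make each *Down rule fire only for its own service is a job matcher. A sketch for a single rule, assuming the Redis exporter is scraped under a job named redis (the job name is hypothetical; use whatever job_name appears in your scrape config):

```yaml
- alert: RedisDown
  # "redis" is a hypothetical job name, not taken from the original config
  expr: up{job="redis"} == 0
  for: 0m
  labels:
    status: critical
  annotations:
    summary: "Redisdown"
    description: "Redis instance {{ $labels.instance }} is down"
```

The other *Down rules would follow the same pattern with their own job names.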