Prometheus in Practice: elasticsearch_exporter

Summary: Prometheus

9. Installing elasticsearch_exporter

9.1 Officially recommended

Download the prebuilt release:
https://github.com/prometheus-community/elasticsearch_exporter/releases/download/v1.2.1/elasticsearch_exporter-1.2.1.linux-amd64.tar.gz

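Alternatively, build from source with Go:

yum -y install golang
# With GOPATH=/usr/local the binary lands in /usr/local/bin.
# Note: the project now lives at github.com/prometheus-community/elasticsearch_exporter,
# and Go 1.18 and later no longer install binaries with `go get` (use `go install <package>@<version>`).
GOPATH=/usr/local go get -u github.com/justwatchcom/elasticsearch_exporter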
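
The prebuilt release linked above can also be installed without a Go toolchain. The following is a minimal sketch, not an official install script: the helper names `exporter_release_url` and `install_exporter` are hypothetical, and the name of the directory inside the tarball is assumed to follow the pattern of the archive file name.

```shell
#!/bin/sh
# Build the release tarball URL for a given version (platform defaults to linux-amd64).
exporter_release_url() {
  version=$1
  platform=${2:-linux-amd64}
  printf 'https://github.com/prometheus-community/elasticsearch_exporter/releases/download/v%s/elasticsearch_exporter-%s.%s.tar.gz\n' \
    "$version" "$version" "$platform"
}

# Download, unpack, and copy the binary into /usr/local/bin.
# Assumption: the tarball unpacks into elasticsearch_exporter-<version>.linux-amd64/.
install_exporter() {
  version=$1
  workdir=$(mktemp -d)
  curl -fsSL "$(exporter_release_url "$version")" | tar -xzf - -C "$workdir"
  install -m 0755 "$workdir/elasticsearch_exporter-$version.linux-amd64/elasticsearch_exporter" /usr/local/bin/
  rm -rf "$workdir"
}

# Usage (needs network access and root):
#   install_exporter 1.2.1
```

Nothing runs until `install_exporter` is called, so the version can be reviewed first.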


cat << EOF > /etc/systemd/system/elasticsearch_exporter.service
[Unit]
Description=Prometheus elasticsearch_exporter
After=local-fs.target network-online.target network.target
Wants=local-fs.target network-online.target network.target

[Service]
User=root
Nice=10
ExecStart=/usr/local/bin/elasticsearch_exporter --es.uri=http://x.x.x.x:9200 --es.all --es.indices --es.timeout=20s
ExecStop=/usr/bin/killall elasticsearch_exporter

[Install]
WantedBy=default.target
EOF

systemctl daemon-reload
systemctl enable elasticsearch_exporter.service
systemctl start  elasticsearch_exporter.service
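
Once the service is running, it is worth checking that the exporter actually serves metrics. The sketch below is one way to do that, assuming the exporter listens on its default port 9114 on localhost. The helper name `cluster_health_color` is hypothetical. It reads the metrics text on stdin and prints the color whose `elasticsearch_cluster_health_status` sample is 1.

```shell
#!/bin/sh
# Print the cluster health color (green, yellow or red) whose sample value is 1.
# Input: Prometheus text exposition format on stdin.
cluster_health_color() {
  awk '
    /^elasticsearch_cluster_health_status[{]/ && $NF == 1 {
      if (match($0, /color="[a-z]+"/)) {
        # Skip the 7 characters of color=" and drop the closing quote.
        print substr($0, RSTART + 7, RLENGTH - 8)
      }
    }
  '
}

# Usage against a running exporter (host and port are assumptions):
#   systemctl status elasticsearch_exporter.service
#   curl -s http://localhost:9114/metrics | cluster_health_color
```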

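# Prometheus configuration
# The target port must match the exporter's --web.listen-address
# (the targets below use 9108; the Consul check in 9.2 uses 9114).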
  - job_name: elasticsearch
    scrape_interval: 60s
    scrape_timeout:  30s
    metrics_path: "/metrics"
    static_configs:
    - targets:
      - elastic2.test.lan:9108
      - elastic-log2.prod.lan:9108
      labels:
        service: elasticsearch
    relabel_configs:
    - source_labels: [__address__]
      regex: '(.*)\:9108'
      target_label:  'instance'
      replacement:   '$1'
    - source_labels: [__address__]
      regex:         '.*\.(.*)\.lan.*'
      target_label:  'environment'
      replacement:   '$1'
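
To see what the two relabel rules above produce, the regexes can be tried against the sample targets with `sed`. This is only an approximation of Prometheus relabeling, written as a sketch: Prometheus anchors the regex at both ends, which is emulated here with `^` and `$`, and the helper name `relabel_preview` is hypothetical.

```shell
#!/bin/sh
# Preview the instance and environment labels derived from a target address.
# sed -n ... p prints nothing when the regex does not match, which mirrors
# Prometheus leaving the target label untouched in that case.
relabel_preview() {
  addr=$1
  instance=$(printf '%s\n' "$addr" | sed -n -E 's/^(.*):9108$/\1/p')
  environment=$(printf '%s\n' "$addr" | sed -n -E 's/^.*\.(.*)\.lan.*$/\1/p')
  printf 'instance=%s environment=%s\n' "$instance" "$environment"
}

for target in elastic2.test.lan:9108 elastic-log2.prod.lan:9108; do
  printf '%s -> ' "$target"
  relabel_preview "$target"
done
# elastic2.test.lan:9108 -> instance=elastic2.test.lan environment=test
# elastic-log2.prod.lan:9108 -> instance=elastic-log2.prod.lan environment=prod
```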
      
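## Alert rules for Prometheus (alerts.rules)
## Note: the ALERT/IF/FOR syntax below is the Prometheus 1.x rule format; Prometheus 2.0 and later expect YAML rule files.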
ALERT Elastic_UP
  IF elasticsearch_up{job="elasticsearch"} != 1
  FOR 120s
  LABELS { severity="alert", value = "{{$value}}" }
  ANNOTATIONS {
    summary = "Instance {{ $labels.instance }}: Elasticsearch instance status is not 1",
    description = "This server's Elasticsearch instance status has a value of {{ $value }}.",
  }

ALERT Elastic_Cluster_Health_RED
  IF elasticsearch_cluster_health_status{color="red"}==1
  FOR 300s
  LABELS { severity="alert", value = "{{$value}}" }
  ANNOTATIONS {
    summary = "Instance {{ $labels.instance }}: at least one primary shard is unallocated in elasticsearch cluster {{ $labels.cluster }}",
    description = "Instance {{ $labels.instance }}: at least one primary shard is unallocated in elasticsearch cluster {{ $labels.cluster }}.",
  }

ALERT Elastic_Cluster_Health_Yellow
  IF elasticsearch_cluster_health_status{color="yellow"}==1
  FOR 300s
  LABELS { severity="alert", value = "{{$value}}" }
  ANNOTATIONS {
    summary = "Instance {{ $labels.instance }}: all primary shards are allocated but one or more replica shards are not in elasticsearch cluster {{ $labels.cluster }}",
    description = "Instance {{ $labels.instance }}: all primary shards are allocated but one or more replica shards are not in elasticsearch cluster {{ $labels.cluster }}.",
  }

ALERT Elasticsearch_JVM_Heap_Too_High
 IF elasticsearch_jvm_memory_used_bytes{area="heap"} / elasticsearch_jvm_memory_max_bytes{area="heap"} > 0.8
 FOR 15m
 LABELS { severity="alert", value = "{{$value}}" }
 ANNOTATIONS {
    summary = "ElasticSearch node {{ $labels.instance }} heap usage is high",
    description = "The heap in {{ $labels.instance }} is over 80% for 15m.",
  }

ALERT Elasticsearch_health_up
 IF elasticsearch_cluster_health_up !=1
 FOR 1m
 LABELS { severity="alert", value = "{{$value}}" }
 ANNOTATIONS {
    summary = "ElasticSearch node: {{ $labels.instance }} last scrape of the ElasticSearch cluster health failed",
    description = "ElasticSearch node: {{ $labels.instance }} last scrape of the ElasticSearch cluster health failed",
  }

ALERT Elasticsearch_Too_Few_Nodes_Running
  IF elasticsearch_cluster_health_number_of_nodes < 3
  FOR 5m
  LABELS { severity="alert", value = "{{$value}}" }
  ANNOTATIONS {
    description="Only {{$value}} ElasticSearch nodes are running (fewer than 3)",
    summary="ElasticSearch running on fewer than 3 nodes"
  }

ALERT Elasticsearch_Count_of_JVM_GC_Runs
 IF rate(elasticsearch_jvm_gc_collection_seconds_count{}[5m])>5
 FOR 60s
 LABELS { severity="warning", value = "{{$value}}" }
 ANNOTATIONS {
    summary = "ElasticSearch node {{ $labels.instance }}: Count of JVM GC runs > 5 per sec and has a value of {{ $value }}",
    description = "ElasticSearch node {{ $labels.instance }}: Count of JVM GC runs > 5 per sec and has a value of {{ $value }}",
  }

ALERT Elasticsearch_GC_Run_Time
 IF rate(elasticsearch_jvm_gc_collection_seconds_sum[5m])>0.3
 FOR 60s
 LABELS { severity="warning", value = "{{$value}}" }
 ANNOTATIONS {
    summary = "ElasticSearch node {{ $labels.instance }}: GC run time in seconds > 0.3 sec and has a value of {{ $value }}",
    description = "ElasticSearch node {{ $labels.instance }}: GC run time in seconds > 0.3 sec and has a value of {{ $value }}",
  }

ALERT Elasticsearch_json_parse_failures
 IF elasticsearch_cluster_health_json_parse_failures>0
 FOR 60s
 LABELS { severity="warning", value = "{{$value}}" }
 ANNOTATIONS {
    summary = "ElasticSearch node {{ $labels.instance }}: json parse failures > 0 and has a value of {{ $value }}",
    description = "ElasticSearch node {{ $labels.instance }}: json parse failures > 0 and has a value of {{ $value }}",
  }


ALERT Elasticsearch_breakers_tripped
 IF rate(elasticsearch_breakers_tripped{}[5m])>0
 FOR 60s
 LABELS { severity="warning", value = "{{$value}}" }
 ANNOTATIONS {
    summary = "ElasticSearch node {{ $labels.instance }}: breakers tripped > 0 and has a value of {{ $value }}",
    description = "ElasticSearch node {{ $labels.instance }}: breakers tripped > 0 and has a value of {{ $value }}",
  }

ALERT Elasticsearch_health_timed_out
 IF elasticsearch_cluster_health_timed_out>0
 FOR 60s
 LABELS { severity="warning", value = "{{$value}}" }
 ANNOTATIONS {
    summary = "ElasticSearch node {{ $labels.instance }}: Number of cluster health checks timed out > 0 and has a value of {{ $value }}",
    description = "ElasticSearch node {{ $labels.instance }}: Number of cluster health checks timed out > 0 and has a value of {{ $value }}",
  }
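
The rules above use the Prometheus 1.x rule syntax. As a sketch of how one of them might look in the YAML rule-file format used by Prometheus 2.x, the script below writes the red-cluster alert to a file and validates it with `promtool` when that tool is installed. The group name, file name, and `RULES_DIR` variable are assumptions made for this example.

```shell
#!/bin/sh
# Write one alert in the Prometheus 2.x YAML rule-file format.
# RULES_DIR and the file name are assumptions for this sketch.
rules_dir=${RULES_DIR:-$(mktemp -d)}
rules_file="$rules_dir/elasticsearch.rules.yml"

# The quoted 'EOF' stops the shell from expanding $value and $labels.
cat << 'EOF' > "$rules_file"
groups:
  - name: elasticsearch
    rules:
      - alert: Elastic_Cluster_Health_RED
        expr: elasticsearch_cluster_health_status{color="red"} == 1
        for: 5m
        labels:
          severity: alert
          value: "{{ $value }}"
        annotations:
          summary: "Instance {{ $labels.instance }}: at least one primary shard is unallocated in elasticsearch cluster {{ $labels.cluster }}"
          description: "Instance {{ $labels.instance }}: at least one primary shard is unallocated in elasticsearch cluster {{ $labels.cluster }}."
EOF

# Validate the file when promtool is available on PATH.
if command -v promtool >/dev/null 2>&1; then
  promtool check rules "$rules_file"
fi
echo "$rules_file"
```

The remaining alerts translate the same way: `IF` becomes `expr`, `FOR` becomes `for`, and `LABELS` and `ANNOTATIONS` become YAML maps.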

9.2 Registering with the Consul service registry (tested)

tar xf elasticsearch_exporter-1.1.0.linux-amd64.tar.gz -C /usr/local/
cd /usr/local/elasticsearch_exporter-1.1.0.linux-amd64/
nohup ./elasticsearch_exporter --es.uri http://x.x.x.x:9200 --es.all --es.indices --es.cluster_settings --es.indices_settings --es.shards --es.snapshots --es.timeout 10s &
# Save the following Consul service definition as es.json
vim es.json
{
  "ID": "es-instance-x.x.x.x",
  "Name": "es-instance-x.x.x.x",
  "Tags": [
    "es_instance"
  ],
  "Address": "x.x.x.x",
  "Port": 9114,
  "Meta": {
    "instance": "es-instance-x.x.x.x",
    "role": "test-it-es-cluster-prod"
  },
  "EnableTagOverride": false,
  "Check": {
    "HTTP": "http://x.x.x.x:9114/metrics",
    "Interval": "10s"
  },
  "Weights": {
    "Passing": 10,
    "Warning": 1
  }
}

curl -X PUT --data @es.json http://x.x.x.x:8500/v1/agent/service/register
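
Registering the service only makes it visible in Consul; Prometheus still needs a scrape job that discovers it. The sketch below writes such a job fragment using `consul_sd_configs`, filtering on the `es_instance` tag from es.json and copying the `Meta` fields onto the scraped series. The job name `elasticsearch-consul` and the output path are assumptions, and `x.x.x.x` remains a placeholder for the Consul agent address.

```shell
#!/bin/sh
# Write a Prometheus scrape-job fragment that discovers exporters through Consul.
# OUT_DIR, the file name, and the job name are assumptions for this sketch.
out_dir=${OUT_DIR:-$(mktemp -d)}
scrape_file="$out_dir/elasticsearch-consul.yml"

cat << 'EOF' > "$scrape_file"
  - job_name: elasticsearch-consul
    consul_sd_configs:
      - server: 'x.x.x.x:8500'
        tags:
          - es_instance
    relabel_configs:
      # Copy the Meta fields from es.json onto the scraped series.
      - source_labels: [__meta_consul_service_metadata_instance]
        target_label: instance
      - source_labels: [__meta_consul_service_metadata_role]
        target_label: role
EOF
echo "$scrape_file"

# To confirm the registration first (replace x.x.x.x with the Consul agent):
#   curl -s http://x.x.x.x:8500/v1/agent/services
```

The fragment is indented to sit under `scrape_configs:` in prometheus.yml.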
# Grafana dashboard template 2322
https://grafana.com/grafana/dashboards/2322