K8s + prometheus + vm(VictoriaMetrics)

本文涉及的产品
可观测可视化 Grafana 版,10个用户账号 1个月
可观测监控 Prometheus 版,每月50GB免费额度
简介: K8s + prometheus + vm(VictoriaMetrics)

K8s + prometheus + vm(VictoriaMetrics)

Prometheus 是目前常用的监控软件,今天主要介绍下pronethes在k8s中的部署以及使用vm作为存储的使用

安装 prometheus

  • 创建 ns
kubectl create ns prometheus
  • 创建 node-exporter
#cat exporter.yaml 
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  labels:
    app: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      hostPID: true
      hostIPC: true
      hostNetwork: true
      nodeSelector:
        kubernetes.io/os: linux
      containers:
      - name: node-exporter
        image: prom/node-exporter:v1.1.1
        args:
        - --web.listen-address=$(HOSTIP):9100
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        - --path.rootfs=/host/root
        - --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)
        - --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
        ports:
        - containerPort: 9100
        env:
        - name: HOSTIP
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        resources:
          requests:
            cpu: 150m
            memory: 180Mi
          limits:
            cpu: 150m
            memory: 180Mi
        securityContext:
          runAsNonRoot: true
          runAsUser: 65534
        volumeMounts:
        - name: proc
          mountPath: /host/proc
        - name: sys
          mountPath: /host/sys
        - name: root
          mountPath: /host/root
          mountPropagation: HostToContainer
          readOnly: true
      tolerations:
      - operator: "Exists"
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: dev
        hostPath:
          path: /dev
      - name: sys
        hostPath:
          path: /sys
      - name: root
        hostPath:
          path: / 


# kubectl -n prometheus apply -f exporter.yaml

prromethes的exproter配置我们放到comfig中管理

  • 创建 promethesu ConfigMap
# cat prometheus-cm.yaml 

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
data:
  prometheus.yaml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 15s
    scrape_configs:
    - job_name: "nodes"
      static_configs:
      - targets: ['XXX:9100', 'XXX:9100', 'XXX:9100', 'XXX:9100'] #替换成自己的node ip
      relabel_configs: # 通过 relabeling 从 __address__ 中提取 IP 信息,为了后面验证 VM 是否兼容 relabeling
      - source_labels: [__address__]
        regex: "(.*):(.*)"
        replacement: "${1}"
        target_label: 'ip'
        action: replace 

# kubectl -n prometheus apply -f prometheus-cm.yaml
  • 创建 pvc
# cat prometheus-pvc.yaml 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  storageClassName: managed-nfs-storage

# kubectl -n prometheus apply -f prometheus-pvc.yaml
  • 创建 service account
# cat sa.yaml 
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - services
  - endpoints
  - pods
  - nodes/proxy
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - "extensions"
  resources:
    - ingresses
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - configmaps
  - nodes/metrics
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: prometheus # 注意换成自己的ns

# kubectl -n prometheus apply -f sa.yaml
  • 部署 prometheus
# cat prometheus.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
spec:
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: prometheus-data
        - name: config-volume
          configMap:
            name: prometheus-config
      containers:
        - image: prom/prometheus:v2.35.0
          name: prometheus
          args:
            - "--config.file=/etc/prometheus/prometheus.yaml"
            - "--storage.tsdb.path=/prometheus" # 指定tsdb数据路径
            - "--storage.tsdb.retention.time=24h"
            - "--web.enable-admin-api"  # 控制对admin HTTP API的访问,其中包括删除时间序列等功能
            - "--web.enable-lifecycle" # 支持热更新,直接执行localhost:9090/-/reload立即生效
          ports:
            - containerPort: 9090
              name: http
          securityContext:
            runAsUser: 0
          volumeMounts:
            - mountPath: "/etc/prometheus"
              name: config-volume
            - mountPath: "/prometheus"
              name: data
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
spec:
  selector:
    app: prometheus
  type: ClusterIP
  ports:
    - name: web
      port: 9090
      targetPort: http

# kubectl -n prometheus apply -f prometheus.yaml
  • 创建 ingress (如果没有可以忽略)
# cat ingress.yaml 
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prometheus
  annotations:
    kubernetes.io/ingress.class: "nginx"
    kubernetes.io/ingress.rule-mix: "true"
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
    - hosts:
        - prometheus.xxx.cn  # 指定需要证书的域名
      secretName: prometheus.xxx.cn
  rules:
  - host: prometheus.xxx.cn
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus
            port:
              number: 9090

# kubectl -n prometheus apply -f ingress.yaml
  • 部署 grafana
# vi grafana.yaml
kind: Deployment
metadata:
  name: grafana
spec:
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: grafana-data
      containers:
        - name: grafana
          image: grafana/grafana:main
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
              name: grafana
          securityContext:
            runAsUser: 0
          env:
            - name: GF_SECURITY_ADMIN_USER
              value: admin
            - name: GF_SECURITY_ADMIN_PASSWORD
              value: admin321
          volumeMounts:
            - mountPath: /var/lib/grafana
              name: storage
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
spec:
  type: ClusterIP
  ports:
    - port: 3000
  selector:
    app: grafana
"grafana.yaml" [New] 57L, 1146C written
[root@k8s-master1 grafana]# cat grafana.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
spec:
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: grafana-data
      containers:
        - name: grafana
          image: grafana/grafana:main
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
              name: grafana
          securityContext:
            runAsUser: 0
          env:
            - name: GF_SECURITY_ADMIN_USER
              value: admin
            - name: GF_SECURITY_ADMIN_PASSWORD
              value: admin321
          volumeMounts:
            - mountPath: /var/lib/grafana
              name: storage
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
spec:
  type: ClusterIP
  ports:
    - port: 3000
  selector:
    app: grafana
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: managed-nfs-storage

# 
kubectl -n prometheus apply -f grafana.yaml
  • 创建 grafana ingress
# cat ingress.yaml 
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  tls:
    - hosts:
        - grafana.wy1212.cn  # 指定需要证书的域名
      secretName: grafana.wy1212.cn
  rules:
  - host: grafana.wy1212.cn
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: grafana
            port:
              number: 3000

# kubectl -n prometheus apply -f ingress.yaml

安装 VM 并配置

  • 创建 VM
# cat vm.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: victoria-metrics
spec:
  selector:
    matchLabels:
      app: victoria-metrics
  template:
    metadata:
      labels:
        app: victoria-metrics
    spec:
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: victoria-metrics-data
      containers:
        - name: vm
          image: victoriametrics/victoria-metrics:v1.76.1
          imagePullPolicy: IfNotPresent
          args:
            - -storageDataPath=/var/lib/victoria-metrics-data
            - -retentionPeriod=1d #数据保存时间
          ports:
            - containerPort: 8428
              name: http
          volumeMounts:
            - mountPath: /var/lib/victoria-metrics-data
              name: storage
---
apiVersion: v1
kind: Service
metadata:
  name: victoria-metrics
spec:
  type: ClusterIP
  ports:
    - port: 8428
  selector:
    app: victoria-metrics
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: victoria-metrics-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  storageClassName: managed-nfs-storage #换成自己的nfs

# kubectl -n prometheus apply -f vm.yaml

等vm 运行起来后,修改 prometheus 的 configmap 修改数据存储位置

  • 修改 prometheus cm
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
data:
  prometheus.yaml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 15s
    remote_write:    # 新增 远程写入到远程 VM 存储 
    - url: http://victoria-metrics:8428/api/v1/write # 新增
    scrape_configs:

因为我们部署prometheus的时候配置了 --web.enable-lifecycle 直接 curl -X POST localhost:9090/-/reload 即可生效

  • 在grafana 配置vm作为数据源测试

image.png

  • 创建完数据源,倒入一个dashboard (16098)测试

image.png

整个部署就完成了,关于其他如pod资源监控以及自动发现等配置,在下一篇文章中介绍。
image.png

相关实践学习
通过Ingress进行灰度发布
本场景您将运行一个简单的应用,部署一个新的应用用于新的发布,并通过Ingress能力实现灰度发布。
容器应用与集群管理
欢迎来到《容器应用与集群管理》课程,本课程是“云原生容器Clouder认证“系列中的第二阶段。课程将向您介绍与容器集群相关的概念和技术,这些概念和技术可以帮助您了解阿里云容器服务ACK/ACK Serverless的使用。同时,本课程也会向您介绍可以采取的工具、方法和可操作步骤,以帮助您了解如何基于容器服务ACK Serverless构建和管理企业级应用。 学习完本课程后,您将能够: 掌握容器集群、容器编排的基本概念 掌握Kubernetes的基础概念及核心思想 掌握阿里云容器服务ACK/ACK Serverless概念及使用方法 基于容器服务ACK Serverless搭建和管理企业级网站应用
相关文章
|
7月前
|
Prometheus 监控 Kubernetes
如何用 Prometheus Operator 监控 K8s 集群外服务?
如何用 Prometheus Operator 监控 K8s 集群外服务?
|
4月前
|
Prometheus Kubernetes 监控
Prometheus 与 Kubernetes 的集成
【8月更文第29天】随着容器化应用的普及,Kubernetes 成为了管理这些应用的首选平台。为了有效地监控 Kubernetes 集群及其上的应用,Prometheus 提供了一个强大的监控解决方案。本文将详细介绍如何在 Kubernetes 集群中部署和配置 Prometheus,以便对容器化应用进行有效的监控。
213 1
|
2月前
|
Prometheus Kubernetes 监控
k8s部署针对外部服务器的prometheus服务
通过上述步骤,您不仅成功地在Kubernetes集群内部署了Prometheus,还实现了对集群外服务器的有效监控。理解并实施网络配置是关键,确保监控数据的准确无误传输。随着监控需求的增长,您还可以进一步探索Prometheus生态中的其他组件,如Alertmanager、Grafana等,以构建完整的监控与报警体系。
135 60
|
2月前
|
Prometheus Kubernetes 监控
k8s部署针对外部服务器的prometheus服务
通过上述步骤,您不仅成功地在Kubernetes集群内部署了Prometheus,还实现了对集群外服务器的有效监控。理解并实施网络配置是关键,确保监控数据的准确无误传输。随着监控需求的增长,您还可以进一步探索Prometheus生态中的其他组件,如Alertmanager、Grafana等,以构建完整的监控与报警体系。
269 62
|
4月前
|
Prometheus Kubernetes 监控
Kubernetes(K8S) 监控 Prometheus + Grafana
Kubernetes(K8S) 监控 Prometheus + Grafana
310 2
|
4月前
|
Prometheus Kubernetes Cloud Native
使用prometheus来避免Kubernetes CPU Limits造成的事故
使用prometheus来避免Kubernetes CPU Limits造成的事故
108 7
|
5月前
|
Kubernetes Cloud Native 持续交付
云原生架构的核心组成部分通常包括容器化(如Docker)、容器编排(如Kubernetes)、微服务架构、服务网格、持续集成/持续部署(CI/CD)、自动化运维(如Prometheus监控和Grafana可视化)等。
云原生架构的核心组成部分通常包括容器化(如Docker)、容器编排(如Kubernetes)、微服务架构、服务网格、持续集成/持续部署(CI/CD)、自动化运维(如Prometheus监控和Grafana可视化)等。
|
6月前
|
Prometheus 监控 Kubernetes
深入理解Prometheus: Kubernetes环境中的监控实践
Kubernetes简介 在深入Prometheus与Kubernetes的集成之前,首先简要回顾一下Kubernetes的核心概念。Kubernetes是一个开源的容器编排平台,用于自动化容器的部署、扩展和管理。它提供了高度的可扩展性和灵活性,使得它成为微服务和云原生应用的理想选择。 核心组件 • 控制平面(Control Plane):集群管理相关的组件,如API服务器、调度器等。 • 工作节点(Nodes):运行应用容器的机器。 • Pods:Kubernetes的基本运行单位,可以容纳一个或多个容器。
|
7月前
|
Prometheus Kubernetes 监控
|
1月前
|
Prometheus 运维 监控
智能运维实战:Prometheus与Grafana的监控与告警体系
【10月更文挑战第26天】Prometheus与Grafana是智能运维中的强大组合,前者是开源的系统监控和警报工具,后者是数据可视化平台。Prometheus具备时间序列数据库、多维数据模型、PromQL查询语言等特性,而Grafana支持多数据源、丰富的可视化选项和告警功能。两者结合可实现实时监控、灵活告警和高度定制化的仪表板,广泛应用于服务器、应用和数据库的监控。
239 3