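ACK Net-Exporter is deployed in the cluster as a daemon. The metrics it collects can be reported to Prometheus for processing and displayed in Grafana. ACK Net-Exporter supports reporting metrics either to a third-party Prometheus or to Alibaba Cloud ARMS, which provides automated one-click dashboards.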
1) Self-managed Prometheus + Grafana
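Through Kubernetes service discovery, Prometheus can use annotations to find services whose applications expose their own monitoring metrics. Once an application carries these annotations, Prometheus can scrape it accordingly. For example:
prometheus.io/scrape: 'true'              # the endpoint should be scraped
prometheus.io/app-metrics: 'true'         # the endpoint has metrics exposed by the application process
prometheus.io/app-metrics-port: '8080'    # the port on which the process exposes metrics
prometheus.io/app-metrics-path: '/metrics'  # the path at which the process exposes metrics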
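To show where these annotations live, here is a minimal sketch of a Deployment whose Pod template carries them. The name `demo-app`, the image, and the port are hypothetical placeholders, not part of the original setup. Note that the `kubernetes-pods` job in this article's Prometheus configuration keys on `prometheus.io/scrape`, `prometheus.io/path`, and `prometheus.io/port`; the `app-metrics*` annotations only take effect if your own relabel rules read them.

```yaml
# Sketch only: demo-app and its image are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app
spec:
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
      annotations:
        # Annotation values must be strings, hence the quotes.
        prometheus.io/scrape: 'true'
        prometheus.io/app-metrics: 'true'
        prometheus.io/app-metrics-port: '8080'
        prometheus.io/app-metrics-path: '/metrics'
    spec:
      containers:
      - name: demo-app
        image: registry.example.com/demo-app:latest  # hypothetical image
        ports:
        - containerPort: 8080
```

The annotations go on the Pod template rather than on the Deployment's own metadata, because Pod-level discovery reads them from the Pods themselves.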
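Deploy node-exporter. node-exporter collects host-level runtime metrics, including basic monitoring data such as the machine's filesystem and meminfo, much like zabbix-agent in traditional host monitoring. It collects data from its target jobs and converts it into the time-series format that Prometheus supports. Unlike traditional metric collectors, it only collects: it does not push data to the server, but waits for the Prometheus server to scrape it.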
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: prometheus
  labels:
    k8s-app: node-exporter
spec:
  selector:
    matchLabels:
      k8s-app: node-exporter
  template:
    metadata:
      labels:
        k8s-app: node-exporter
    spec:
      containers:
      - image: prom/node-exporter
        name: node-exporter
        ports:
        - containerPort: 9100
          protocol: TCP
          name: http
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: node-exporter
  name: node-exporter
  namespace: prometheus
spec:
  ports:
  - name: http
    port: 9100
    nodePort: 31672
    protocol: TCP
  type: NodePort  # a nodePort can only be set on a NodePort or LoadBalancer Service
  selector:
    k8s-app: node-exporter
Besides node-exporter, there are other components such as metrics-server that collect resource usage information for nodes; they are not covered in detail here.
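The Prometheus configuration shown later in this article defines jobs for net-exporter and for annotated Pods, but none for node-exporter. As a hedged sketch, a scrape job modeled on the net-exporter one could be added under `scrape_configs` in `prometheus.yml`. The job name and the namespace restriction are assumptions made for illustration.

```yaml
# Sketch only: a fragment to place under scrape_configs in prometheus.yml.
- job_name: 'node-exporter'           # assumed job name
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - prometheus                    # namespace of the node-exporter Service above
  relabel_configs:
  # Keep only the endpoints that belong to the node-exporter Service.
  - source_labels: [__meta_kubernetes_endpoints_name]
    regex: 'node-exporter'
    action: keep
```

Because the DaemonSet runs one Pod per node, this yields one scrape target per node on port 9100.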
Self-managed Prometheus: deploying Prometheus
Use the YAML below to deploy the Prometheus server Pod and expose it through a Service.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: prometheus
  labels:
    app: prometheus
spec:
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      securityContext:  # run as the root user
        runAsUser: 0
      serviceAccountName: prometheus
      containers:
      - image: prom/prometheus
        name: prometheus
        args:
        - "--config.file=/etc/prometheus/prometheus.yml"  # prometheus.yml is mounted through a volume
        - "--storage.tsdb.path=/prometheus"  # the /prometheus directory is mounted through a volume
        - "--storage.tsdb.retention.time=24h"
        - "--web.enable-admin-api"  # enables the admin HTTP API, which includes functions such as deleting time series
        - "--web.enable-lifecycle"  # enables hot reload: an HTTP POST to localhost:9090/-/reload takes effect immediately
        ports:
        - containerPort: 9090
          name: http
        volumeMounts:
        - mountPath: "/etc/prometheus"
          name: config-volume
        - mountPath: "/prometheus"
          name: data
        resources:
          requests:
            cpu: 100m
            memory: 512Mi
          limits:
            cpu: 100m
            memory: 512Mi
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: prometheus-data  # local storage
      - name: config-volume
        configMap:
          name: prometheus-config  # holds the prometheus.yml defined below
---
kind: Service
apiVersion: v1
metadata:
  labels:
    app: prometheus
  name: prometheus
  namespace: prometheus
spec:
  type: LoadBalancer
  ports:
  - port: 9090
    targetPort: 9090
    nodePort: 30003
  selector:
    app: prometheus
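The Deployment above assumes that the `prometheus` namespace and a PersistentVolumeClaim named `prometheus-data` already exist; neither manifest appears in this article. A minimal sketch of both follows. The 10Gi size, the access mode, and the commented-out storage class are assumptions to adjust for your cluster.

```yaml
# Sketch only: storage size and class are assumptions.
apiVersion: v1
kind: Namespace
metadata:
  name: prometheus
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-data       # must match claimName in the Deployment
  namespace: prometheus
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi           # assumed; size it for the 24h retention you configured
  # storageClassName: <your-storage-class>  # uncomment if the cluster has no default class
```

Apply these before the Deployment. Without a bound claim, the Prometheus Pod stays in Pending.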
Set up the RBAC permissions for Prometheus.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - services
  - endpoints
  - pods
  - nodes/proxy
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - "extensions"
  - "networking.k8s.io"  # Ingress is served from this group on Kubernetes 1.22 and later
  resources:
  - ingresses
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - configmaps
  - nodes/metrics
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: prometheus
Configure the ConfigMap so that Prometheus can find the net-exporter metrics endpoint through service discovery and scrape its metrics automatically.
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: prometheus
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s
    scrape_configs:
    - job_name: 'net-exporter'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_endpoints_name]
        regex: 'net-exporter'
        action: keep
    - job_name: 'kubernetes-pods'
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name
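To make the `kubernetes-pods` relabel rules concrete, the sketch below shows a Pod that this job would pick up, with comments tracing how the scrape address is rewritten. The Pod name, image, ports, and the example IP `10.0.0.5` are hypothetical values chosen for illustration.

```yaml
# Sketch only: a hypothetical Pod matched by the kubernetes-pods job.
apiVersion: v1
kind: Pod
metadata:
  name: demo-metrics-pod
  annotations:
    prometheus.io/scrape: 'true'    # rule 1: kept because the value matches "true"
    prometheus.io/path: '/metrics'  # rule 2: copied into __metrics_path__
    prometheus.io/port: '8080'      # rule 3: replaces the port in __address__
spec:
  containers:
  - name: app
    image: registry.example.com/demo-app:latest  # hypothetical image
    ports:
    - containerPort: 80             # discovery yields __address__ = 10.0.0.5:80
# Trace of rule 3, assuming the Pod IP is 10.0.0.5:
#   source labels joined with ";"  ->  "10.0.0.5:80;8080"
#   regex ([^:]+)(?::\d+)?;(\d+)   ->  $1 = "10.0.0.5", $2 = "8080"
#   replacement $1:$2              ->  __address__ = "10.0.0.5:8080"
# Prometheus then scrapes http://10.0.0.5:8080/metrics.
```

The remaining rules copy each Pod label into a target label (`labelmap`) and record the namespace and Pod name as `kubernetes_namespace` and `kubernetes_pod_name`.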
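Open port 9090 at the external address of the prometheus Service described above. If a search for metrics named 'inspector_xxx' returns results, Prometheus has successfully scraped the metrics that net-exporter exposes. You may be wondering: this section only covered installing a self-managed Prometheus, so how does the collection of net-exporter metrics actually happen?
There is a small Easter egg here: the net-exporter component has to be deployed in the cluster first, which we cover in the next subsection.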
For more, see:
"Best Practices for Cloud-Native Network Data Plane Observability", IV. ACK Net-Exporter Quick Start, 1. Prometheus + Grafana Configuration (Part 2): https://developer.aliyun.com/article/1221320?groupCode=supportservice