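1. Background.
I set out to deploy a Prometheus monitoring stack on a K8S cluster to monitor system and application metrics such as resource usage, performance, and custom metrics. I will skip the deployment plan and details here and share them in a later post; this time I only cover the problem I ran into. Once all Prometheus components were deployed successfully, I noticed that the grafana Service was of type "ClusterIP", which is reachable only from inside the cluster, so I could not open it in a browser. I therefore edited the grafana Service YAML and changed "ClusterIP" to "NodePort", only to find that the browser still could not reach it at node IP + port.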
2. Run "kubectl get all -n monitoring" and note that grafana is of type ClusterIP.
# kubectl get all -n monitoring
NAME                                      READY   STATUS    RESTARTS        AGE
pod/alertmanager-main-0                   2/2     Running   0               4h47m
pod/blackbox-exporter-5d668b5c6-f9fds     3/3     Running   0               5h17m
pod/grafana-68fd49fd99-jhs25              1/1     Running   0               5h17m
pod/kube-state-metrics-78ddfd78fd-8blqk   3/3     Running   0               5h17m
pod/node-exporter-6lvps                   2/2     Running   0               5h17m
pod/node-exporter-8pw78                   2/2     Running   0               5h17m
pod/node-exporter-mmnbc                   2/2     Running   0               5h17m
pod/node-exporter-p49nq                   2/2     Running   1 (5h13m ago)   5h17m
pod/node-exporter-v7fvb                   2/2     Running   0               5h17m
pod/prometheus-adapter-5485575f49-8r4gd   1/1     Running   0               5h17m
pod/prometheus-adapter-5485575f49-974m8   1/1     Running   0               5h17m
pod/prometheus-k8s-0                      2/2     Running   0               5h15m
pod/prometheus-operator-5b687bfbb8-7djk2  2/2     Running   0               5h17m

NAME                            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
service/alertmanager-main       ClusterIP   10.96.23.99     <none>        9093/TCP,8080/TCP            5h17m
service/alertmanager-operated   ClusterIP   None            <none>        9093/TCP,9094/TCP,9094/UDP   5h15m
service/blackbox-exporter       ClusterIP   10.96.38.236    <none>        9115/TCP,19115/TCP           5h17m
service/grafana                 ClusterIP   10.96.112.113   <none>        3000/TCP                     5h17m
service/kube-state-metrics      ClusterIP   None            <none>        8443/TCP,9443/TCP            5h17m
service/node-exporter           ClusterIP   None            <none>        9100/TCP                     5h17m
service/prometheus-adapter      ClusterIP   10.96.216.177   <none>        443/TCP                      5h17m
service/prometheus-k8s          ClusterIP   10.96.113.84    <none>        9090/TCP,8080/TCP            5h17m
service/prometheus-operated     ClusterIP   None            <none>        9090/TCP                     5h15m
service/prometheus-operator     ClusterIP   None            <none>        8443/TCP                     5h17m

NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
daemonset.apps/node-exporter   5         5         5       5            5           kubernetes.io/os=linux   5h17m

NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/blackbox-exporter     1/1     1            1           5h17m
deployment.apps/grafana               1/1     1            1           5h17m
deployment.apps/kube-state-metrics    1/1     1            1           5h17m
deployment.apps/prometheus-adapter    2/2     2            2           5h17m
deployment.apps/prometheus-operator   1/1     1            1           5h17m

NAME                                             DESIRED   CURRENT   READY   AGE
replicaset.apps/blackbox-exporter-5d668b5c6      1         1         1       5h17m
replicaset.apps/grafana-68fd49fd99               1         1         1       5h17m
replicaset.apps/kube-state-metrics-78ddfd78fd    1         1         1       5h17m
replicaset.apps/prometheus-adapter-5485575f49    2         2         2       5h17m
replicaset.apps/prometheus-operator-5b687bfbb8   1         1         1       5h17m

NAME                                 READY   AGE
statefulset.apps/alertmanager-main   1/1     5h15m
statefulset.apps/prometheus-k8s      1/1     5h15m
3. Change the grafana Service type from "ClusterIP" to "NodePort".
# kubectl edit svc grafana -n monitoring
service/grafana edited
# kubectl get svc grafana -n monitoring -oyaml
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2024-04-10T01:16:27Z"
  labels:
    app.kubernetes.io/component: grafana
    app.kubernetes.io/name: grafana
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 10.4.0
  name: grafana
  namespace: monitoring
  resourceVersion: "804268"
  uid: c6b751b4-9710-4159-8051-0e73660577ca
spec:
  clusterIP: 10.96.112.113
  clusterIPs:
  - 10.96.112.113
  externalTrafficPolicy: Cluster
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: http
    nodePort: 32440
    port: 3000
    protocol: TCP
    targetPort: http
  selector:
    app.kubernetes.io/component: grafana
    app.kubernetes.io/name: grafana
    app.kubernetes.io/part-of: kube-prometheus
  sessionAffinity: None
  type: NodePort   ### this is the line to edit: change type from ClusterIP to NodePort
status:
  loadBalancer: {}
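If you would rather not open an editor, the same change can probably be made non-interactively with kubectl patch. The sketch below wraps it in two small shell functions; the function names are made up for this example, while the Service name and namespace are the ones used in this post.

```shell
# Switch the grafana Service to NodePort without opening an editor.
# "grafana" and "monitoring" come from this post; adjust them for your cluster.
expose_grafana_nodeport() {
  kubectl patch svc grafana -n monitoring \
    -p '{"spec":{"type":"NodePort"}}'
}

# Show the Service again so the new type and the assigned node port are visible.
show_grafana_service() {
  kubectl get svc grafana -n monitoring
}
```

Calling expose_grafana_nodeport followed by show_grafana_service should show the TYPE column change to NodePort. Kubernetes picks the node port itself unless you set nodePort explicitly.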
4. Run "kubectl get all -n monitoring" again and confirm that grafana is now of type NodePort.
# kubectl get all -n monitoring
NAME                                      READY   STATUS    RESTARTS        AGE
pod/alertmanager-main-0                   2/2     Running   0               5h1m
pod/blackbox-exporter-5d668b5c6-f9fds     3/3     Running   0               5h32m
pod/grafana-68fd49fd99-jhs25              1/1     Running   0               5h32m
pod/kube-state-metrics-78ddfd78fd-8blqk   3/3     Running   0               5h32m
pod/node-exporter-6lvps                   2/2     Running   0               5h31m
pod/node-exporter-8pw78                   2/2     Running   0               5h31m
pod/node-exporter-mmnbc                   2/2     Running   0               5h31m
pod/node-exporter-p49nq                   2/2     Running   1 (5h27m ago)   5h31m
pod/node-exporter-v7fvb                   2/2     Running   0               5h31m
pod/prometheus-adapter-5485575f49-8r4gd   1/1     Running   0               5h31m
pod/prometheus-adapter-5485575f49-974m8   1/1     Running   0               5h31m
pod/prometheus-k8s-0                      2/2     Running   0               5h29m
pod/prometheus-operator-5b687bfbb8-7djk2  2/2     Running   0               5h31m

NAME                            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
service/alertmanager-main       ClusterIP   10.96.23.99     <none>        9093/TCP,8080/TCP            5h32m
service/alertmanager-operated   ClusterIP   None            <none>        9093/TCP,9094/TCP,9094/UDP   5h29m
service/blackbox-exporter       ClusterIP   10.96.38.236    <none>        9115/TCP,19115/TCP           5h32m
service/grafana                 NodePort    10.96.112.113   <none>        3000:31735/TCP               5h32m
service/kube-state-metrics      ClusterIP   None            <none>        8443/TCP,9443/TCP            5h32m
service/node-exporter           ClusterIP   None            <none>        9100/TCP                     5h31m
service/prometheus-adapter      ClusterIP   10.96.216.177   <none>        443/TCP                      5h31m
service/prometheus-k8s          ClusterIP   10.96.113.84    <none>        9090/TCP,8080/TCP            5h31m
service/prometheus-operated     ClusterIP   None            <none>        9090/TCP                     5h29m
service/prometheus-operator     ClusterIP   None            <none>        8443/TCP                     5h31m

NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
daemonset.apps/node-exporter   5         5         5       5            5           kubernetes.io/os=linux   5h32m

NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/blackbox-exporter     1/1     1            1           5h32m
deployment.apps/grafana               1/1     1            1           5h32m
deployment.apps/kube-state-metrics    1/1     1            1           5h32m
deployment.apps/prometheus-adapter    2/2     2            2           5h31m
deployment.apps/prometheus-operator   1/1     1            1           5h31m

NAME                                             DESIRED   CURRENT   READY   AGE
replicaset.apps/blackbox-exporter-5d668b5c6      1         1         1       5h32m
replicaset.apps/grafana-68fd49fd99               1         1         1       5h32m
replicaset.apps/kube-state-metrics-78ddfd78fd    1         1         1       5h32m
replicaset.apps/prometheus-adapter-5485575f49    2         2         2       5h31m
replicaset.apps/prometheus-operator-5b687bfbb8   1         1         1       5h31m

NAME                                 READY   AGE
statefulset.apps/alertmanager-main   1/1     5h29m
statefulset.apps/prometheus-k8s      1/1     5h29m
4. Accessing grafana at 10.0.0.104 on port 31735 fails.
My node IP is 10.0.0.104; to reach the grafana service, use the IP of one of your own cluster nodes plus your own NodePort.
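The same check can be run from a terminal instead of a browser, which makes a timeout easier to spot. This is a minimal sketch: the function names are made up, the node IP is passed in as an argument, and /api/health is Grafana's health endpoint.

```shell
# Build the grafana URL from a node IP (first argument) and the node port
# that Kubernetes assigned to the first port of the Service.
grafana_url() {
  printf 'http://%s:%s\n' "$1" \
    "$(kubectl get svc grafana -n monitoring -o jsonpath='{.spec.ports[0].nodePort}')"
}

# Probe the URL and print only the HTTP status code.
# --max-time keeps the check from hanging when packets are silently dropped.
check_grafana() {
  curl --silent --show-error --max-time 5 --output /dev/null \
       --write-out '%{http_code}\n' "$(grafana_url "$1")/api/health"
}
```

For example, check_grafana 10.0.0.104 should print an HTTP status code when grafana is reachable. When traffic is being dropped, curl gives up after five seconds and reports a timeout instead.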
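5. I searched online and found that the cause was a network policy restriction.
The fix is to delete the network policies in the monitoring namespace so that the traffic rules for the pods are reloaded. Wait a moment, and grafana becomes reachable from the browser. Note that the command below removes every NetworkPolicy in the namespace, not only the one that affects grafana.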
kubectl delete networkpolicy --all -n monitoring
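If deleting every policy feels too broad, one alternative is to list the policies first and then add one extra policy that admits traffic to grafana only. The sketch below is an assumption rather than what I ran: the function names and the policy name "grafana-allow-nodeport" are made up, the pod label and the "http" port name are taken from the Service YAML above, and it relies on NetworkPolicies being additive and on your network plugin supporting named ports.

```shell
# List the policies first so you know what a blanket delete would remove.
list_monitoring_policies() {
  kubectl get networkpolicy -n monitoring
}

# Add an extra policy that admits traffic from any source to the grafana pods
# on their "http" port. NetworkPolicies are additive, so the existing policies
# stay in place. The name "grafana-allow-nodeport" is hypothetical.
allow_grafana_ingress() {
  kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: grafana-allow-nodeport
  namespace: monitoring
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: grafana
  policyTypes:
    - Ingress
  ingress:
    - ports:
        - port: http
          protocol: TCP
EOF
}
```

The rule has no "from" clause, so it admits traffic from any source on that port, including traffic forwarded by a node. That should leave the policies for the other monitoring components untouched.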
6. Test again in the browser using node IP + port; grafana now loads without problems.