[root@node4 yaml]# k get deployments.apps -n monitor-sa
NAME                READY   UP-TO-DATE   AVAILABLE   AGE
prometheus-server   0/2     0            0           2m5s
[root@node4 yaml]# k get po -n monitor-sa
NAME                  READY   STATUS    RESTARTS   AGE
node-exporter-6ttbl   1/1     Running   0          23h
node-exporter-7ls5t   1/1     Running   0          23h
node-exporter-r287q   1/1     Running   0          23h
node-exporter-z85dm   1/1     Running   0          23h
Heads up: the manifest references a ServiceAccount, serviceAccountName: monitor
[root@node4 yaml]# cat prometheus-deploy.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-server
  namespace: monitor-sa
  labels:
    app: prometheus
spec:
  replicas: 2
  selector:
    matchLabels:
      app: prometheus
      component: server
    #matchExpressions:
    #- {key: app, operator: In, values: [prometheus]}
    #- {key: component, operator: In, values: [server]}
  template:
    metadata:
      labels:
        app: prometheus
        component: server
      annotations:
        prometheus.io/scrape: 'false'
    spec:
      nodeName: node4
      serviceAccountName: monitor
      containers:
      - name: prometheus
        image: prom/prometheus:v2.2.1
        imagePullPolicy: IfNotPresent
        command:
        - prometheus
        - --config.file=/etc/prometheus/prometheus.yml
        - --storage.tsdb.path=/prometheus
        - --storage.tsdb.retention=720h
        ports:
        - containerPort: 9090
          protocol: TCP
        volumeMounts:
        - mountPath: /etc/prometheus/prometheus.yml
          name: prometheus-config
          subPath: prometheus.yml
        - mountPath: /prometheus/
          name: prometheus-storage-volume
      volumes:
      - name: prometheus-config
        configMap:
          name: prometheus-config
          items:
          - key: prometheus.yml
            path: prometheus.yml
            mode: 0644
      - name: prometheus-storage-volume
        hostPath:
          path: /data
          type: Directory
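Because the manifest names a ServiceAccount, it can be worth checking that the account exists before applying. The following is only a sketch: the helper name referenced_sa is made up for illustration, and it assumes serviceAccountName sits on its own line in the file, as it does in prometheus-deploy.yaml.

```shell
# Print every ServiceAccount name referenced in a manifest file.
# referenced_sa is a hypothetical helper name, not a kubectl feature.
referenced_sa() {
  # awk splits on whitespace, so the YAML indentation does not matter here.
  awk '$1 == "serviceAccountName:" { print $2 }' "$1" | sort -u
}

# Example use against a live cluster (left commented out on purpose):
# for sa in $(referenced_sa prometheus-deploy.yaml); do
#   kubectl get serviceaccount "$sa" -n monitor-sa
# done
```

If kubectl get serviceaccount reports NotFound for a name, the pods of that Deployment cannot be created until the account exists.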
The command is kubectl edit deployment -n <namespace> <deployment-name>. In this example the output looks like this:
...... (omitted)
            path: prometheus.yml
          name: prometheus-config
        name: prometheus-config
      - hostPath:
          path: /data
          type: Directory
        name: prometheus-storage-volume
status:
  conditions:
  - lastTransitionTime: "2023-11-22T15:21:06Z"
    lastUpdateTime: "2023-11-22T15:21:06Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  - lastTransitionTime: "2023-11-22T15:21:06Z"
    lastUpdateTime: "2023-11-22T15:21:06Z"
    message: 'pods "prometheus-server-78bbb77dd7-" is forbidden: error looking up service account monitor-sa/monitor: serviceaccount "monitor" not found'
    reason: FailedCreate
    status: "True"
    type: ReplicaFailure
  - lastTransitionTime: "2023-11-22T15:31:07Z"
    lastUpdateTime: "2023-11-22T15:31:07Z"
    message: ReplicaSet "prometheus-server-78bbb77dd7" has timed out progressing.
    reason: ProgressDeadlineExceeded
    status: "False"
    type: Progressing
  observedGeneration: 1
  unavailableReplicas: 2
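There are three messages here. The first is the error mentioned in the title; when a Deployment fails, this is the one the dashboard shows first. The second explains where the problem actually is: in this example, a ServiceAccount named monitor was not found. The third says that the ReplicaSet managed by this Deployment failed to roll out, which does not tell us anything new. The second message is the important one, and it is the key to fixing the problem.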
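To read only the condition fields without scrolling through the whole object, a small filter can help. This is a sketch rather than part of the original steps: the function name conditions_summary is invented here, it expects the output of a get with -o yaml on stdin, and it assumes each message fits on one line (a message that kubectl wraps across lines will be cut off).

```shell
# Print the message/reason/status/type lines of a Deployment's status conditions.
# conditions_summary is a hypothetical helper; it reads Deployment YAML on stdin.
conditions_summary() {
  # Keep everything from the top-level "status:" key onward,
  # then keep only the indented condition fields and strip the indentation.
  sed -n '/^status:/,$p' \
    | grep -E '^[ -]+(message|reason|status|type):' \
    | sed 's/^[ -]*//'
}

# Example use against a live cluster (left commented out on purpose):
# kubectl get deployment prometheus-server -n monitor-sa -o yaml | conditions_summary
```

In the failing case above, the line to look for is reason: FailedCreate together with the message printed just before it.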
For comparison, the status of a healthy Deployment:
This status tells us that the Deployment has one replica and was deployed successfully, so its first message is Deployment has minimum availability.
      serviceAccount: kube-state-metrics
      serviceAccountName: kube-state-metrics
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2023-11-21T14:56:14Z"
    lastUpdateTime: "2023-11-21T14:56:14Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2023-11-21T14:56:13Z"
    lastUpdateTime: "2023-11-21T14:56:14Z"
    message: ReplicaSet "kube-state-metrics-57794dcf65" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1
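Based on the error above, we need to create the ServiceAccount. If you do not want to grant it broad permissions, you have to write your own RBAC rules; here I take the lazy route and bind it to cluster-admin. The commands are: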
[root@node4 yaml]# k create sa -n monitor-sa monitor
serviceaccount/monitor created
[root@node4 yaml]# k create clusterrolebinding monitor-clusterrolebinding -n monitor-sa --clusterrole=cluster-admin --serviceaccount=monitor-sa:monitor
[root@node4 yaml]# k get po -n monitor-sa -owide
NAME                                 READY   STATUS    RESTARTS        AGE   IP   NODE    NOMINATED NODE   READINESS GATES
node-exporter-6ttbl                  1/1     Running   0               24h        node2   <none>           <none>
node-exporter-7ls5t                  1/1     Running   0               24h        node1   <none>           <none>
node-exporter-r287q                  1/1     Running   1 (2m57s ago)   24h        node4   <none>           <none>
node-exporter-z85dm                  1/1     Running   0               24h        node3   <none>           <none>
prometheus-server-78bbb77dd7-6smlt   1/1     Running   0               20s        node4   <none>           <none>
prometheus-server-78bbb77dd7-fhf5k   1/1     Running   0               20s        node4   <none>           <none>
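If cluster-admin is more than you want to hand out, a narrower read-only role is one option. The sketch below writes such a manifest to a file. Everything named here is an assumption: the file name monitor-rbac.yaml and the object names monitor-readonly and monitor-readonly-binding are made up, and the resource list is the set Prometheus commonly needs for Kubernetes service discovery, which may not match the scrape configuration used in this setup (it is not shown in this post).

```shell
# Write a hypothetical least-privilege RBAC manifest for the monitor ServiceAccount.
# All object names below are illustrative, not taken from the original setup.
cat > monitor-rbac.yaml <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: monitor-readonly
rules:
# Read-only access to the objects Prometheus typically discovers and scrapes.
- apiGroups: [""]
  resources: ["nodes", "nodes/proxy", "nodes/metrics", "services", "endpoints", "pods"]
  verbs: ["get", "list", "watch"]
# Allow scraping the /metrics endpoint of the API server itself.
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: monitor-readonly-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: monitor-readonly
subjects:
- kind: ServiceAccount
  name: monitor
  namespace: monitor-sa
EOF
```

The file would then be applied with k apply -f monitor-rbac.yaml, and the earlier binding removed with k delete clusterrolebinding monitor-clusterrolebinding. A quick check such as k auth can-i list pods --as=system:serviceaccount:monitor-sa:monitor shows whether the narrower role grants what is needed.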
[root@node4 yaml]# cat quota-nginx.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: quota
  namespace: default
spec:
  hard:
    requests.cpu: "5.5"
    limits.cpu: "5.5"
    requests.memory: 10Gi
    limits.memory: 10Gi
    pods: "4"
    services: "20"
[root@node4 yaml]# cat nginx.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2023-11-22T16:13:33Z"
  generation: 1
  labels:
    app: nginx
  name: nginx
  namespace: default
  resourceVersion: "16411"
  uid: e9a5cdc5-c6f0-45fb-a001-fcdd695eb925
spec:
  progressDeadlineSeconds: 600
  replicas: 6
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: nginx
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.18
        imagePullPolicy: IfNotPresent
        name: nginx
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        resources:
          limits:
            cpu: 1
            memory: 1Gi
          requests:
            cpu: 500m
            memory: 512Mi
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
[root@node4 yaml]# k get po
NAME                     READY   STATUS    RESTARTS   AGE
nginx-54f9858f64-g65pk   1/1     Running   0          4m50s
nginx-54f9858f64-h42vf   1/1     Running   0          4m50s
nginx-54f9858f64-s776t   1/1     Running   0          4m50s
nginx-54f9858f64-wl7wz   1/1     Running   0          4m50s
[root@node4 yaml]# k get deployments.apps nginx -oyaml |grep message
    message: Deployment does not have minimum availability.
    message: 'pods "nginx-54f9858f64-p8rxf" is forbidden: exceeded quota: quota, requested:
    message: ReplicaSet "nginx-54f9858f64" is progressing.
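The grep cuts the quota message off before it names the resource, but the numbers in the two files are enough to work out why only 4 of the 6 replicas start. The arithmetic below is a sketch using the values from quota-nginx.yaml and nginx.yaml, converted by hand to millicores and Mi; it assumes nothing else in the default namespace counts against the quota.

```shell
# Quota values from quota-nginx.yaml (CPU in millicores, memory in Mi).
quota_pods=4
quota_req_cpu=5500
quota_lim_cpu=5500
quota_req_mem=10240
quota_lim_mem=10240

# Per-pod values from the nginx container in nginx.yaml.
pod_req_cpu=500
pod_lim_cpu=1000
pod_req_mem=512
pod_lim_mem=1024

# The number of pods that fit is the smallest of the individual caps.
max_replicas=$quota_pods
for fit in \
  $((quota_req_cpu / pod_req_cpu)) \
  $((quota_lim_cpu / pod_lim_cpu)) \
  $((quota_req_mem / pod_req_mem)) \
  $((quota_lim_mem / pod_lim_mem)); do
  if [ "$fit" -lt "$max_replicas" ]; then
    max_replicas=$fit
  fi
done

echo "max pods under quota: $max_replicas"
```

The CPU and memory caps would allow 5 or more pods, so with these values the pods: "4" entry is the limit that is hit first. Running k describe resourcequota quota -n default on the cluster shows the used and hard values side by side and confirms which one is exhausted.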