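Environment
OS: CentOS 7.9
k8s cluster:
Version: 1.21.5
Nodes:
192.168.10.201 master
192.168.10.202 work
k8s cluster monitoring approach:
Deploy Prometheus on k8s
File preparation:
prometheus-rbac.yaml: creates the service account, the cluster role, and the role binding that grants its permissions: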
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
      - nodes/metrics
      - services
      - endpoints
      - pods
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - configmaps
    verbs:
      - get
  - nonResourceURLs:
      - "/metrics"
    verbs:
      - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: prometheus
    namespace: kube-system
prometheus-configmap.yaml
Prometheus configuration for dynamic discovery of the monitored targets.
The scrape jobs it defines are:
kubernetes-apiservers
kubernetes-nodes-kubelet
kubernetes-nodes-cadvisor
kubernetes-service-endpoints
kubernetes-services
kubernetes-pods
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: EnsureExists
data:
  prometheus.yml: |
    scrape_configs:
    - job_name: prometheus
      static_configs:
      - targets:
        - localhost:9090

    - job_name: kubernetes-apiservers
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - action: keep
        regex: default;kubernetes;https
        source_labels:
        - __meta_kubernetes_namespace
        - __meta_kubernetes_service_name
        - __meta_kubernetes_endpoint_port_name
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

    - job_name: kubernetes-nodes-kubelet
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

    - job_name: kubernetes-nodes-cadvisor
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __metrics_path__
        replacement: /metrics/cadvisor
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

    - job_name: kubernetes-service-endpoints
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - action: keep
        regex: true
        source_labels:
        - __meta_kubernetes_service_annotation_prometheus_io_scrape
      - action: replace
        regex: (https?)
        source_labels:
        - __meta_kubernetes_service_annotation_prometheus_io_scheme
        target_label: __scheme__
      - action: replace
        regex: (.+)
        source_labels:
        - __meta_kubernetes_service_annotation_prometheus_io_path
        target_label: __metrics_path__
      - action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        source_labels:
        - __address__
        - __meta_kubernetes_service_annotation_prometheus_io_port
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        target_label: kubernetes_namespace
      - action: replace
        source_labels:
        - __meta_kubernetes_service_name
        target_label: kubernetes_name

    - job_name: kubernetes-services
      kubernetes_sd_configs:
      - role: service
      metrics_path: /probe
      params:
        module:
        - http_2xx
      relabel_configs:
      - action: keep
        regex: true
        source_labels:
        - __meta_kubernetes_service_annotation_prometheus_io_probe
      - source_labels:
        - __address__
        target_label: __param_target
      - replacement: blackbox
        target_label: __address__
      - source_labels:
        - __param_target
        target_label: instance
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels:
        - __meta_kubernetes_namespace
        target_label: kubernetes_namespace
      - source_labels:
        - __meta_kubernetes_service_name
        target_label: kubernetes_name

    - job_name: kubernetes-pods
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - action: keep
        regex: true
        source_labels:
        - __meta_kubernetes_pod_annotation_prometheus_io_scrape
      - action: replace
        regex: (.+)
        source_labels:
        - __meta_kubernetes_pod_annotation_prometheus_io_path
        target_label: __metrics_path__
      - action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        source_labels:
        - __address__
        - __meta_kubernetes_pod_annotation_prometheus_io_port
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        target_label: kubernetes_namespace
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_name
        target_label: kubernetes_pod_name
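The kubernetes-service-endpoints and kubernetes-pods jobs only keep targets whose Service or Pod carries a prometheus.io/scrape annotation set to "true"; the port, path, and scheme annotations then rewrite the scrape address. As a rough sketch of what an opted-in workload might look like (the name demo-app, the default namespace, port 8080, and the image are all hypothetical and not part of the original setup):

```yaml
# Hypothetical Service picked up by the kubernetes-service-endpoints job.
apiVersion: v1
kind: Service
metadata:
  name: demo-app            # illustrative name, not from the original
  namespace: default
  annotations:
    prometheus.io/scrape: "true"    # matched by the "keep" rule
    prometheus.io/port: "8080"      # rewrites __address__ to <host>:8080
    prometheus.io/path: "/metrics"  # optional, overrides __metrics_path__
    prometheus.io/scheme: "http"    # optional, http or https
spec:
  selector:
    app: demo-app
  ports:
    - name: metrics
      port: 8080
      targetPort: 8080
---
# Hypothetical Pod picked up by the kubernetes-pods job.
apiVersion: v1
kind: Pod
metadata:
  name: demo-app
  namespace: default
  labels:
    app: demo-app
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
spec:
  containers:
    - name: demo-app
      image: example.com/demo-app:1.0   # placeholder image
      ports:
        - containerPort: 8080
```

Workloads without these annotations are dropped by the keep rules. The kubernetes-services job works differently: it keys on a prometheus.io/probe annotation and sends probes to an address named blackbox, so it presumably expects a blackbox exporter to be reachable under that name, which the files listed here do not deploy.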
prometheus-statefulset.yaml
Creates the stateful Prometheus workload (a StatefulSet):
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus
  namespace: kube-system
  labels:
    k8s-app: prometheus
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    version: latest
spec:
  serviceName: "prometheus"
  replicas: 1
  podManagementPolicy: "Parallel"
  updateStrategy:
    type: "RollingUpdate"
  selector:
    matchLabels:
      k8s-app: prometheus
  template:
    metadata:
      labels:
        k8s-app: prometheus
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      priorityClassName: system-cluster-critical
      serviceAccountName: prometheus
      initContainers:
        - name: "init-chown-data"
          image: "busybox:latest"
          imagePullPolicy: "IfNotPresent"
          command: ["chown", "-R", "65534:65534", "/data"]
          volumeMounts:
            - name: prometheus-data
              mountPath: /data
              subPath: ""
      containers:
        - name: prometheus-server-configmap-reload
          image: "jimmidyson/configmap-reload:latest"
          imagePullPolicy: "IfNotPresent"
          args:
            - --volume-dir=/etc/config
            - --webhook-url=http://localhost:9090/-/reload
          volumeMounts:
            - name: config-volume
              mountPath: /etc/config
              readOnly: true
          resources:
            limits:
              cpu: 10m
              memory: 10Mi
            requests:
              cpu: 10m
              memory: 10Mi
        - name: prometheus-server
          image: "prom/prometheus:latest"
          imagePullPolicy: "IfNotPresent"
          args:
            - --config.file=/etc/config/prometheus.yml
            - --storage.tsdb.path=/data
            - --web.console.libraries=/etc/prometheus/console_libraries
            - --web.console.templates=/etc/prometheus/consoles
            - --web.enable-lifecycle
          ports:
            - containerPort: 9090
          readinessProbe:
            httpGet:
              path: /-/ready
              port: 9090
            initialDelaySeconds: 30
            timeoutSeconds: 30
          livenessProbe:
            httpGet:
              path: /-/healthy
              port: 9090
            initialDelaySeconds: 30
            timeoutSeconds: 30
          resources:
            limits:
              cpu: 200m
              memory: 1000Mi
            requests:
              cpu: 200m
              memory: 1000Mi
          volumeMounts:
            - name: config-volume
              mountPath: /etc/config
            - name: prometheus-data
              mountPath: /data
              subPath: ""
      terminationGracePeriodSeconds: 300
      volumes:
        - name: config-volume
          configMap:
            name: prometheus-config
  volumeClaimTemplates:
    - metadata:
        name: prometheus-data
      spec:
        storageClassName: nfs-csi
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: "16Gi"
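Note: Prometheus stores its data on a PVC backed by dynamically provisioned storage.
You need to set up that storage yourself (the PV and PVC, or a storage class that provisions them). The storage class used here is nfs-csi.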
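The original does not show how the nfs-csi storage class was created. One possible definition, assuming the cluster runs the csi-driver-nfs driver (provisioner nfs.csi.k8s.io), is sketched below; the NFS server address and export path are placeholders and must be replaced with real values:

```yaml
# Sketch only: assumes csi-driver-nfs is already installed in the cluster.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-csi                 # must match storageClassName in the StatefulSet
provisioner: nfs.csi.k8s.io
parameters:
  server: nfs.example.internal  # placeholder NFS server address
  share: /exports/prometheus    # placeholder export path
reclaimPolicy: Retain           # keep the data if the claim is deleted
volumeBindingMode: Immediate
mountOptions:
  - nfsvers=4.1
```

With a class like this in place, the volumeClaimTemplates entry in the StatefulSet creates the prometheus-data claim and the driver provisions a matching volume, so no PV has to be written by hand.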
prometheus-service.yaml
kind: Service
apiVersion: v1
metadata:
  name: prometheus
  namespace: kube-system
  labels:
    kubernetes.io/name: "Prometheus"
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  type: NodePort
  ports:
    - name: http
      port: 9090
      protocol: TCP
      targetPort: 9090
      nodePort: 30090
  selector:
    k8s-app: prometheus
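Note: the service type is set to NodePort, exposing Prometheus on port 30090 of every node.
Create Prometheus (run from the directory that holds the four files):
kubectl apply -f .
Access the Prometheus console at http://<node IP>:30090, for example http://192.168.10.201:30090
Dynamic service discovery: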








