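This chapter covers a heavyweight concern that every K8s user has to address: logging and monitoring.
The course analyzes the mainstream log-processing approaches, then picks one and works through a complete hands-on example, from log collection to display.
It also explains Prometheus, the mainstream monitoring solution for K8s, including how it works and the kinds of metrics it supports.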
Common log collection problems and solutions
Traditional services vs. services in K8s
Logs in K8s
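Log-processing approaches in K8s
Remote storage
The application writes its logs directly to a remote store such as Kafka or ES; they are then processed centrally downstream, for example by aggregating them on a log server.
Pros: simple; works in both Docker and non-Docker environments.
Cons: the application has to be modified; for example, an application that writes to local files must be changed to call the remote store.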
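Sidecar
Run a sidecar container in every pod. The sidecar shares a Volume with the main container, so it can read all the log files and forward them to the backend store.
Pros: simple; no changes to the application.
Cons: intrusive to the pod; consumes extra memory and CPU in every pod; not recommended by the community.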
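The shared-volume arrangement described above can be sketched roughly as follows. This is only an illustration: the pod name, the image `example/app:1.0`, and the path `/var/log/app/app.log` are hypothetical, and the busybox container that tails the file to stdout merely stands in for a real forwarder such as Filebeat or Fluentd.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-sidecar        # hypothetical name
spec:
  containers:
    - name: app
      image: example/app:1.0        # hypothetical image, assumed to write /var/log/app/app.log
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
    - name: log-sidecar
      image: busybox:1.27
      # Stand-in for a real forwarder: follow the shared log file and print it.
      # -F keeps retrying until the file exists.
      command: ["sh", "-c", "tail -n +1 -F /var/log/app/app.log"]
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
          readOnly: true
  volumes:
    # Both containers mount the same emptyDir, which is how the sidecar sees the logs
    - name: app-logs
      emptyDir: {}
```

The extra container in every pod is exactly where the memory and CPU overhead of this approach comes from.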
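LogAgent
Deploy one agent on every node. This is essentially the sidecar approach with the sidecar moved from the pod to the node.
The agent runs as a DaemonSet and collects from the directory that holds Docker's json.log files.
A service that writes its logs to files must mount its log directory from the container onto the host, under a directory agreed on in advance; the LogAgent collects from that directory.
Pros: one agent per node keeps resource usage low, and neither the pods nor the applications need to change.
Cons:
1. Every program must mount to one agreed host directory, and file suffixes should be kept consistent; otherwise maintenance becomes difficult.
2. Because the mount directory is fixed in advance, there is no way to tell which pod a log came from, and leftover log files have to be cleaned up periodically.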
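For a service that writes log files, the agreed host mount might look like the sketch below. The pod name, image, and the host directory `/data/logs/app` are assumptions made for illustration; the node-level agent would be configured to read the same host directory.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-host-logs          # hypothetical name
spec:
  containers:
    - name: app
      image: example/app:1.0        # hypothetical image, assumed to write logs under /var/log/app
      volumeMounts:
        - name: host-logs
          mountPath: /var/log/app
  volumes:
    - name: host-logs
      hostPath:
        path: /data/logs/app        # assumed directory agreed on with the node-level agent
        type: DirectoryOrCreate
```

Every pod of this application scheduled to the same node writes into the same host directory, which is why the files carry no pod identity and linger after the pod is gone.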
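Tools used in practice
Fluentd, Filebeat: static; they must be configured ahead of time, whereas Docker containers are dynamic and change frequently.
LogPilot: collects standard output, standard error, and file output; it generates configuration dynamically for the static tools and supports Fluentd and Filebeat plugins.
Setting up log collection with LogPilot
ES configuration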
---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-api
  namespace: kube-system
  labels:
    name: elasticsearch
spec:
  selector:
    app: es
  ports:
    - name: http  # external (HTTP) communication with the ES nodes
      port: 9200
      protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-discovery
  namespace: kube-system
  labels:
    name: elasticsearch
spec:
  selector:
    app: es
  ports:
    - name: transport  # internal (TCP) communication between ES nodes
      port: 9300
      protocol: TCP
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
spec:
  # At least two instances, for high availability
  replicas: 3
  serviceName: "elasticsearch-service"
  selector:
    matchLabels:
      app: es
  template:
    metadata:
      labels:
        app: es
    spec:
      # Allow the pods to run on master nodes; this does not work if the cluster
      # was installed from binaries and the masters run no kubelet
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
      serviceAccountName: dashboard-admin
      initContainers:
        - name: init-sysctl
          image: busybox:1.27
          command:
            - sysctl
            - -w
            - vm.max_map_count=262144
          securityContext:
            privileged: true
      containers:
        - name: elasticsearch
          image: registry.cn-hangzhou.aliyuncs.com/imooc/elasticsearch:5.5.1
          ports:
            - name: http
              containerPort: 9200
              protocol: TCP
            - name: transport  # named so the liveness probe below can reference it
              containerPort: 9300
              protocol: TCP
          securityContext:
            capabilities:
              add:
                - IPC_LOCK
                - SYS_RESOURCE
          resources:
            limits:
              memory: 4000Mi
            requests:
              cpu: 100m
              memory: 2000Mi
          env:
            - name: "http.host"
              value: "0.0.0.0"
            - name: "network.host"
              value: "_eth0_"
            - name: "cluster.name"
              value: "docker-cluster"
            - name: "bootstrap.memory_lock"
              value: "false"
            - name: "discovery.zen.ping.unicast.hosts"
              value: "elasticsearch-discovery"
            - name: "discovery.zen.ping.unicast.hosts.resolve_timeout"
              value: "10s"
            - name: "discovery.zen.ping_timeout"
              value: "6s"
            - name: "discovery.zen.minimum_master_nodes"
              value: "2"
            - name: "discovery.zen.fd.ping_interval"
              value: "2s"
            - name: "discovery.zen.no_master_block"
              value: "write"
            - name: "gateway.expected_nodes"
              value: "2"
            - name: "gateway.expected_master_nodes"
              value: "1"
            - name: "transport.tcp.connect_timeout"
              value: "60s"
            - name: "ES_JAVA_OPTS"
              value: "-Xms2g -Xmx2g"
          livenessProbe:
            tcpSocket:
              port: transport
            initialDelaySeconds: 20
            periodSeconds: 10
          volumeMounts:
            - name: es-data
              mountPath: /data
      terminationGracePeriodSeconds: 30
      volumes:
        - name: es-data
          hostPath:
            path: /es-data
# Create
# kubectl apply -f elasticsearch.yaml
# Check status
# kubectl get svc -n kube-system
# kubectl get statefulset -n kube-system
log-pilot configuration
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-pilot
  namespace: kube-system
  labels:
    k8s-app: log-pilot
    kubernetes.io/cluster-service: "true"
spec:
  # apps/v1 requires an explicit selector that matches the pod template labels
  selector:
    matchLabels:
      k8s-app: log-es
  template:
    metadata:
      labels:
        k8s-app: log-es
        kubernetes.io/cluster-service: "true"
        version: v1.22
    spec:
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      serviceAccountName: dashboard-admin
      containers:
        - name: log-pilot
          # Collects logs by dynamically rewriting the filebeat configuration
          image: registry.cn-hangzhou.aliyuncs.com/imooc/log-pilot:0.9-filebeat
          resources:
            limits:
              memory: 200Mi
            requests:
              cpu: 100m
              memory: 200Mi
          env:
            - name: "FILEBEAT_OUTPUT"
              value: "elasticsearch"
            - name: "ELASTICSEARCH_HOST"
              value: "elasticsearch-api"
            - name: "ELASTICSEARCH_PORT"
              value: "9200"
            - name: "ELASTICSEARCH_USER"
              value: "elastic"
            - name: "ELASTICSEARCH_PASSWORD"
              value: "changeme"
          volumeMounts:
            - name: sock  # access to Docker on the host
              mountPath: /var/run/docker.sock
            - name: root
              mountPath: /host
              readOnly: true
            - name: varlib
              mountPath: /var/lib/filebeat
            - name: varlog
              mountPath: /var/log/filebeat
          securityContext:
            capabilities:
              add:
                - SYS_ADMIN
      terminationGracePeriodSeconds: 30
      volumes:
        - name: sock
          hostPath:
            path: /var/run/docker.sock
        - name: root
          hostPath:
            path: /
        - name: varlib
          hostPath:
            path: /var/lib/filebeat
            type: DirectoryOrCreate
        - name: varlog
          hostPath:
            path: /var/log/filebeat
            type: DirectoryOrCreate
# Create
# kubectl apply -f log-pilot.yaml
# Check status
# kubectl get ds -n kube-system
Kibana configuration
---
# Service
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: kube-system
  labels:
    component: kibana
spec:
  selector:
    component: kibana
  ports:
    - name: http
      port: 80
      targetPort: http
---
# Ingress
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: kibana
  namespace: kube-system
spec:
  rules:
    # Remember to add this domain to your hosts file
    - host: kibana.mooc.com
      http:
        paths:
          - path: /
            backend:
              serviceName: kibana
              servicePort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: kube-system
  labels:
    component: kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      component: kibana
  template:
    metadata:
      labels:
        component: kibana
    spec:
      containers:
        - name: kibana
          image: registry.cn-hangzhou.aliyuncs.com/imooc/kibana:5.5.1
          env:
            - name: CLUSTER_NAME
              value: docker-cluster
            - name: ELASTICSEARCH_URL
              value: http://elasticsearch-api:9200/
          resources:
            limits:
              cpu: 1000m
            requests:
              cpu: 100m
          ports:
            - containerPort: 5601
              name: http
# Create
# kubectl apply -f kibana.yaml
# Check status
# kubectl get deploy -n kube-system
Demo application configuration
# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-demo
spec:
  selector:
    matchLabels:
      app: web-demo
  replicas: 3
  template:
    metadata:
      labels:
        app: web-demo
    spec:
      containers:
        - name: web-demo
          image: hub.mooc.com/kubernetes/web:v1
          ports:
            - containerPort: 8080
          env:
            # For log-pilot, the variable name must start with aliyun_logs_; the suffix
            # is the index name when shipping to ES, or the topic when shipping to Kafka
            - name: aliyun_logs_catalina
              value: "stdout"
            - name: aliyun_logs_access
              value: "/usr/local/tomcat/logs/*"
          volumeMounts:
            - mountPath: /usr/local/tomcat/logs
              name: accesslogs
      volumes:
        # emptyDir: the backing location is generated automatically
        - name: accesslogs
          emptyDir: {}
---
# Service
apiVersion: v1
kind: Service
metadata:
  name: web-demo
spec:
  ports:
    - port: 80
      protocol: TCP
      targetPort: 8080
  selector:
    app: web-demo
  type: ClusterIP
---
# Ingress
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: web-demo
spec:
  rules:
    - host: web.mooc.com
      http:
        paths:
          - path: /
            backend:
              serviceName: web-demo
              servicePort: 80
# Create
# kubectl apply -f web.yaml
# Check status
# kubectl get pods
# kubectl get pods -o wide
# View the log-pilot logs
# docker ps | grep log-pilot
# docker logs -f <container ID>
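Monitoring
Why monitor
- Promptly detect problems that have already occurred
- Warn early about problems that may occur
What to monitor
- Basic system metrics
- Basic service information
- Service-specific information (QPS, processing time, etc.)
- Logs (frequency of exceptions, etc.)
How to monitor
- Data collection
- Data storage (time-series database)
- Define alerting rules
- Configure alert channels (email, SMS, etc.)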
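As one possible illustration of the "define alerting rules" step, here is a minimal sketch of a rule file in the Prometheus 2.x format. The group name, the 5-minute window, and the `severity` label are assumptions chosen for the example, not values from the course.

```yaml
groups:
  - name: example-alerts            # hypothetical group name
    rules:
      - alert: InstanceDown
        # `up` is recorded by Prometheus for every scrape target:
        # 1 if the last scrape succeeded, 0 otherwise
        expr: up == 0
        # Fire only after the condition has held for 5 minutes (assumed threshold)
        for: 5m
        labels:
          severity: critical        # assumed label, usable for routing
        annotations:
          summary: "Target {{ $labels.instance }} of job {{ $labels.job }} has been unreachable for 5 minutes"
```

In a Prometheus setup the rule only decides when an alert fires; the delivery channel (email, SMS gateway, and so on) is configured separately in Alertmanager.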
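Common monitoring components/solutions in the industry
- Zabbix
- Open-Falcon (Xiaomi)
- Tingyun (听云), Jiankongbao (监控宝)
Kubernetes monitoring
- Basic metrics for each node
- Basic metrics for each container
- Kubernetes cluster components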
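With Prometheus, part of this list could be covered by scrape jobs along the lines of the sketch below. It assumes Prometheus runs inside the cluster with a service account that is allowed to read kubelet metrics; the job names are hypothetical, and depending on how the kubelet certificates were issued the TLS settings may need adjusting.

```yaml
scrape_configs:
  # The kubelet's own metrics, one target per node discovered through the API server
  - job_name: kubernetes-nodes      # hypothetical job name
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: node
  # Per-container metrics from the cAdvisor endpoint built into the kubelet
  - job_name: kubernetes-cadvisor   # hypothetical job name
    scheme: https
    metrics_path: /metrics/cadvisor
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: node
```

Host-level metrics such as CPU, memory, and disk are commonly exposed by running node_exporter on every node, and cluster components such as the API server need their own jobs; both are left out of this sketch.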