Preface:
In the first article of this series, I covered how to build a Kubernetes cluster on Alibaba Cloud with kubeasz, and demonstrated service deployment and discovery by deploying GitLab on K8S and exposing it outside the cluster. That article was written in April; after a busy half year I finally have time to finish the rest. The series consists of three parts. This one continues with the infrastructure, deploying Jenkins, Harbor, EFK and so on, to prepare for the hands-on project in part three.
One thing worth noting: Alibaba Cloud iterates remarkably fast. In April 2018, SLB did not support HTTP-to-HTTPS redirection, so I had no choice but to handle the redirect with Ingress-Nginx. By late April, SLB supported the redirect in some regions such as North China and overseas nodes, and by May it was supported everywhere, which in turn simplifies the Ingress-Nginx configuration.

1. Jenkins
Typically we stand up a single Jenkins for continuous integration, and every job builds on that one instance. Once the number of jobs grows, Jenkins inevitably runs short of resources and all kinds of problems follow. Teams or companies with many projects therefore need a Jenkins cluster, i.e. they add Jenkins slaves to share the work.
But static slaves bring a new problem: resources cannot be allocated on demand. With few jobs the slaves sit idle, while a sudden burst of jobs still exhausts them. What we want are dynamically allocated Jenkins slaves: acquired when needed, destroyed when done. That is exactly how a Pod behaves in K8S, so here we build a Jenkins cluster inside K8S, running each Jenkins slave in a Pod.
1.1 Preparing the Images
We need two images: a Jenkins master and a Jenkins slave.
Jenkins Master
Tailor the Dockerfile to your own needs:
FROM jenkins/jenkins:latest
USER root
RUN echo '' > /etc/apt/sources.list.d/jessie-backports.list \
&& echo "deb http://mirrors.aliyun.com/debian jessie main contrib non-free" > /etc/apt/sources.list \
&& echo "deb http://mirrors.aliyun.com/debian jessie-updates main contrib non-free" >> /etc/apt/sources.list \
&& echo "deb http://mirrors.aliyun.com/debian-security jessie/updates main contrib non-free" >> /etc/apt/sources.list
RUN apt-get update && apt-get install -y libltdl7 && apt-get clean
RUN curl -LO https://storage.googleapis.com/kubernetes-release/release/`curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt`/bin/linux/amd64/kubectl && \
chmod +x ./kubectl && \
mv ./kubectl /usr/local/bin/kubectl
RUN rm -rf /etc/localtime && cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && \
echo 'Asia/Shanghai' > /etc/timezone
ENV JAVA_OPTS="-Djenkins.install.runSetupWizard=false -Duser.timezone=Asia/Shanghai -Dhudson.model.DirectoryBrowserSupport.CSP=\"default-src 'self'; script-src 'self' 'unsafe-inline' 'unsafe-eval'; style-src 'self' 'unsafe-inline';\""
Jenkins Slave
Generally, installing kubectl on top of the base image is enough:
FROM jenkinsci/jnlp-slave
USER root
RUN curl -LO https://storage.googleapis.com/kubernetes-release/release/`curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt`/bin/linux/amd64/kubectl && \
chmod +x ./kubectl && \
mv ./kubectl /usr/local/bin/kubectl
After building the images, push them to your own registry for later use.
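For example, a minimal build-and-push flow might look like this (a sketch: the Dockerfile filenames and the slave image name are placeholders; the master tag matches the yaml below):

# Build and push the master image (hypothetical Dockerfile names)
docker build -t xx.xx.xx/jenkins:1.0.0 -f Dockerfile.master .
docker push xx.xx.xx/jenkins:1.0.0
# Build and push the slave image
docker build -t xx.xx.xx/jnlp-slave:1.0.0 -f Dockerfile.slave .
docker push xx.xx.xx/jnlp-slave:1.0.0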
1.2 Deploying the Jenkins Master
To run Jenkins, the Jenkins slaves and, later on, Elasticsearch, an ECS instance with at least 8G of memory is recommended.
A reference yaml for deploying Jenkins on K8S:
apiVersion: v1
kind: Namespace
metadata:
  name: jenkins-ci
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: jenkins-ci
  namespace: jenkins-ci
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: jenkins-ci
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: jenkins-ci
  namespace: jenkins-ci
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: jenkins-home
  labels:
    release: jenkins-home
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    path: /jenkins/jenkins-home
    server: xxxx.nas.aliyuncs.com
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: jenkins-ssh
  labels:
    release: jenkins-ssh
spec:
  capacity:
    storage: 1Mi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    path: /jenkins/ssh
    server: xxxx.nas.aliyuncs.com
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jenkins-home-claim
  namespace: jenkins-ci
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  selector:
    matchLabels:
      release: jenkins-home
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jenkins-ssh-claim
  namespace: jenkins-ci
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
  selector:
    matchLabels:
      release: jenkins-ssh
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: jenkins
  namespace: jenkins-ci
spec:
  replicas: 1
  template:
    metadata:
      labels:
        name: jenkins
    spec:
      serviceAccount: jenkins-ci
      containers:
      - name: jenkins
        imagePullPolicy: Always
        image: xx.xx.xx/jenkins:1.0.0
        resources:
          limits:
            cpu: 1
            memory: 2Gi
          requests:
            cpu: 0.5
            memory: 1Gi
        ports:
        - containerPort: 8080
        - containerPort: 50000
        readinessProbe:
          tcpSocket:
            port: 8080
          initialDelaySeconds: 40
          periodSeconds: 20
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /var/run/docker.sock
          name: docker-sock
        - mountPath: /usr/bin/docker
          name: docker-bin
        - mountPath: /var/jenkins_home
          name: jenkins-home
        - mountPath: /root/.ssh
          name: jenkins-ssh
      volumes:
      - name: docker-sock
        hostPath:
          path: /var/run/docker.sock
      - name: docker-bin
        hostPath:
          path: /opt/kube/bin/docker
      - name: jenkins-home
        persistentVolumeClaim:
          claimName: jenkins-home-claim
      - name: jenkins-ssh
        persistentVolumeClaim:
          claimName: jenkins-ssh-claim
---
kind: Service
apiVersion: v1
metadata:
  name: jenkins-service
  namespace: jenkins-ci
spec:
  type: NodePort
  selector:
    name: jenkins
  ports:
  - name: jenkins-agent
    port: 50000
    targetPort: 50000
    nodePort: 30001
  - name: jenkins
    port: 8080
    targetPort: 8080
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: jenkins-ingress
  namespace: jenkins-ci
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "0"
spec:
  rules:
  - host: xxx.xxx.com
    http:
      paths:
      - path: /
        backend:
          serviceName: jenkins-service
          servicePort: 8080
Finally, the SLB configuration for reference (screenshot not reproduced here).

Jenkins can now be reached at the domain xxx.xxx.com, and slaves outside the cluster can connect via xxx.xxx.com:50000. Slaves inside the cluster simply connect through serviceName.namespace, i.e. jenkins-service.jenkins-ci:50000.
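Before wiring up SLB, a quick sanity check from any machine with kubectl access (names match the yaml above):

# Confirm the Service, its NodePorts, and the Ingress exist
kubectl -n jenkins-ci get svc jenkins-service
kubectl -n jenkins-ci get ingress jenkins-ingress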
1.3 Configuring the Jenkins Slave
Log in to Jenkins as an administrator, install the "Kubernetes" plugin, then open the system settings page, choose "Add a new cloud" – "Kubernetes", and configure as follows:
- Kubernetes URL: https://kubernetes.default.svc.cluster.local
- Jenkins URL: http://jenkins-service.jenkins-ci:8080
- Test Connection to verify the connection works
- Images – Add Pod Template – Kubernetes Pod Template
- Be sure to set Name to "jnlp-agent"; fill in the rest as needed, then open Advanced
- Set resource limits as required, i.e. cap the CPU and memory a Jenkins slave Pod may occupy; see chapter 2
- Set up Volumes: likewise use docker-outside-of-docker so the slave Pod borrows the K8S node's docker, and put the slave's working directory on NAS
- Set the maximum number of Jenkins slave Pods allowed to run concurrently, then open Advanced
- Fill in the Service Account, matching the one in the Jenkins master yaml; if your slave image is private, also configure ImagePullSecrets
- Apply and finish
1.4 Testing and Verification
We can write a FreeStyle project as a test job (screenshot omitted).

Test run:
A Jenkins slave named "jnlp-agent-xxxxx" should be created, build the job, and disappear again once the build completes; if so, the configuration is correct.
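As the test job's build step, a shell command as small as the following is enough to prove the slave Pod came up with a working kubectl and the service account configured earlier (a sketch; any command will do):

# Runs inside the jnlp-agent slave Pod
echo "Building on $(hostname)"
kubectl get nodes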


2. K8S Resource Management
Chapter 1 mentioned resource management twice: once in the Jenkins master yaml, and once in the Kubernetes Pod Template for the Jenkins slaves. Resource control is one of the fundamentals of K8S configuration, and in practice four settings cover most needs:
- CPU request: the Pod is scheduled onto a node only if that node still has at least this much unreserved CPU
- CPU limit: the most CPU the Pod is allowed to use
- Memory request: the Pod is scheduled onto a node only if that node still has at least this much unreserved memory
- Memory limit: the most memory the Pod is allowed to use
In this project, for instance, GitLab needs a memory request of at least 3G, and Elasticsearch a memory request of at least 2.5G.
For the other services, watch the monitoring add-on in the K8S dashboard over an extended period of real operation and derive a sensible resource range from that.
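As a minimal sketch, the four settings map onto a container spec like this (the image and values are illustrative, not a recommendation):

apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: demo
    image: nginx    # illustrative image
    resources:
      requests:
        cpu: 0.5      # schedulable only where >= 0.5 CPU is unreserved
        memory: 1Gi   # and >= 1Gi memory is unreserved
      limits:
        cpu: 1        # throttled above 1 CPU
        memory: 2Gi   # OOM-killed above 2Gi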


3. Harbor
Running CI in K8S goes roughly like this: Jenkins packages the GitLab code into an image and pushes it to a Docker registry, then Jenkins deploys the application from a yaml file, and the Pods pull their images from that registry. In other words, we are still one Docker registry short of having all the basic software CI needs.
Using Alibaba Cloud's image registry or Docker Hub would save hardware cost, but for data security, transfer speed and ease of operation we would rather host our own. There are not many options: the official Docker Registry v2 is light and simple, while VMware's Harbor is much more featureful.
Harbor offers a friendly UI and supports image replication, which matters a great deal for DevOps. Harbor officially supports deployment into K8S via Helm, but since Harbor is fairly resource-hungry, to save hardware cost I put it on the K8S master instead (the master is not scheduled for Pods, so most of its capacity sits unused). Not the best possible design, but the one that fits our current workload best.
The steps to deploy Harbor on the master node with docker-compose:
- Install docker-compose on 192.168.0.1
pip install docker-compose
- Mount the NAS path onto the data directory of 192.168.0.1 (Harbor's volumes map to the host's /data directory by default, so mounting /data from NAS gives Harbor NAS-backed storage)
mkdir /data
mount -t nfs -o vers=4.0 xxx.xxx.com:/harbor /data
- Install following https://github.com/vmware/harbor/blob/master/docs/installation_guide.md
- Download the Harbor offline installer of the desired version
- Unpack it and edit harbor.cfg
hostname = xx.xx.com
ui_url_protocol = http
email_identity = rfc2595
email_server = xx
email_server_port = xx
email_username = xx
email_password = xx
email_from = xx
email_ssl = xx
email_insecure = xx
harbor_admin_password = xx
- Edit the port mapping in docker-compose.yml; here the container's port 80 is mapped to port 23280 on the host
proxy:
  image: vmware/nginx-photon:v1.5.0
  container_name: nginx
  restart: always
  volumes:
  - ./common/config/nginx:/etc/nginx:z
  networks:
  - harbor
  ports:
  - 23280:80
  depends_on:
  - mysql
  - registry
  - ui
  - log
  logging:
    driver: "syslog"
    options:
      syslog-address: "tcp://127.0.0.1:1514"
      tag: "proxy"
- Run install.sh
- Edit kubeasz's roles/docker/files/daemon.json and add an "insecure-registries" entry, as shown below
{
  "registry-mirrors": ["https://kuamavit.mirror.aliyuncs.com", "https://registry.docker-cn.com", "https://docker.mirrors.ustc.edu.cn"],
  "insecure-registries": ["192.168.0.1:23280"],
  "max-concurrent-downloads": 10,
  "log-driver": "json-file",
  "log-level": "warn",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
Then reinstall docker via kubeasz:
ansible-playbook 03.docker.yml
After this, any node in the cluster can reach Harbor over plain HTTP at 192.168.0.1:23280.
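A quick smoke test from any node (a sketch; the systeminfo endpoint is from Harbor's v1 API, adjust if your version differs):

# Expect an HTTP 200 with basic registry info
curl -s http://192.168.0.1:23280/api/systeminfo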
- Start Harbor on boot (a sketch of the rc.local content follows)
vi /etc/rc.local
chmod +x /etc/rc.local
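The original does not show the rc.local content; a plausible sketch, assuming the NAS mount used above and a hypothetical Harbor installation directory (adjust paths to your layout):

#!/bin/bash
# Hypothetical /etc/rc.local: remount the NAS volume, then bring Harbor up
mount -t nfs -o vers=4.0 xxx.xxx.com:/harbor /data
cd /opt/harbor && docker-compose up -d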
- Create the Secret (K8S uses the Secret to pull images when deploying applications; detailed in part three of this series). On any machine in the K8S cluster, run
kubectl create secret docker-registry regcred --docker-server=192.168.0.1:23280 --docker-username=xxx --docker-password=xxx --docker-email=xxx
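For reference, a Pod consumes the Secret like this (a minimal sketch; the image path is hypothetical, using Harbor's default "library" project, and the full pattern appears again in the EFK example and in part three):

# Hypothetical Pod pulling a private image from Harbor with the Secret above
apiVersion: v1
kind: Pod
metadata:
  name: regcred-test
spec:
  containers:
  - name: app
    image: 192.168.0.1:23280/library/app:1.0.0   # hypothetical image path
  imagePullSecrets:
  - name: regcred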
- Configure SLB (if Harbor is only used on the intranet, SLB and DNS can be skipped)
- Log in to the Harbor management page
- Inside the cluster, verify that Harbor works with
docker login 192.168.0.1:23280
4. EFK
Finally, let's give the cluster a logging system.
The most common logging stack is Elastic's ELK, with Redis or Kafka added as a buffering queue. But Logstash requires a Java runtime, has a large footprint and relatively complex configuration; as the Elastic product line has grown, Logstash has drifted toward log parsing, filtering and formatting, and is no longer a great fit for log collection in containers. The official K8S answer is EFK, where the F stands for Fluentd, a lightweight log collector written in Ruby that supports fewer plugins than Logstash.
There are essentially five ways to collect container logs:
- Outside the container: mount a host directory as the container's log directory and collect on the host.
- Inside the container: run a background log collection service within the container.
- A dedicated logging container: run a separate container that provides a shared log volume and collects from it.
- Over the network: the application ships its logs straight to a log service; a Java program, for example, can use log4j2 to format logs and send them to a remote endpoint.
- Via docker's --log-driver: set the log driver to syslog, fluentd, splunk or another collection service to route logs elsewhere.
Docker's default driver is json-file: everything a container writes to the console is saved as *-json.log files under /var/lib/docker/containers/. The stock EFK strategy is therefore to run one Fluentd per node that reads all the logs under /var/lib/docker/containers/ and ships them to ES. This has two drawbacks: not every service writes its logs to the console, and not every container's logs need collecting. We wanted a more selective, lightweight collection setup, so after weighing the options we went with the commonly recommended "EFK" variant that uses FileBeat as the collector.
FileBeat is written in Go and ships as a single binary with no external dependencies; it has a tiny footprint and high throughput. It is deliberately narrow in scope, though, doing log collection and nothing else. For business cases that need more, you can combine FileBeat for collection, Logstash for parsing and formatting, ES for storage, and Kibana for display.
Collecting container logs with FileBeat works as follows (diagram: the application container and a FileBeat sidecar in the same Pod share a log volume; FileBeat reads the logs and ships them to ES):
In other words, we use the Pod's ephemeral volume, emptyDir: {}, to share data between containers. For example:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: test
  namespace: test
  labels:
    app: test
spec:
  replicas: 2
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: test
    spec:
      containers:
      - image:           # your application image
        name: app
        volumeMounts:
        - name: log-volume
          mountPath: /var/log/app/
      - image:           # your FileBeat image
        name: filebeat
        args: ["-c", "/etc/filebeat.yml"]
        securityContext:
          runAsUser: 0
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: log-volume
          mountPath: /var/log/container/
      volumes:
      - name: config
        configMap:
          defaultMode: 0600
          name: filebeat-config
      - name: log-volume
        emptyDir: {}
      imagePullSecrets:
      - name: regcred
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: test
  labels:
    app: filebeat
data:
  filebeat.yml: |-
    filebeat.inputs:
    - type: log
      enabled: true
      paths:
      - /var/log/container/*.log
    output.elasticsearch:
      hosts: ["xx.xx.xx:9200"]
    tags: ["test"]
To run this FileBeat-based "EFK" system, all we need pre-installed in the K8S cluster are ES and Kibana; FileBeat is created alongside each application and needs no deployment up front. The ES and Kibana manifests can be taken from the official K8S docs; below is my lightly consolidated version:
ES:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: elasticsearch-logging
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
  - ""
  resources:
  - "services"
  - "namespaces"
  - "endpoints"
  verbs:
  - "get"
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: elasticsearch-logging
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
subjects:
- kind: ServiceAccount
  name: elasticsearch-logging
  namespace: kube-system
  apiGroup: ""
roleRef:
  kind: ClusterRole
  name: elasticsearch-logging
  apiGroup: ""
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: es-pv-0
  labels:
    release: es-pv
spec:
  capacity:
    storage: 20Gi
  accessModes:
  - ReadWriteMany
  volumeMode: Filesystem
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: "es-storage-class"
  nfs:
    path: /es/0
    server: xxx.nas.aliyuncs.com
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: es-pv-1
  labels:
    release: es-pv
spec:
  capacity:
    storage: 20Gi
  accessModes:
  - ReadWriteMany
  volumeMode: Filesystem
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: "es-storage-class"
  nfs:
    path: /es/1
    server: xxx.nas.aliyuncs.com
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    version: v5.6.4
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  serviceName: elasticsearch-logging
  replicas: 2
  selector:
    matchLabels:
      k8s-app: elasticsearch-logging
      version: v5.6.4
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-logging
        version: v5.6.4
        kubernetes.io/cluster-service: "true"
    spec:
      serviceAccountName: elasticsearch-logging
      containers:
      - image: registry-vpc.cn-shenzhen.aliyuncs.com/heygears/elasticsearch:5.6.4
        name: elasticsearch-logging
        resources:
          limits:
            cpu: 1
            memory: 2.5Gi
          requests:
            cpu: 0.8
            memory: 2Gi
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        volumeMounts:
        - name: elasticsearch-logging
          mountPath: /data
        env:
        - name: "NAMESPACE"
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
      initContainers:
      - image: alpine:3.6
        command: ["/sbin/sysctl", "-w", "vm.max_map_count=262144"]
        name: elasticsearch-logging-init
        securityContext:
          privileged: true
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-logging
    spec:
      accessModes: [ "ReadWriteMany" ]
      storageClassName: "es-storage-class"
      resources:
        requests:
          storage: 20Gi
---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "Elasticsearch"
spec:
  type: NodePort
  ports:
  - port: 9200
    protocol: TCP
    targetPort: db
    nodePort: xxx
  selector:
    k8s-app: elasticsearch-logging
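Once the Pods are up, a quick health check through the NodePort (a sketch; substitute a node IP and the nodePort you chose above):

# Expect "status": "green" (or "yellow" while replicas settle)
curl "http://<node-ip>:<nodePort>/_cluster/health?pretty"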
Kibana:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: kibana-logging
  template:
    metadata:
      labels:
        k8s-app: kibana-logging
    spec:
      containers:
      - name: kibana-logging
        image: registry-vpc.cn-shenzhen.aliyuncs.com/heygears/kibana:5.6.4
        resources:
          limits:
            cpu: 1
            memory: 1.5Gi
          requests:
            cpu: 0.8
            memory: 1.5Gi
        env:
        - name: ELASTICSEARCH_URL
          value: http://elasticsearch-logging:9200
        - name: SERVER_BASEPATH
          value: /api/v1/namespaces/kube-system/services/kibana-logging/proxy
        - name: XPACK_MONITORING_ENABLED
          value: "false"
        - name: XPACK_SECURITY_ENABLED
          value: "false"
        ports:
        - containerPort: 5601
          name: ui
          protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "Kibana"
spec:
  ports:
  - port: 5601
    protocol: TCP
    targetPort: ui
  selector:
    k8s-app: kibana-logging
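Since SERVER_BASEPATH points at the apiserver proxy path, Kibana is reachable without any extra exposure; for example:

# Open a local proxy to the apiserver, then browse Kibana at:
# http://localhost:8001/api/v1/namespaces/kube-system/services/kibana-logging/proxy
kubectl proxy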
Reposted from the Chinese community article "阿里云Kubernetes实战2 – 搭建基础服务" (Alibaba Cloud Kubernetes in Action 2: Building the Basic Services).