The state of every application deployed in Kubernetes lives in etcd. This data is critical, so it should be backed up regularly in case it ever needs to be restored.
Here we use a Kubernetes CronJob to run the backup. The backup pod must be scheduled onto the same node as the etcd pod, which is enforced with nodeAffinity.
Back up etcd data
apiVersion: batch/v2alpha1
kind: CronJob
metadata:
  name: etcd-disaster-recovery
  namespace: cron
spec:
  schedule: "0 22 * * *"
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            app: etcd-disaster-recovery
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: kubernetes.io/role
                    operator: In
                    values:
                    - master
          containers:
          - name: etcd
            image: coreos/etcd:v3.0.17
            command:
            - sh
            - -c
            - "export ETCDCTL_API=3; \
               etcdctl --endpoints $ENDPOINT snapshot save /snapshot/$(date +%Y%m%d_%H%M%S)_snapshot.db; \
               echo etcd backup success"
            env:
            - name: ENDPOINT
              value: "127.0.0.1:2379"
            volumeMounts:
            - mountPath: "/snapshot"
              name: snapshot
              subPath: data/etcd-snapshot
            - mountPath: /etc/localtime
              name: lt-config
            - mountPath: /etc/timezone
              name: tz-config
          restartPolicy: OnFailure
          volumes:
          - name: snapshot
            persistentVolumeClaim:
              claimName: cron-nas
          - name: lt-config
            hostPath:
              path: /etc/localtime
          - name: tz-config
            hostPath:
              path: /etc/timezone
          hostNetwork: true
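To put the job in place, the manifest can be applied as below. This is a minimal sketch: the filename etcd-backup-cronjob.yaml is an assumption, the cron namespace and the cron-nas PVC must exist before the first run, and batch/v2alpha1 CronJobs only work if the apiserver was started with --runtime-config=batch/v2alpha1=true (on Kubernetes 1.21 and later the same object uses apiVersion batch/v1).

# Namespace for the backup job (the cron-nas PVC must also exist in it)
kubectl create namespace cron

# Apply the CronJob manifest; the filename is hypothetical
kubectl apply -f etcd-backup-cronjob.yaml

# Confirm the CronJob is registered and check the jobs it spawns
kubectl get cronjob -n cron
kubectl get jobs -n cron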
Restore etcd data
The operations on izbp1ijmrejjh7t2wv7fi0z are shown below; the other two nodes (izbp1ijmrejjh7t2wv7fhyz and izbp1ijmrejjh7t2wv7fhzz) are handled the same way.
1. Stop etcd and the apiserver on this node. Both run as static pods, so moving their manifests out of /etc/kubernetes/manifests stops them:
[root@izbp1ijmrejjh7t2wv7fi0z~]# mv /etc/kubernetes/manifests/etcd.yaml ~/etcd_restore/manifests_backup
[root@izbp1ijmrejjh7t2wv7fi0z~]# mv /etc/kubernetes/manifests/kube-apiserver.yaml ~/etcd_restore/manifests_backup
Confirm that the etcd and kube-apiserver containers have exited:
[root@izbp1ijmrejjh7t2wv7fi0z~]# docker ps -a | grep -E ".*(etcd|kube-api).*kube-system.*"
If the command still lists running containers, restart kubelet:
[root@izbp1ijmrejjh7t2wv7fi0z~]# systemctl restart kubelet
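To avoid re-running the check by hand, a small shell loop (a sketch, not part of the original procedure) can wait until no matching container is left running; docker ps without -a lists only running containers:

# Poll until no running etcd/kube-apiserver container remains on this node
while docker ps | grep -qE "(etcd|kube-api).*kube-system"; do
    echo "waiting for the static pods to stop..."
    sleep 5
done
echo "etcd and kube-apiserver containers have stopped"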
2. Restore the etcd backup
[root@izbp1ijmrejjh7t2wv7fi0z~]# rm -rf /var/lib/etcd/member
[root@izbp1ijmrejjh7t2wv7fi0z~]#
ETCDCTL_API=3 etcdctl snapshot restore /mnt/nas/data/etcd-snapshot/20170915_snapshot.db \
--name etcd-master --initial-cluster etcd-master=http://master.k8s:2380,etcd-master1=http://master1.k8s:2380,etcd-master2=http://master2.k8s:2380 \
--initial-cluster-token etcd-cluster \
--initial-advertise-peer-urls http://master.k8s:2380 \
--data-dir /var/lib/etcd
Note:
Each of these parameter values may differ from host to host; they must match the corresponding parameters in that host's /etc/kubernetes/manifests/etcd.yaml.
The data is restored into /var/lib/etcd on the host, because the etcd container started in step 3 mounts this directory.
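Before restoring, it is worth sanity-checking the snapshot file and reading the per-host flag values back out of the saved manifest; etcdctl provides a snapshot status subcommand for this. A sketch, assuming the same snapshot path as above and the manifest backup location from step 1:

# Print hash, revision, total keys and size of the snapshot
ETCDCTL_API=3 etcdctl snapshot status /mnt/nas/data/etcd-snapshot/20170915_snapshot.db

# etcd.yaml was moved away in step 1, so read the flags from the backup copy
grep -E "(name|initial-cluster|initial-advertise-peer-urls)" ~/etcd_restore/manifests_backup/etcd.yaml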
3. Start etcd and the apiserver by moving their manifests back:
[root@izbp1ijmrejjh7t2wv7fi0z~]# mv ~/etcd_restore/manifests_backup/etcd.yaml /etc/kubernetes/manifests/etcd.yaml
[root@izbp1ijmrejjh7t2wv7fi0z~]# mv ~/etcd_restore/manifests_backup/kube-apiserver.yaml /etc/kubernetes/manifests/kube-apiserver.yaml
Verify that etcd and the apiserver are up again:
[root@izbp1ijmrejjh7t2wv7fi0z etcd-snapshot]# kubectl get pod -n kube-system | grep -E ".*(etcd|kube-api).*"
etcd-izbp1ijmrejjh7t2wv7fhyz 1/1 Running 879 23d
etcd-izbp1ijmrejjh7t2wv7fhzz 1/1 Running 106 1d
etcd-izbp1ijmrejjh7t2wv7fi0z 1/1 Running 101 2d
kube-apiserver-izbp1ijmrejjh7t2wv7fhyz 1/1 Running 1 2d
kube-apiserver-izbp1ijmrejjh7t2wv7fhzz 1/1 Running 6 1d
kube-apiserver-izbp1ijmrejjh7t2wv7fi0z 1/1 Running 0 2d
4. Verify that every pod under kube-system is healthy and that the kubelet service logs on each node contain no errors.
Verify that the applications in all namespaces have come back up.
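A quick way to run both checks (a sketch; the first command filters out healthy pods, so any output needs attention):

# Any pod not Running/Completed across all namespaces shows up here
kubectl get pods --all-namespaces | grep -vE "Running|Completed"

# Scan the kubelet log on each node for recent errors
journalctl -u kubelet --since "10 min ago" | grep -iE "error|fail"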