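Introduction:
CPFS (Cloud Paralleled File System) is a parallel file system. CPFS stores data across multiple data nodes in a cluster and lets multiple clients access it at the same time, so it can provide high-IOPS, high-throughput, low-latency storage for large high-performance computing clusters.
For a detailed product description of CPFS, see:
https://help.aliyun.com/product/111536.html
CPFS is a shared storage service, which suits the resource-sharing and high-performance requirements of container workloads. Container Service combined with CPFS is a recommended solution for high-performance scenarios such as big data, AI, and genomics computing.
This document describes how to install the Flexvolume plugin in Container Service and how to provide CPFS to applications (Pods) through CPFS volumes.
To use CPFS with CSI instead, see: https://github.com/kubernetes-sigs/alibaba-cloud-csi-driver/blob/master/docs/cpfs.md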
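Plugin deployment:
1. Limitations:
Using CPFS in Container Service depends on two driver components: the CPFS container driver and the CPFS client driver.
CPFS container driver: this is the Flexvolume-cpfs plugin. It is compatible with all CentOS versions, and deploying Flexvolume-cpfs installs it.
CPFS client driver: this is the client used when mounting CPFS (similar to nfs-client). It depends strongly on the operating system kernel. In a container environment, the client driver can be installed in the following ways:
Install the driver manually; see https://help.aliyun.com/document_detail/131060.html
Let the Flexvolume-cpfs deployment install the driver automatically. This supports only some kernel versions.
In container environments, the CPFS client driver can currently be installed on the following kernel versions: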
3.10.0-957.5.1
3.10.0-957.21.3
3.10.0-1062.9.1
Run uname -r on a node to check its kernel version.
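Flexvolume currently only installs the CPFS client driver and does not upgrade it: if cpfs-client is already present on a node, the installation is skipped.
Upgrading Flexvolume upgrades only the Flexvolume driver (the container driver), not cpfs-client.
On a node where the cpfs-client or lustre driver is already deployed, installing cpfs flexvolume does not install a newer cpfs-client.
The client must be upgraded manually; see the CPFS documentation (https://help.aliyun.com/document_detail/131060.html).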
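The kernel restriction above can be checked on a node before deploying the plugin. The following is a minimal sketch, not part of the plugin: the function name is_supported_kernel is hypothetical, and the sketch assumes that uname -r prints one of the listed releases followed by a suffix such as .el7.x86_64.

```shell
# Kernel releases listed in this document as supported for automatic
# installation of the CPFS client driver.
SUPPORTED_KERNELS="3.10.0-957.5.1 3.10.0-957.21.3 3.10.0-1062.9.1"

# is_supported_kernel RELEASE (hypothetical helper)
# Succeeds if RELEASE equals a supported release or starts with one
# followed by a dot (for example "3.10.0-957.21.3.el7.x86_64").
is_supported_kernel() {
  release="$1"
  for k in $SUPPORTED_KERNELS; do
    case "$release" in
      "$k" | "$k".*) return 0 ;;
    esac
  done
  return 1
}

# Check the kernel of the current node.
if is_supported_kernel "$(uname -r)"; then
  echo "kernel $(uname -r): automatic client installation is supported"
else
  echo "kernel $(uname -r): install the CPFS client manually"
fi
```

Matching on the release followed by a dot avoids treating a different patch level (for example 3.10.0-957.5.10) as supported.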
2. Deployment template:
Save the template below as flexvolume-cpfs.yaml and deploy it to the cluster with kubectl:
# kubectl create -f flexvolume-cpfs.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: flexvolume-cpfs
  namespace: kube-system
  labels:
    k8s-volume: flexvolume-cpfs
spec:
  selector:
    matchLabels:
      name: acs-flexvolume-cpfs
  template:
    metadata:
      labels:
        name: acs-flexvolume-cpfs
    spec:
      hostPID: true
      hostNetwork: true
      tolerations:
      - operator: "Exists"
      priorityClassName: system-node-critical
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: type
                operator: NotIn
                values:
                - virtual-kubelet
      containers:
      - name: acs-flexvolume
        image: registry.cn-hangzhou.aliyuncs.com/acs/flexvolume:v1.14.8.71-22f141a-aliyun
        imagePullPolicy: Always
        securityContext:
          privileged: true
        env:
        - name: ACS_CPFS
          value: "true"
        - name: FIX_ISSUES
          value: "false"
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - ls /acs/flexvolume
          failureThreshold: 8
          initialDelaySeconds: 15
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 15
        volumeMounts:
        - name: usrdir
          mountPath: /host/usr/
        - name: etcdir
          mountPath: /host/etc/
        - name: logdir
          mountPath: /var/log/alicloud/
        - mountPath: /var/lib/kubelet
          mountPropagation: Bidirectional
          name: kubeletdir
      volumes:
      - name: usrdir
        hostPath:
          path: /usr/
      - name: etcdir
        hostPath:
          path: /etc/
      - name: logdir
        hostPath:
          path: /var/log/alicloud/
      - hostPath:
          path: /var/lib/kubelet
          type: Directory
        name: kubeletdir
  updateStrategy:
    type: RollingUpdate
3. Check the deployment:
Check the status of the storage plugins in the cluster, for example:
# kubectl get pod -nkube-system | grep flex
flexvolume-97psk 1/1 Running 0 27m
flexvolume-cpfs-dgxfq 1/1 Running 0 98s
flexvolume-cpfs-qpbcb 1/1 Running 0 98s
flexvolume-cpfs-vlrf9 1/1 Running 0 98s
flexvolume-cpfs-wklls 1/1 Running 0 98s
flexvolume-cpfs-xtl9b 1/1 Running 0 98s
flexvolume-j8zjr 1/1 Running 0 27m
flexvolume-pcg4l 1/1 Running 0 27m
flexvolume-tjxxn 1/1 Running 0 27m
flexvolume-x7ljw 1/1 Running 0 27m
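Pods whose names start with flexvolume-cpfs belong to the CPFS volume plugin deployed above.
flexvolume Pods without cpfs in their names belong to the NAS, cloud disk, and OSS volume plugin that the cluster deploys by default. The two plugins can be deployed side by side.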
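The Pod listing above can also be filtered automatically. The following is a minimal sketch: cpfs_pods_not_ready is a hypothetical helper, and it assumes the default kubectl get pod column layout (NAME, READY, STATUS, ...) with one container per plugin Pod. The Pending line in the sample input is made up for illustration.

```shell
# cpfs_pods_not_ready (hypothetical helper)
# Reads `kubectl get pod -n kube-system` output on stdin and prints the name
# of every flexvolume-cpfs Pod that is not "1/1 Running".
cpfs_pods_not_ready() {
  awk '$1 ~ /^flexvolume-cpfs-/ && ($2 != "1/1" || $3 != "Running") { print $1 }'
}

# Sample input; on a real cluster, pipe kubectl output instead:
#   kubectl get pod -n kube-system | cpfs_pods_not_ready
cpfs_pods_not_ready <<'EOF'
flexvolume-97psk        1/1   Running   0   27m
flexvolume-cpfs-dgxfq   1/1   Running   0   98s
flexvolume-cpfs-qpbcb   0/1   Pending   0   98s
EOF
# prints: flexvolume-cpfs-qpbcb
```

An empty result means every CPFS plugin Pod is ready. kubectl -n kube-system rollout status daemonset/flexvolume-cpfs is an alternative that waits for the rollout to finish.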
On a cluster node, check whether cpfs-client has been installed:
# rpm -qa | grep cpfs
kmod-cpfs-client-2.10.8-202.el7.x86_64
cpfs-client-2.10.8-202.el7.x86_64
Check whether mount.lustre has been installed:
# which mount.lustre
/usr/sbin/mount.lustre
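The two node checks above can be combined into one script. This is a minimal sketch under the assumption that a complete client installation consists of the cpfs-client and kmod-cpfs-client packages shown above plus the mount.lustre binary; has_cpfs_client is a hypothetical helper name.

```shell
# has_cpfs_client (hypothetical helper)
# Reads package names (one per line, as printed by `rpm -qa`) on stdin and
# succeeds only if both the client package and its kernel module are listed.
has_cpfs_client() {
  pkgs="$(cat)"
  printf '%s\n' "$pkgs" | grep -q '^cpfs-client-' &&
    printf '%s\n' "$pkgs" | grep -q '^kmod-cpfs-client-'
}

# Run the check only where rpm is available (for example on a CentOS node).
if command -v rpm >/dev/null 2>&1; then
  if rpm -qa | has_cpfs_client && command -v mount.lustre >/dev/null 2>&1; then
    echo "cpfs-client and mount.lustre are installed"
  else
    echo "cpfs-client is missing or incomplete on this node"
  fi
fi
```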
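Using CPFS volumes:
To use a CPFS volume in ACK, first create a CPFS volume and a mount point in the CPFS console; see: https://help.aliyun.com/document_detail/111860.html
When creating the CPFS mount point, select the same VPC that the ACK cluster uses.
The examples below assume the following mount point and file system ID:
Mount point: cpfs-****-alup.cn-shenzhen.cpfs.nas.aliyuncs.com@tcp:cpfs-***-ws5v.cn-shenzhen.cpfs.nas.aliyuncs.com@tcp
File system ID: 0237ef41
1. PV template: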
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-cpfs
  labels:
    alicloud-pvname: pv-cpfs
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  flexVolume:
    driver: "alicloud/cpfs"
    options:
      server: "cpfs-****-alup.cn-shenzhen.cpfs.nas.aliyuncs.com@tcp:cpfs-***-ws5v.cn-shenzhen.cpfs.nas.aliyuncs.com@tcp"
      fileSystem: "0237ef41"
      subPath: "/k8s"
      options: "ro"
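Where:
server: the CPFS mount point.
fileSystem: the CPFS file system ID.
subPath: the CPFS subdirectory to mount, relative to the root of the file system.
options: optional; the mount options (the example uses ro for a read-only mount).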
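When several PVs point at different subdirectories, the manifest can be generated from the parameters described above. The following is a minimal sketch: render_cpfs_pv is a hypothetical helper, the 5Gi capacity and ReadWriteMany access mode are copied from the PV template in this document, and the masked server value is kept exactly as it appears there.

```shell
# render_cpfs_pv NAME SERVER FILESYSTEM_ID SUBPATH [MOUNT_OPTIONS]
# (hypothetical helper) Prints a PersistentVolume manifest for the
# alicloud/cpfs Flexvolume driver. MOUNT_OPTIONS is optional.
render_cpfs_pv() {
  name="$1"; server="$2"; fs_id="$3"; sub_path="$4"; mount_opts="${5:-}"
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ${name}
  labels:
    alicloud-pvname: ${name}
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  flexVolume:
    driver: "alicloud/cpfs"
    options:
      server: "${server}"
      fileSystem: "${fs_id}"
      subPath: "${sub_path}"
EOF
  # Emit the mount options line only when options were given.
  if [ -n "$mount_opts" ]; then
    printf '      options: "%s"\n' "$mount_opts"
  fi
}

# Print the manifest for the example values used in this document.
render_cpfs_pv pv-cpfs \
  "cpfs-****-alup.cn-shenzhen.cpfs.nas.aliyuncs.com@tcp:cpfs-***-ws5v.cn-shenzhen.cpfs.nas.aliyuncs.com@tcp" \
  0237ef41 /k8s ro
```

To create the PV directly, pipe the output into kubectl create -f -.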
2. PVC and application templates:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-cpfs
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  selector:
    matchLabels:
      alicloud-pvname: pv-cpfs
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nas-cpfs
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        volumeMounts:
          - name: pvc-cpfs
            mountPath: "/data"
      volumes:
        - name: pvc-cpfs
          persistentVolumeClaim:
            claimName: pvc-cpfs
3. Create the application:
After creating the templates above, check that the Pod is running:
# kubectl get pod
NAME READY STATUS RESTARTS AGE
nas-cpfs-79964997f5-kzrtp 1/1 Running 0 45s
Open a shell in the Pod and check the mounted directory:
# kubectl exec -ti nas-cpfs-79964997f5-kzrtp -- sh
# mount | grep k8s
192.168.1.12@tcp:192.168.1.10@tcp:/0237ef41/k8s on /data type lustre (ro,lazystatfs)
Log on to the node that runs the Pod and check the mounted directory:
# mount | grep cpfs
192.168.1.12@tcp:192.168.1.10@tcp:/0237ef41/k8s on /var/lib/kubelet/pods/c4684de2-26ce-11ea-abbd-00163e12e203/volumes/alicloud~cpfs/pv-cpfs type lustre (ro,lazystatfs)
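The mount output shown above can be reduced to the fields that matter for verification: source, mount target, and mount options. This is a minimal sketch that assumes the Linux mount line format seen above ("SOURCE on TARGET type FSTYPE (OPTIONS)") and paths without spaces; lustre_mounts is a hypothetical helper.

```shell
# lustre_mounts (hypothetical helper)
# Reads `mount` output on stdin and prints "SOURCE TARGET OPTIONS" for every
# file system of type lustre.
lustre_mounts() {
  awk '$2 == "on" && $4 == "type" && $5 == "lustre" { print $1, $3, $6 }'
}

# Sample input taken from the Pod output in this document; inside the Pod or
# on the node, run `mount | lustre_mounts` instead.
lustre_mounts <<'EOF'
192.168.1.12@tcp:192.168.1.10@tcp:/0237ef41/k8s on /data type lustre (ro,lazystatfs)
EOF
# prints: 192.168.1.12@tcp:192.168.1.10@tcp:/0237ef41/k8s /data (ro,lazystatfs)
```

The source ends with the file system ID and subPath from the PV (/0237ef41/k8s), and the options include ro, which matches the options value set in the PV.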