开发者社区> 萧元> 正文

阿里云容器服务Kubernetes 基于GPU指标自动伸缩

简介: ### 基于GPU的指标扩缩容 在深度学习训练中,训练完成的模型,通过Serving服务提供模型服务。本文介绍如何构建弹性自动伸缩的Serving服务。 Kubernetes 支持HPA模块进行容器伸缩,默认支持CPU和内存等指标。
+关注继续查看

基于GPU的指标扩缩容

在深度学习训练中,训练完成的模型,通过Serving服务提供模型服务。本文介绍如何构建弹性自动伸缩的Serving服务。

Kubernetes 支持HPA模块进行容器伸缩,默认支持CPU和内存等指标。原生的HPA基于Heapster,不支持GPU指标的伸缩,但是支持通过CustomMetrics的方式进行HPA指标的扩展。我们可以通过部署一个基于Prometheus Adapter 作为CustomMetricServer,它能将Prometheus指标注册的APIServer接口,提供HPA调用。 通过配置,HPA将CustomMetric作为扩缩容指标, 可以进行GPU指标的弹性伸缩。

前提

您需要创建一个容器服务Kubernets集群,并完成GPU监控部分的部署 阿里云容器Kubernetes监控- GPU监控, 完成部署Promethues用于监控GPU使用指标,我们将通过Prometheus 里的监控数据作为参考指标进行弹性伸缩。

注意

当HPA配置自定义监控指标进行伸缩指标后, 将无法使用原生HPA基于Heapster的CPU和Memory的伸缩。

部署

登录master上执行脚本,生成Prometheus Adapter的证书

#!/usr/bin/env bash
set -e
set -o pipefail
set -u
b64_opts='--wrap=0'
# go get -v -u github.com/cloudflare/cfssl/cmd/...

export PURPOSE=metrics
openssl req -x509 -sha256 -new -nodes -days 365 -newkey rsa:2048 -keyout ${PURPOSE}-ca.key -out ${PURPOSE}-ca.crt -subj "/CN=ca"
echo '{"signing":{"default":{"expiry":"43800h","usages":["signing","key encipherment","'${PURPOSE}'"]}}}' > "${PURPOSE}-ca-config.json"

export SERVICE_NAME=custom-metrics-apiserver
export ALT_NAMES='"custom-metrics-apiserver.monitoring","custom-metrics-apiserver.monitoring.svc"'
echo "{\"CN\":\"${SERVICE_NAME}\", \"hosts\": [${ALT_NAMES}], \"key\": {\"algo\": \"rsa\",\"size\": 2048}}" | \
           cfssl gencert -ca=metrics-ca.crt -ca-key=metrics-ca.key -config=metrics-ca-config.json - | cfssljson -bare apiserver

cat <<-EOF > cm-adapter-serving-certs.yaml
apiVersion: v1
kind: Secret
metadata:
  name: cm-adapter-serving-certs
data:
  serving.crt: $(base64 ${b64_opts} < apiserver.pem)
  serving.key: $(base64 ${b64_opts} < apiserver-key.pem)
EOF

kubectl -n kube-system apply -f cm-adapter-serving-certs.yaml

部署Prometheus CustomMetric Adapter

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: custom-metrics-apiserver
  name: custom-metrics-apiserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: custom-metrics-apiserver
  template:
    metadata:
      labels:
        app: custom-metrics-apiserver
      name: custom-metrics-apiserver
    spec:
      serviceAccountName: custom-metrics-apiserver
      containers:
      - name: custom-metrics-apiserver
        image: registry.cn-beijing.aliyuncs.com/test-hub/k8s-prometheus-adapter-amd64
        args:
        - --secure-port=6443
        - --tls-cert-file=/var/run/serving-cert/serving.crt
        - --tls-private-key-file=/var/run/serving-cert/serving.key
        - --logtostderr=true
        - --prometheus-url=http://prometheus-svc.kube-system.svc.cluster.local:9090/
        - --metrics-relist-interval=1m
        - --v=10
        - --config=/etc/adapter/config.yaml
        ports:
        - containerPort: 6443
        volumeMounts:
        - mountPath: /var/run/serving-cert
          name: volume-serving-cert
          readOnly: true
        - mountPath: /etc/adapter/
          name: config
          readOnly: true
        - mountPath: /tmp
          name: tmp-vol
      volumes:
      - name: volume-serving-cert
        secret:
          secretName: cm-adapter-serving-certs
      - name: config
        configMap:
          name: adapter-config
      - name: tmp-vol
        emptyDir: {}
---
kind: ServiceAccount
apiVersion: v1
metadata:
  name: custom-metrics-apiserver
---
apiVersion: v1
kind: Service
metadata:
  name: custom-metrics-apiserver
spec:
  ports:
  - port: 443
    targetPort: 6443
  selector:
    app: custom-metrics-apiserver
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: custom-metrics-server-resources
rules:
- apiGroups:
  - custom.metrics.k8s.io
  resources: ["*"]
  verbs: ["*"]
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: adapter-config
data:
  config.yaml: |
    rules:
    - seriesQuery: '{uuid!=""}'
      resources:
        overrides:
          node_name: {resource: "node"}
          pod_name: {resource: "pod"}
          namespace_name: {resource: "namespace"}
      name:
        matches: ^nvidia_gpu_(.*)$
        as: "${1}_over_time"
      metricsQuery: ceil(avg_over_time(<<.Series>>{<<.LabelMatchers>>}[3m]))
    - seriesQuery: '{uuid!=""}'
      resources:
        overrides:
          node_name: {resource: "node"}
          pod_name: {resource: "pod"}
          namespace_name: {resource: "namespace"}
      name:
        matches: ^nvidia_gpu_(.*)$
        as: "${1}_current"
      metricsQuery: <<.Series>>{<<.LabelMatchers>>}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: custom-metrics-resource-reader
rules:
- apiGroups:
  - ""
  resources:
  - namespaces
  - pods
  - services
  verbs:
  - get
  - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: hpa-controller-custom-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: custom-metrics-server-resources
subjects:
- kind: ServiceAccount
  name: horizontal-pod-autoscaler
  namespace: kube-system
角色授权, 如果使用kube-system以外的命名空间, 需要修改模板中的namespace字段:
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
  namespace: kube-system
spec:
  service:
    name: custom-metrics-apiserver
    namespace: kube-system # 如果部署kube-system以外的Namespace 需要修改此处
  group: custom.metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: custom-metrics-resource-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: custom-metrics-resource-reader
subjects:
- kind: ServiceAccount
  name: custom-metrics-apiserver
  namespace: kube-system # 如果部署kube-system 以外的Namespace 需要修改此处
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: custom-metrics:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: custom-metrics-apiserver
  namespace: kube-system # 如果部署kube-system 以外的Namespace 需要修改此处
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: custom-metrics-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: custom-metrics-apiserver
  namespace: kube-system

部署完成后,可以通过customMetric的ApiServer调用,验证Prometheus Adapter部署成功

# kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/temperature_celsius_current"
{"kind":"MetricValueList","apiVersion":"custom.metrics.k8s.io/v1beta1","metadata":{"selfLink":"/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%2A/temperature_celsius_current"},"items":[]}

修改controller-manager配置,使用CustomMetric 作为hpa伸缩指标

登录到三个master上,分别执行脚本,修改ApiServer的HPA配置

sed -i 's/--horizontal-pod-autoscaler-use-rest-clients=false/--horizontal-pod-autoscaler-use-rest-clients=true/g' /etc/kubernetes/manifests/kube-controller-manager.yaml

检测修改结果

# kubectl -n kube-system describe po -l component=kube-controller-manager | grep 'horizontal-pod-autoscaler-use-rest-clients'

      --horizontal-pod-autoscaler-use-rest-clients=true
      --horizontal-pod-autoscaler-use-rest-clients=true
      --horizontal-pod-autoscaler-use-rest-clients=true

伸缩指标

至此,我们已经部署了一个Prometheus 的CustomMetric Server, 我们通过adapter-config这个configMap配置Prometheus 提供暴露给ApiServer 的指标
支持以下GPU指标:

Prometheus指标 含义 HPA指标 HPA指标(3分钟平均值)
nvidia_gpu_duty_cycle GPU使用率 nvidia_gpu_duty_cycle_current nvidia_gpu_duty_cycle_over_time
nvidia_gpu_memory_total_bytes GPU总内存 nvidia_gpu_memory_total_bytes_current nvidia_gpu_memory_total_bytes_over_time
nvidia_gpu_memory_used_bytes GPU已分配内存 nvidia_gpu_memory_used_bytes_current nvidia_gpu_memory_used_bytes_over_time
nvidia_gpu_power_usage_milliwatts GPU耗电量 nvidia_gpu_power_usage_milliwatts_current nvidia_gpu_power_usage_milliwatts_over_time
nvidia_gpu_temperature_celsius GPU温度 temperature_celsius_current temperature_celsius_over_time

使用GPU指标进行自动伸缩

部署一个deployment

apiVersion: v1
kind: Service
metadata:
  name:  fast-style-transfer-serving
  labels:
    app: tensorflow-serving
spec:
  ports:
    - name: http-serving
      port: 5000
      targetPort: 5000
  selector:
    app: tensorflow-serving
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: fast-style-transfer-serving
  labels:
    app: tensorflow-serving
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: tensorflow-serving
    spec:
      containers:
        - name: serving
          image: "registry.cn-hangzhou.aliyuncs.com/tensorflow-samples/fast-style-transfer-serving:la_muse"
          command: ["python", "app.py"]
          resources:
            limits:
              nvidia.com/gpu: 1

创建一个基于GPU指标伸缩的HPA

kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
  name: gpu-hpa
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: fast-style-transfer-serving
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metricName: duty_cycle_current # 指标为pod的平均GPU使用率
      targetAverageValue: 40

查看HPA的指标以及指标值

# kubectl get hpa
NAME      REFERENCE                                TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
gpu-hpa   Deployment/fast-style-transfer-serving   0 / 40    1         10        1          37s

部署一个fast-style-transfer的压测应用

这个应用会不断向serving发送图片,用于模拟压力测试

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: fast-style-transfer-press
  labels:
    app: fast-style-transfer-press
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: fast-style-transfer-press
    spec:
      containers:
        - name: serving
          image: "registry.cn-hangzhou.aliyuncs.com/xiaozhou/fast-style-transfer-press:v0"
          env:
            - name: SERVER_IP
              value: fast-style-transfer-serving
            - name: BATCH_SIZE
              value: "100"
            - name: TOTAL_SIZE
              value: "12000"

压测部署完成后,可以在监控面板的【GPU应用监控】看到指标变化

image.png | left | 398x234

也能够通过HPA看到指标变化

# kubectl get hpa
NAME             REFERENCE                 TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
sample-gpu-hpa   Deployment/demo-service   63 / 30    1         10        1          3m

压测一段时间后可以看到pod扩容

NAME                                           READY     STATUS    RESTARTS   AGE
fast-style-transfer-press-69c48966d8-dqf5n     1/1       Running   0          4m
fast-style-transfer-serving-84587c94b7-7xp2d   1/1       Running   0          5m
fast-style-transfer-serving-84587c94b7-slbdn   1/1       Running   0          47s

监控界面也可以看到扩容的的pod以及GPU指标:

image.png | left | 434x253

将压测容器停止

执行以下命令,将压测应用停止:

kubectl scale deploy fast-style-transfer-press --replicas=0 # 将压测应用容器缩容为0

(也可以在控制台上执行部署伸缩操作)

在HPA上检查dutyCycle指标变化为0

kubectl get hpa
NAME      REFERENCE                                TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
gpu-hpa   Deployment/fast-style-transfer-serving   0 / 40    1         10        3          9m

一段时间后检查容器是否成功缩容

kubectl get po
NAME                                           READY     STATUS    RESTARTS   AGE
fast-style-transfer-serving-84587c94b7-7xp2d   1/1       Running   0          10m

版权声明:本文内容由阿里云实名注册用户自发贡献,版权归原作者所有,阿里云开发者社区不拥有其著作权,亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容,填写侵权投诉表单进行举报,一经查实,本社区将立刻删除涉嫌侵权内容。

相关文章
基于阿里云服务器的我在校园班委自动管理系统
利用python构造程序放置于阿里云服务器,实现我在校园固定时间点,自动通过邮箱提醒未完成日检日报、签到的同学完成打卡,以及程序的实现过程。
114 0
干货|数据库自治服务DAS首创SQL请求行为识别功能,全自动定位SQL异常
DAS(Database autonomy service)为上百万数据库实例的稳定运行保驾护航,其中精准定位数据库运行过程中的异常SQL是DAS最基本的功能。数据库90%以上的问题都来源于数据库的异常请求,无论是双十一的集团海量交易请求行为,还是用户业务变化导致的请求行为变化,每时每刻都影响着数据库的性能。
341 0
阿里云容器服务Kubernetes实现应用自动部署
## 前言 CICD是研发效率提升必不可少的一环, 要提高迭代效率,就要减少开发到部署中等待和人工操作的时间与步骤。 通过容器以及周边产品集成,我们更能将代码开发完成到部署时间极大缩短, 并将一切手工操作自动化。
5425 0
阿里云Kubernetes容器服务上体验Knative
Knative Serving是一种可缩放至零、请求驱动的计算运行环境,构建在 Kubernetes 和 Istio 之上,支持为 serverless 应用、函数提供部署与服务。Knative Serving的目标是为Kubernetes提供扩展功能,用于部署和运行无服务器工作负载。
5072 0
深度解析容器服务Kubernetes集群容量以及网络规划
#背景 在目前云原生技术被如火如荼的大规模使用的过程中。越来越多的用户都会使用Kubernetes集群去部署其应用。但是在这个过程中,如果由于早期对于容量和网络的规划不当,可能造成实际生产中实践中,不能满足业务的真实需要。如果此时在重新规划就面临着集群重建、应用迁移的诸多事项,这样不仅仅浪费了大量的精力,甚至可能会造成业务有一定的中断。因此,为了使得广大使用者可以更加深入的理解阿里云容器服务Ku
1005 0
CICD联动阿里云容器服务Kubernetes实践之CodePipeline篇
通过CodePipeline可以构建您的代码工作流模板,配置从应用编译到容器镜像构建和推送,再到Kubernetes应用的发布,打通代码应用发布全过程自动化。
3446 0
CodePipeline流水线实现自动发布Serverless Kubernetes
本文档以构建一个 Java 软件项目并部署到 阿里云容器服务Serverless Kubernetes集群 为例说明如何使用 CodePipeline。 使用说明 开通使用 CodePipeline 产品。
4448 0
+关注
萧元
Hello
20
文章
0
问答
文章排行榜
最热
最新
相关电子书
更多
OceanBase 入门到实战教程
立即下载
阿里云图数据库GDB,加速开启“图智”未来.ppt
立即下载
实时数仓Hologres技术实战一本通2.0版(下)
立即下载