Containers Begin the Data Services Journey, Part 3
Kubernetes QoS for Co-locating Online Applications and Offline Big-Data Workloads
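Overview
This article is based on a talk given at the 2018 big data summit. It describes how online and offline workloads can be co-located on ACK (Alibaba Cloud Container Service for Kubernetes) by flexibly combining namespaces, cgroups, and quotas, which raises overall resource utilization. The platform also supports adjusting resource limits dynamically, so the resource watermark of the offline portion can be scaled up or down. Combined with HPA and resource monitoring, this allows offline workloads to be squeezed out automatically when online workloads need the capacity.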
Online workloads: web applications / databases
  QoS class: Guaranteed
  limit = request
Offline workloads: Spark / MapReduce / Deep Learning
  QoS class: Burstable
  request < limit
A Guaranteed QoS design in the namespace secures the performance of online web applications and databases

    qosClass: Guaranteed
    resources:
      requests:
        cpu: 300m
        memory: 512Mi
      limits:
        cpu: 300m
        memory: 512Mi
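For context, the resources fragment above normally sits inside a container spec. The following is a minimal sketch of a complete Pod manifest, assuming a hypothetical pod name `web` and a placeholder image `nginx`; only the `online` namespace used in this article and the resource values come from the text. Kubernetes derives the Guaranteed class itself and reports it under `status.qosClass`; it is not set in the spec. A pod qualifies when every container has CPU and memory limits equal to its requests.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web              # hypothetical name, not from the article
  namespace: online
spec:
  containers:
    - name: web
      image: nginx       # placeholder image, an assumption
      resources:
        requests:
          cpu: 300m
          memory: 512Mi
        limits:          # identical to requests, so the pod is Guaranteed
          cpu: 300m
          memory: 512Mi
```

Once the pod is running, `kubectl get pod web -n online -o jsonpath='{.status.qosClass}'` should print the class that was assigned.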
Isolation and resource control with namespaces (and cgroups)
Namespace quota (hard limits):
    cpu: 12
    memory: 16Gi
    Name:         online
    Labels:       <none>
    Annotations:  <none>
    Status:       Active

    Resource Quotas
     Name:                   quota
     Resource                Used   Hard
     --------                ---    ---
     configmaps              1      100
     cpu                     550m   12
     memory                  768Mi  16Gi
     persistentvolumeclaims  1      100
     pods                    2      100
     replicationcontrollers  0      10
     requests.storage        20Gi   1024G
     secrets                 3      100
     services                2      10

    Resource Limits
     Type       Resource  Min  Max  Default Request  Default Limit  Max Limit/Request Ratio
     ----       --------  ---  ---  ---------------  -------------  -----------------------
     Container  cpu       -    -    100m             4              -
     Container  memory    -    -    256Mi            16Gi           -
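The namespace description above could be produced by manifests along the following lines. This is a sketch reconstructed from the printed output, not the configuration used in the talk: the quota name `quota` and all values are taken from the output, while the LimitRange name `limits` is a hypothetical choice. In a LimitRange, `defaultRequest` maps to the "Default Request" column and `default` to the "Default Limit" column.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: quota
  namespace: online
spec:
  hard:
    configmaps: "100"
    cpu: "12"                 # total CPU requests allowed in the namespace
    memory: 16Gi              # total memory requests allowed in the namespace
    persistentvolumeclaims: "100"
    pods: "100"
    replicationcontrollers: "10"
    requests.storage: 1024G
    secrets: "100"
    services: "10"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: limits                # hypothetical name, not shown in the output
  namespace: online
spec:
  limits:
    - type: Container
      defaultRequest:         # applied when a container declares no request
        cpu: 100m
        memory: 256Mi
      default:                # applied when a container declares no limit
        cpu: "4"
        memory: 16Gi
```

Because the quota constrains `cpu` and `memory`, pods that declare no requests for them would be rejected at admission; the LimitRange defaults fill those values in so such pods can still be created.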
A Burstable QoS design in the namespace gives elastic resource control for offline big-data computation

    QoS Class: Burstable
    resources:
      requests:
        cpu: "100m"
        memory: "512Mi"
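As an illustration, the requests-only fragment above can be placed in a Pod manifest like the sketch below. The pod name `batch-job`, the container name, the placeholder image `busybox`, and the `sleep` command are all assumptions made for the example; the `batch` namespace and the request values come from the article. A pod that sets requests without matching limits is classified as Burstable, and if the namespace defines default limits they are applied to the container at admission.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: batch-job          # hypothetical name, not from the article
  namespace: batch
spec:
  restartPolicy: Never
  containers:
    - name: worker         # hypothetical container name
      image: busybox       # placeholder image, an assumption
      command: ["sh", "-c", "sleep 3600"]
      resources:
        requests:          # only requests are declared, as in the article
          cpu: "100m"
          memory: "512Mi"
        # No limits are declared here. Any limit the container ends up with
        # comes from namespace defaults, which keeps request < limit.
```

The low request lets the scheduler pack many offline pods onto spare capacity, while the higher limit lets them burst when the node is otherwise idle.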
Namespace isolation and resource control
Namespace quota (hard limits):
    cpu: 4
    memory: 32Gi
    Name:         batch
    Labels:       <none>
    Annotations:  <none>
    Status:       Active

    Resource Quotas
     Name:                   quota
     Resource                Used   Hard
     --------                ---    ---
     configmaps              0      100
     cpu                     400m   4
     memory                  2Gi    32Gi
     persistentvolumeclaims  0      100
     pods                    4      100
     replicationcontrollers  0      10
     requests.storage        0      1024G
     secrets                 1      100
     services                2      10

    Resource Limits
     Type       Resource  Min  Max  Default Request  Default Limit  Max Limit/Request Ratio
     ----       --------  ---  ---  ---------------  -------------  -----------------------
     Container  cpu       -    -    100m             1              -
     Container  memory    -    -    256Mi            16Gi           -
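The offline watermark mentioned in the overview is governed mainly by the `cpu` and `memory` entries of this quota. The sketch below reconstructs the `batch` quota from the output above; it is an assumption about how the talk's setup was declared, not a copy of it. Raising or lowering the two marked values and re-applying the manifest is one way to scale the offline share.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: quota
  namespace: batch
spec:
  hard:
    cpu: "4"                  # lower to shrink the offline CPU watermark
    memory: 32Gi              # lower to shrink the offline memory watermark
    configmaps: "100"
    persistentvolumeclaims: "100"
    pods: "100"
    replicationcontrollers: "10"
    requests.storage: 1024G
    secrets: "100"
    services: "10"
```

A ResourceQuota is enforced when pods are admitted, so lowering it blocks new offline pods but does not evict running ones. The reduced watermark takes full effect as existing batch pods finish or are scaled down.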