🎹 About me: Hi everyone, I'm 金鱼哥, a CSDN Rising Star Creator in the operations field, Huawei Cloud Expert, and Alibaba Cloud Community Expert Blogger
📚 Certifications: CCNA, HCNP, CSNA (Network Analyst), China Soft Exam junior and intermediate Network Engineer, RHCSA, RHCE, RHCA, RHCI, ITIL 😜
💬 Motto: Hard work does not guarantee success, but success requires hard work 🔥🎈 Support me with a like 👍, a bookmark ⭐️, or a comment 📝
📜 Resource Limits
📑 Pod Resource Limits
A pod can include resource requests and resource limits:
- Resource requests
Used for scheduling, and to guarantee that a pod does not run on a node with less than the requested amount of compute resources. The scheduler tries to find a node with enough free compute resources to satisfy the pod's requests.
- Resource limits
Used to prevent a pod from exhausting all of a node's compute resources. The node running the pod configures the Linux kernel cgroups feature to enforce the pod's resource limits.
Although resource requests and resource limits are part of the pod definition, it is usually recommended to set them on the deployment configuration (dc). Recommended OpenShift practice is that pods should not be created directly, but by a dc.
📑 Applying Quotas
OCP can enforce quotas that track and limit two kinds of resource usage:
- Object counts: the number of Kubernetes resources, such as pods, services, and routes.
- Compute resources: the amount of physical or virtual hardware resources, such as CPU, memory, and storage capacity.
Setting a quota on the number of Kubernetes resources improves the stability of the OpenShift master servers by avoiding unbounded growth of the etcd database on the masters. Quotas on Kubernetes resources also avoid exhausting other finite software resources, such as IP addresses for services.
Similarly, setting a quota on the amount of compute resources avoids exhausting the compute capacity of any single node in the OpenShift cluster. It also prevents one application from consuming all cluster capacity and starving the other applications sharing the cluster.
OpenShift manages quotas on object counts and on compute resources with the ResourceQuota object, or simply quota.
A ResourceQuota object specifies hard resource usage limits for a project. All attributes of a quota are optional, meaning that any resource not restricted by the quota can be consumed without bounds.
Note: a project may contain multiple ResourceQuota objects, and their effect is cumulative, but two different ResourceQuota objects in the same project should not attempt to limit the same type of object or compute resource.
📑 Resources Restricted by ResourceQuota
The following table shows the main objects and compute resources that a ResourceQuota can restrict:
Object name | Description |
---|---|
pods | Total number of pods |
replicationcontrollers | Total number of replication controllers |
services | Total number of services |
secrets | Total number of secrets |
persistentvolumeclaims | Total number of persistent volume claims |
cpu | Total CPU usage across all containers |
memory | Total memory usage across all containers |
storage | Total disk usage across all containers |
Quota attributes can track either the resource requests or the resource limits of all pods in the project. By default, quota attributes track resource requests. To track resource limits instead, prefix the compute resource name with limits., for example limits.cpu.
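As a minimal sketch (the quota name here is hypothetical), a quota that caps resource limits rather than requests uses the prefixed names:

```yaml
# Hypothetical example: a quota that tracks resource *limits* instead of requests.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: limits-quota
spec:
  hard:
    limits.cpu: "2"      # sum of CPU limits over all pods may not exceed 2 cores
    limits.memory: 4Gi   # sum of memory limits may not exceed 4 GiB
```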
Example 1: a ResourceQuota resource defined in YAML syntax, specifying quotas for both object counts and compute resources:
$ cat dev-quota.yml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: dev-quota
spec:
  hard:
    services: "10"
    cpu: "1300m"
    memory: "1.5Gi"
$ oc create -f dev-quota.yml
Example 2: create the same quota with the oc create quota command:
$ oc create quota dev-quota \
--hard=services=10 \
--hard=cpu=1300m \
--hard=memory=1.5Gi
$ oc get resourcequota #list available quotas
$ oc describe resourcequota NAME #view usage statistics related to any limit defined in the quota
$ oc delete resourcequota compute-quota #delete an active quota by name
Tip: when oc describe resourcequota is run without arguments, it only shows the cumulative set of limits from all resourcequota objects in the project, without showing which object defines which limit.
When a quota is first created in a project, the project restricts the creation of any new resources that might violate a quota constraint, then recalculates resource usage. After the quota is created and the usage statistics are up to date, the project accepts the creation of new content. When a new resource is created, quota usage increases immediately. When a resource is deleted, quota usage decreases during the next full recalculation of the project's quota statistics.
A ResourceQuota applies to the whole project, but many OpenShift processes, such as builds and deployments, create pods inside the project and may fail because starting them would exceed the project quota.
If a modification to a project exceeds an object-count quota, the server rejects the operation and returns an error message to the user. If the modification exceeds a compute-resource quota, however, the operation does not fail immediately. OpenShift retries the operation several times, giving the administrator a chance to increase the quota or take corrective action, such as bringing a new node online or adding resources to a node.
Note: if a quota restricting a compute resource is set, OpenShift refuses to create pods that do not specify a resource request or resource limit for that compute resource.
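For reference, a pod (or a dc pod template) passes such a quota check only when its containers declare the restricted compute resource explicitly. A minimal, hypothetical container spec fragment:

```yaml
# Hypothetical pod spec fragment declaring cpu/memory explicitly,
# so that a compute-resource quota does not reject the pod.
spec:
  containers:
  - name: hello
    image: registry.lab.example.com/openshift/hello-openshift
    resources:
      requests:
        cpu: 250m      # counted against the quota's "cpu" attribute
        memory: 256Mi
      limits:
        cpu: 500m      # counted against "limits.cpu" if the quota tracks limits
        memory: 512Mi
```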
📑 Applying Limit Ranges
A LimitRange resource, also called a limit, defines the default, minimum, and maximum values for the compute resource requests and limits of a single pod or single container defined in a project. A pod's resource request or limit is the sum of those of its containers.
To understand the difference between a limit range and a resource quota: a limit range defines valid ranges and default values for a single pod, while a resource quota only defines top values for the sum over all pods in a project.
Limits and quotas for a project are usually defined together.
A LimitRange resource can also define default, minimum, and maximum values for the storage capacity of images, image streams, or persistent volume claims. If a resource added to the project does not provide a compute resource request, it takes the default value provided by the project's limit ranges. If a new resource provides compute resource requests or limits smaller than the minimum specified by the project's limit ranges, the resource is not created. Likewise, if a new resource provides compute resource requests or limits higher than the maximum specified by the project's limit ranges, the resource is not created.
Most of the standard S2I builder images and templates provided by OpenShift do not specify compute resource requests. To use templates and builders under a quota, the project needs to contain a limit range object that specifies default values for container resource requests.
The following table describes some of the compute resources that a LimitRange object can specify.
Type | Resource name | Description |
---|---|---|
Container | cpu | Minimum and maximum CPU allowed per container |
Container | memory | Minimum and maximum memory allowed per container |
Pod | cpu | Minimum and maximum CPU allowed across all containers in a pod |
Pod | memory | Minimum and maximum memory allowed across all containers in a pod |
Image | storage | Maximum size of an image that can be pushed to the internal registry |
PVC | storage | Minimum and maximum capacity of a persistent volume claim |
Example 1: a limit range defined in YAML:
$ cat dev-limits.yml
apiVersion: "v1"
kind: "LimitRange"
metadata:
  name: "dev-limits"
spec:
  limits:
  - type: "Pod"
    max:
      cpu: "2"
      memory: "1Gi"
    min:
      cpu: "200m"
      memory: "6Mi"
  - type: "Container"
    default:
      cpu: "1"
      memory: "512Mi"
$ oc create -f dev-limits.yml
$ oc describe limitranges NAME #view the limit constraints enforced in the project
$ oc get limits #list the limit ranges enforced in the project
$ oc delete limitranges NAME #delete an active limit range by name
Tip: OCP 3.9 does not support creating a limit range with oc create command-line arguments; use a definition file instead.
After a limit range is created in a project, all resource creation requests are evaluated against each limit range resource in the project. If the new resource violates the minimum or maximum constraint set by any limit range, the resource is rejected. If the new resource does not declare a configuration value and the constraint supports a default value, the default is applied to the new resource as its usage value.
All resource update requests are also evaluated against each limit range resource in the project; if the updated resource violates any constraint, the update is rejected.
Note: avoid setting LimitRange values too high, or ResourceQuota values too low. A LimitRange violation prevents the pod from being created, with a clear error message. A ResourceQuota violation prevents the pod from being scheduled, leaving it in the Pending state.
📑 Multiproject Quotas
A ClusterResourceQuota resource is created at the cluster level, in a way similar to a persistent volume, and specifies resource constraints that apply to multiple projects.
There are two ways to specify which projects are subject to a cluster resource quota:
- Using the openshift.io/requester annotation to define the project owner; all projects with that owner are subject to the quota.
- Using a selector; all projects whose labels match the selector are subject to the quota.
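The selector-based mode can also be expressed declaratively. The following is a sketch of a ClusterResourceQuota using a label selector; field names follow the quota.openshift.io/v1 API, so verify the apiVersion against your cluster release (older 3.x releases accept v1):

```yaml
# Sketch: cluster-level quota applied to every project labeled environment=qa.
apiVersion: quota.openshift.io/v1
kind: ClusterResourceQuota
metadata:
  name: env-qa
spec:
  selector:
    labels:
      matchLabels:
        environment: qa   # projects carrying this label share the quota below
  quota:
    hard:
      pods: "10"
      services: "5"
```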
Example 1:
$ oc create clusterquota user-qa \
--project-annotation-selector openshift.io/requester=qa \
--hard pods=12 \
--hard secrets=20 #create a cluster resource quota for all projects owned by the qa user
$ oc create clusterquota env-qa \
--project-label-selector environment=qa \
--hard pods=10 \
--hard services=5 #create a cluster resource quota for all projects with the environment=qa label
$ oc describe QUOTA NAME #view the cluster resource quotas applied to the current project
$ oc delete clusterquota NAME #delete a cluster resource quota
Tip: it is not recommended that a single cluster resource quota match more than 100 projects. This avoids large locking overhead: whenever a resource in a project is created or updated, the project is locked while all applicable resource quotas are searched, which is expensive.
📜 Lab Exercise
📑 Prerequisites
[student@workstation ~]$ lab install-prepare setup
[student@workstation ~]$ cd /home/student/do280-ansible
[student@workstation do280-ansible]$ ./install.sh
Tip: if you already have a complete environment, this step can be skipped.
📑 Exercise Setup
[student@workstation ~]$ lab monitor-limit setup
📑 View Current Resources
[student@workstation ~]$ oc login -u admin -p redhat https://master.lab.example.com
[student@workstation ~]$ oc describe node node1.lab.example.com | grep -A 4 Allocated
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
300m (15%) 0 (0%) 2605306368 (32%) 8250M (101%)
[student@workstation ~]$ oc describe node node2.lab.example.com | grep -A 4 Allocated
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
100m (5%) 0 (0%) 256Mi (3%) 0 (0%)
📑 Create an Application
[student@workstation ~]$ oc new-project resources
[student@workstation ~]$ oc new-app --name=hello \
--docker-image=registry.lab.example.com/openshift/hello-openshift
[student@workstation ~]$ oc get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE
hello-1-fl9x4 1/1 Running 0 15s 10.129.1.42 node2.lab.example.com
Verify that the resources allocated from the node have not changed. Check only the node running the hello pod, using the output of the previous step:
[student@workstation ~]$ oc describe node node2.lab.example.com | grep -A 4 Allocated
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
100m (5%) 0 (0%) 256Mi (3%) 0 (0%)
📑 Delete the Application
[student@workstation ~]$ oc delete all -l app=hello
deploymentconfig "hello" deleted
imagestream "hello" deleted
pod "hello-1-fl9x4" deleted
service "hello" deleted
📑 Add Resource Restrictions
As the cluster administrator, add a quota and a limit range to the project to provide default resource requests for the pods in the project.
[student@workstation ~]$ cat DO280/labs/monitor-limit/limits.yml
apiVersion: "v1"
kind: "LimitRange"
metadata:
  name: "project-limits"
spec:
  limits:
  - type: "Container"
    default:
      cpu: "250m"
[student@workstation ~]$ oc create -f DO280/labs/monitor-limit/limits.yml
limitrange "project-limits" created
[student@workstation ~]$ oc describe limitrange
Name: project-limits
Namespace: resources
Type Resource Min Max Default Request Default Limit Max Limit/Request Ratio
---- -------- --- --- --------------- ------------- -----------------------
Container cpu - - 250m 250m
[student@workstation ~]$ cat DO280/labs/monitor-limit/quota.yml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: project-quota
spec:
  hard:
    cpu: "900m"
[student@workstation ~]$ oc create -f DO280/labs/monitor-limit/quota.yml
resourcequota "project-quota" created
[student@workstation ~]$ oc describe quota
Name: project-quota
Namespace: resources
Resource Used Hard
-------- ---- ----
cpu 0 900m
📑 Grant Access to the Project
[student@workstation ~]$ oc adm policy add-role-to-user edit developer
role "edit" added: "developer"
📑 Verify the Resource Restrictions
[student@workstation ~]$ oc login -u developer -p redhat https://master.lab.example.com
[student@workstation ~]$ oc project resources
Already on project "resources" on server "https://master.lab.example.com:443".
[student@workstation ~]$ oc get limits
NAME AGE
project-limits 10m
[student@workstation ~]$ oc delete limits project-limits
#verify that the limit range is in effect; the developer user cannot delete it
Error from server (Forbidden): limitranges "project-limits" is forbidden: User "developer" cannot delete limitranges in the namespace "resources": User "developer" cannot delete limitranges in project "resources"
[student@workstation ~]$ oc get quota
NAME AGE
project-quota 10m
[student@workstation ~]$ oc delete quota project-quota
Error from server (Forbidden): resourcequotas "project-quota" is forbidden: User "developer" cannot delete resourcequotas in the namespace "resources": User "developer" cannot delete resourcequotas in project "resources"
📑 Create the Application Again
[student@workstation ~]$ oc new-app --name=hello \
--docker-image=registry.lab.example.com/openshift/hello-openshift
[student@workstation ~]$ oc get pod
NAME READY STATUS RESTARTS AGE
hello-1-deploy 1/1 Running 0 4s
📑 View the Quota
[student@workstation ~]$ oc describe quota
Name: project-quota
Namespace: resources
Resource Used Hard
-------- ---- ----
cpu 250m 900m
📑 View Available Node Resources
[student@workstation ~]$ oc login -u admin -p redhat
[student@workstation ~]$ oc get pod -o wide -n resources
NAME READY STATUS RESTARTS AGE IP NODE
hello-1-8hfdd 1/1 Running 0 1m 10.129.1.44 node2.lab.example.com
[student@workstation ~]$ oc describe node node2.lab.example.com | grep -A 4 Allocated
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
350m (17%) 250m (12%) 256Mi (3%) 0 (0%)
[student@workstation ~]$ oc describe pod hello-1-8hfdd | grep -A 2 Requests
Requests:
cpu: 250m
Environment: <none>
📑 Scale Up the Application
[student@workstation ~]$ oc scale dc hello --replicas=2 #scale up the application
[student@workstation ~]$ oc get pod #view the pods after scaling
NAME READY STATUS RESTARTS AGE
hello-1-8hfdd 1/1 Running 0 5m
hello-1-n57bg 1/1 Running 0 5s
[student@workstation ~]$ oc describe quota #view the quota after scaling
Name: project-quota
Namespace: resources
Resource Used Hard
-------- ---- ----
cpu 500m 900m
[student@workstation ~]$ oc scale dc hello --replicas=4 #continue scaling up to 4 replicas
[student@workstation ~]$ oc get pod #view the scaled pods
NAME READY STATUS RESTARTS AGE
hello-1-2g85l 1/1 Running 0 5s
hello-1-8hfdd 1/1 Running 0 6m
hello-1-n57bg 1/1 Running 0 1m
[student@workstation ~]$ oc describe dc hello | grep Replicas #check the replica counts
Replicas: 4
Replicas: 3 current / 4 desired
[student@workstation ~]$ oc get events | grep -i error
1m 1m 1 hello-1.166d5116f4df266c ReplicationController Warning FailedCreate replication-controller Error creating: pods "hello-1-ndjw7" is forbidden: exceeded quota: project-quota, requested: cpu=250m, used: cpu=750m, limited: cpu=900m
1m 1m 1 hello-1.166d5116f65c6908 ReplicationController Warning FailedCreate replication-controller Error creating: pods "hello-1-gm44g" is forbidden: exceeded quota: project-quota, requested: cpu=250m, used: cpu=750m, limited: cpu=900m
1m 1m 1 hello-1.166d5116f746f1ed ReplicationController Warning FailedCreate replication-controller Error creating: pods "hello-1-2kr2d" is forbidden: exceeded quota: project-quota, requested: cpu=250m, used: cpu=750m, limited: cpu=900m
1m 1m 1 hello-1.166d5116f7e591e0 ReplicationController Warning FailedCreate replication-controller Error creating: pods "hello-1-l9t5w" is forbidden: exceeded quota: project-quota, requested: cpu=250m, used: cpu=750m, limited: cpu=900m
1m 1m 1 hello-1.166d5116fa9e8169 ReplicationController Warning FailedCreate replication-controller Error creating: pods "hello-1-9hv58" is forbidden: exceeded quota: project-quota, requested: cpu=250m, used: cpu=750m, limited: cpu=900m
1m 1m 1 hello-1.166d5116ffe7f28e ReplicationController Warning FailedCreate replication-controller Error creating: pods "hello-1-ldp5d" is forbidden: exceeded quota: project-quota, requested: cpu=250m, used: cpu=750m, limited: cpu=900m
1m 1m 1 hello-1.166d511709c538bd ReplicationController Warning FailedCreate replication-controller Error creating: pods "hello-1-qbklb" is forbidden: exceeded quota: project-quota, requested: cpu=250m, used: cpu=750m, limited: cpu=900m
1m 1m 1 hello-1.166d51171d434fb1 ReplicationController Warning FailedCreate replication-controller Error creating: pods "hello-1-m9th2" is forbidden: exceeded quota: project-quota, requested: cpu=250m, used: cpu=750m, limited: cpu=900m
1m 1m 1 hello-1.166d511743c4b981 ReplicationController Warning FailedCreate replication-controller Error creating: pods "hello-1-qkmlm" is forbidden: exceeded quota: project-quota, requested: cpu=250m, used: cpu=750m, limited: cpu=900m
41s 1m 6 hello-1.166d5117907668dd ReplicationController Warning FailedCreate replication-controller (combined from similar events): Error creating: pods "hello-1-9t64x" is forbidden: exceeded quota: project-quota, requested: cpu=250m, used: cpu=750m, limited: cpu=900m
Conclusion: because the quota would be exceeded, the replication controller reports that it cannot create the fourth pod.
📑 Add a Resource Request
[student@workstation ~]$ oc scale dc hello --replicas=1
[student@workstation ~]$ oc get pod
NAME READY STATUS RESTARTS AGE
hello-1-8hfdd 1/1 Running 0 28m
hello-1-n57bg 0/1 Terminating 0 23m
[student@workstation ~]$ oc get pod
NAME READY STATUS RESTARTS AGE
hello-1-8hfdd 1/1 Running 0 28m
[student@workstation ~]$ oc set resources dc hello --requests=memory=256Mi #set a resource request
[student@workstation ~]$ oc get pod
NAME READY STATUS RESTARTS AGE
hello-1-8hfdd 0/1 Terminating 0 36m
hello-2-nd52c 1/1 Running 0 10s
[student@workstation ~]$ oc get pod
NAME READY STATUS RESTARTS AGE
hello-2-nd52c 1/1 Running 0 13s
[student@workstation ~]$ oc describe pod hello-2-nd52c | grep -A 3 Requests
Requests:
cpu: 250m
memory: 256Mi
Environment: <none>
[student@workstation ~]$ oc describe quota #view the quota
Name: project-quota
Namespace: resources
Resource Used Hard
-------- ---- ----
cpu 250m 900m
Conclusion: as the output shows, nothing has changed from the project quota's point of view, because the quota only restricts CPU, not memory.
📑 Increase the Resource Request
[student@workstation ~]$ oc set resources dc hello --requests=memory=8Gi
deploymentconfig "hello" resource requirements updated
[student@workstation ~]$ oc get pod
NAME READY STATUS RESTARTS AGE
hello-2-nd52c 1/1 Running 0 13m
hello-3-5xnmp 0/1 Pending 0 9s
hello-3-deploy 1/1 Running 0 14s
[student@workstation ~]$ oc logs hello-3-deploy
--> Scaling up hello-3 from 0 to 1, scaling down hello-2 from 1 to 0 (keep 1 pods available, don't exceed 2 pods)
Scaling hello-3 up to 1
[student@workstation ~]$ oc status
In project resources on server https://master.lab.example.com:443
svc/hello - 172.30.192.95 ports 8080, 8888
dc/hello deploys istag/hello:latest
deployment #3 running for about a minute - 0/1 pods
deployment #2 deployed 14 minutes ago - 1 pod
deployment #1 deployed about an hour ago
3 infos identified, use 'oc status -v' to see details.
[student@workstation ~]$ oc get events | grep hello-3.
3s 3m 15 hello-3-5xnmp.166d5367c0dab13a Pod Warning FailedScheduling default-scheduler 0/3 nodes are available: 1 MatchNodeSelector, 3 Insufficient memory.
Conclusion: because the resource request exceeds what any node can provide, a warning is eventually shown stating that the pod cannot be scheduled on any node due to insufficient memory.
📑 Clean Up the Project
[student@workstation ~]$ oc login -u admin -p redhat
[student@workstation ~]$ oc delete project resources
project "resources" deleted
💡 Summary
The RHCA certification requires studying for and passing 5 exams, which takes quite a lot of time for learning and preparation. Keep at it, you can do it 🤪.
That was 金鱼哥's overview and walkthrough of Chapter 9, Managing and Monitoring the OpenShift Platform: Resource Limits. I hope it helps everyone who reads this article.
💾 Red Hat certification column series:
RHCSA column: 戏说 RHCSA 认证
RHCE column: 戏说 RHCE 认证
This article is part of the RHCA column: RHCA 回忆录
If this article helped you, please give 金鱼哥 a like 👍. Writing is not easy, and compared with the official phrasing, I prefer to explain each topic in plain, easy-to-understand language.
If you are interested in operations technology, feel free to follow ❤️❤️❤️ 金鱼哥 ❤️❤️❤️ for plenty of gains and surprises 💕💕!