Cloud Native | Kubernetes | Recovering a Lost etcd Cluster Node: Re-adding an etcd Member, Scaling the Cluster, and Re-initializing a k8s Master Node


Preface

Four VMs installed under VMware, with IPs 192.168.217.19/20/21/22, run a highly available Kubernetes cluster deployed with kubeadm. The cluster uses an external etcd cluster deployed on 19, 20 and 21; the masters are also 19, 20 and 21, and 22 is the worker node.

The detailed configuration and installation steps are in the previous article: 云原生|kubernetes|kubeadm部署高可用集群(二)---kube-apiserver高可用+etcd外部集群+haproxy+keepalived_晚风_END的博客-CSDN博客

Due to a mishap — not exactly a mis-operation, just a badly used ionice — server 19 died completely. Snapshots of the VM do exist, but instead of rolling back to one, let's see how to bring this lost master node back.

Check on server 20:

[root@master2 ~]# kubectl get po -A -owide
NAMESPACE     NAME                                       READY   STATUS      RESTARTS       AGE     IP               NODE      NOMINATED NODE   READINESS GATES
kube-system   calico-kube-controllers-796cc7f49d-5fz47   1/1     Running     3 (88m ago)    2d5h    10.244.166.146   node1     <none>           <none>
kube-system   calico-node-49qd6                          1/1     Running     4 (29h ago)    2d5h    192.168.217.19   master1   <none>           <none>
kube-system   calico-node-l9kbj                          1/1     Running     3 (88m ago)    2d5h    192.168.217.21   master3   <none>           <none>
kube-system   calico-node-nsknc                          1/1     Running     3 (88m ago)    2d5h    192.168.217.20   master2   <none>           <none>
kube-system   calico-node-pd8v2                          1/1     Running     6 (88m ago)    2d5h    192.168.217.22   node1     <none>           <none>
kube-system   coredns-7f6cbbb7b8-7c85v                   1/1     Running     15 (88m ago)   5d16h   10.244.166.143   node1     <none>           <none>
kube-system   coredns-7f6cbbb7b8-h9wtb                   1/1     Running     15 (88m ago)   5d16h   10.244.166.144   node1     <none>           <none>
kube-system   kube-apiserver-master1                     1/1     Running     19 (29h ago)   5d18h   192.168.217.19   master1   <none>           <none>
kube-system   kube-apiserver-master2                     1/1     Running     1 (88m ago)    16h     192.168.217.20   master2   <none>           <none>
kube-system   kube-apiserver-master3                     1/1     Running     1 (88m ago)    16h     192.168.217.21   master3   <none>           <none>
kube-system   kube-controller-manager-master1            1/1     Running     14             4d2h    192.168.217.19   master1   <none>           <none>
kube-system   kube-controller-manager-master2            1/1     Running     12 (88m ago)   4d4h    192.168.217.20   master2   <none>           <none>
kube-system   kube-controller-manager-master3            1/1     Running     13 (88m ago)   4d4h    192.168.217.21   master3   <none>           <none>
kube-system   kube-proxy-69w6c                           1/1     Running     2 (29h ago)    2d5h    192.168.217.19   master1   <none>           <none>
kube-system   kube-proxy-vtz99                           1/1     Running     4 (88m ago)    2d6h    192.168.217.22   node1     <none>           <none>
kube-system   kube-proxy-wldcc                           1/1     Running     4 (88m ago)    2d6h    192.168.217.21   master3   <none>           <none>
kube-system   kube-proxy-x6w6l                           1/1     Running     4 (88m ago)    2d6h    192.168.217.20   master2   <none>           <none>
kube-system   kube-scheduler-master1                     1/1     Running     11 (29h ago)   4d2h    192.168.217.19   master1   <none>           <none>
kube-system   kube-scheduler-master2                     1/1     Running     11 (88m ago)   4d4h    192.168.217.20   master2   <none>           <none>
kube-system   kube-scheduler-master3                     1/1     Running     10 (88m ago)   4d4h    192.168.217.21   master3   <none>           <none>
kube-system   metrics-server-55b9b69769-j9c7j            1/1     Running     1 (88m ago)    16h     10.244.166.147   node1     <none>           <none>

Check the etcd status:

First, set up an alias (named etct_serch here) for etcdctl with all the certificate options attached (on server 20):

alias etct_serch="ETCDCTL_API=3 \
/opt/etcd/bin/etcdctl \
--endpoints=https://192.168.217.19:2379,https://192.168.217.20:2379,https://192.168.217.21:2379 \
--cacert=/opt/etcd/ssl/ca.pem \
--cert=/opt/etcd/ssl/server.pem \
--key=/opt/etcd/ssl/server-key.pem"

The failed etcd node 19 has already been removed from the member list; checking the cluster status still shows an error for its endpoint:

[root@master2 ~]# etct_serch member list -w table
+------------------+---------+--------+-----------------------------+-----------------------------+------------+
|        ID        | STATUS  |  NAME  |         PEER ADDRS          |        CLIENT ADDRS         | IS LEARNER |
+------------------+---------+--------+-----------------------------+-----------------------------+------------+
| ef2fee107aafca91 | started | etcd-2 | https://192.168.217.20:2380 | https://192.168.217.20:2379 |      false |
| f5b8cb45a0dcf520 | started | etcd-3 | https://192.168.217.21:2380 | https://192.168.217.21:2379 |      false |
+------------------+---------+--------+-----------------------------+-----------------------------+------------+
[root@master2 ~]# etct_serch endpoint status -w table
{"level":"warn","ts":"2022-11-01T17:13:39.004+0800","caller":"clientv3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"passthrough:///https://192.168.217.19:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: connection error: desc = \"transport: Error while dialing dial tcp 192.168.217.19:2379: connect: no route to host\""}
Failed to get the status of endpoint https://192.168.217.19:2379 (context deadline exceeded)
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|          ENDPOINT           |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://192.168.217.20:2379 | ef2fee107aafca91 |   3.4.9 |  4.0 MB |      true |      false |       229 |     453336 |             453336 |        |
| https://192.168.217.21:2379 | f5b8cb45a0dcf520 |   3.4.9 |  4.0 MB |     false |      false |       229 |     453336 |             453336 |        |
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

etcd configuration file (the working config on server 20):

[root@master2 ~]# cat  /opt/etcd/cfg/etcd.conf 
#[Member]
ETCD_NAME="etcd-2"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.217.20:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.217.20:2379"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.217.20:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.217.20:2379"
ETCD_INITIAL_CLUSTER="etcd-1=https://192.168.217.19:2380,etcd-2=https://192.168.217.20:2380,etcd-3=https://192.168.217.21:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"

etcd systemd unit (the working unit file on server 20):

[root@master2 ~]# cat /usr/lib/systemd/system/etcd.service 
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
EnvironmentFile=/opt/etcd/cfg/etcd.conf
ExecStart=/opt/etcd/bin/etcd \
        --cert-file=/opt/etcd/ssl/server.pem \
        --key-file=/opt/etcd/ssl/server-key.pem \
        --peer-cert-file=/opt/etcd/ssl/server.pem \
        --peer-key-file=/opt/etcd/ssl/server-key.pem \
        --trusted-ca-file=/opt/etcd/ssl/ca.pem \
        --peer-trusted-ca-file=/opt/etcd/ssl/ca.pem \
        --wal-dir=/var/lib/etcd \
        --snapshot-count=50000 \
        --auto-compaction-retention=1 \
        --auto-compaction-mode=periodic \
        --max-request-bytes=10485760 \
        --quota-backend-bytes=8589934592 \
        --heartbeat-interval=500 \
        --election-timeout=1000
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target

A few notes on the tuning flags (the inline comments have been moved out of the unit file, since systemd does not support comments after a line continuation, and shell arithmetic such as $((10*1024*1024)) is not expanded by systemd, so the byte values are written out):

--wal-dir: directory for the write-ahead log (WAL).
--snapshot-count: number of committed transactions that triggers a snapshot to disk and releases the WAL; the default is 100000.
--auto-compaction-mode=periodic with --auto-compaction-retention=1: periodic compaction, with an initial compaction period of 1 hour; subsequent compactions run at 10% of that period, i.e. roughly every 6 minutes.
--max-request-bytes: maximum request size; a single key defaults to about 1.5 MB and the official recommendation is at most 10 MB (10*1024*1024 bytes here).
--quota-backend-bytes: backend database quota, 8 GiB here (8*1024*1024*1024 bytes).

Part I: Recovery plan

From the information above, the etcd cluster needs to be restored to three members. The original node 19 will be rebuilt as a fresh VM with its original IP, so the etcd certificates do not need to be regenerated; the node only has to rejoin the existing etcd cluster.

Because the IP is unchanged and etcd is an external cluster, the second step is simple: once the etcd cluster has recovered, run kubeadm reset on each node and have every node rejoin, and the whole Kubernetes cluster is restored.

Part II: Recovering the etcd cluster

1. Remove the failed member

The removal command is as follows (the argument is the member ID reported by etcdctl member list):

etct_serch member remove 97c1c1003e0d4bf

2. Copy the configuration and binaries to node 19

Copy the etcd configuration and program files from a healthy node to node 19 (/opt/etcd contains the cluster certificates, the two executables and the main config file; the config file will be modified in the next step):

scp -r /opt/etcd  192.168.217.19:/opt/
scp /usr/lib/systemd/system/etcd.service 192.168.217.19:/usr/lib/systemd/system/

3. Modify the etcd config file on 19

The changes are the member name, the IPs in the listen/advertise URLs, and ETCD_INITIAL_CLUSTER_STATE, which must be set to "existing"; compare with the config file shown earlier:

[root@master ~]# cat !$
cat /opt/etcd/cfg/etcd.conf
#[Member]
ETCD_NAME="etcd-1"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.217.19:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.217.19:2379"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.217.19:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.217.19:2379"
ETCD_INITIAL_CLUSTER="etcd-1=https://192.168.217.19:2380,etcd-2=https://192.168.217.20:2380,etcd-3=https://192.168.217.21:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="existing"
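
If you prefer not to edit the file by hand, the same changes can be scripted. A minimal sed sketch (run on 19, assuming the file was copied unmodified from 20 as above):

sed -i \
  -e 's/^ETCD_NAME=.*/ETCD_NAME="etcd-1"/' \
  -e '/^ETCD_LISTEN_PEER_URLS=/s/217\.20/217.19/' \
  -e '/^ETCD_LISTEN_CLIENT_URLS=/s/217\.20/217.19/' \
  -e '/^ETCD_INITIAL_ADVERTISE_PEER_URLS=/s/217\.20/217.19/' \
  -e '/^ETCD_ADVERTISE_CLIENT_URLS=/s/217\.20/217.19/' \
  -e 's/^ETCD_INITIAL_CLUSTER_STATE=.*/ETCD_INITIAL_CLUSTER_STATE="existing"/' \
  /opt/etcd/cfg/etcd.conf

Note that ETCD_INITIAL_CLUSTER itself must stay untouched — it still lists all three members.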

4. Add the new member on a healthy node

Run the member add command on server 20 — this step is critical; without it the rebuilt node will not be accepted into the cluster:

etct_serch member add etcd-1 --peer-urls=https://192.168.217.19:2380
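
etcdctl prints the cluster-membership variables the new node should start with. The output looks roughly like the following (illustrative — the member and cluster IDs here are taken from the logs further below), and it should match the config file edited in step 3:

Member 3d70d11f824a5d8f added to cluster b459890bbabfc99f

ETCD_NAME="etcd-1"
ETCD_INITIAL_CLUSTER="etcd-1=https://192.168.217.19:2380,etcd-2=https://192.168.217.20:2380,etcd-3=https://192.168.217.21:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.217.19:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"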

5. Restart etcd

Restart the etcd service on all three nodes:

systemctl daemon-reload  && systemctl restart etcd
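
On the rebuilt 19 the same command performs the first start of the new member. A small sketch, assuming the default data directory from the config above — for a newly added member it must be empty (or not yet exist), otherwise etcd may refuse to join:

# on 19 only
ls /var/lib/etcd 2>/dev/null          # should be empty or missing for a brand-new member
systemctl daemon-reload && systemctl enable --now etcd
systemctl status etcd --no-pager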

6. Logs of the etcd cluster recovery

The system log on server 20 shows the new member being added and the subsequent leader election. The relevant entries are excerpted below (the ID starting with 3d is the new node, i.e. 192.168.217.19):

Nov  1 22:41:47 master2 etcd: raft2022/11/01 22:41:47 INFO: ef2fee107aafca91 switched to configuration voters=(4427268366965300623 17235256053515405969 17706125434919122208)
Nov  1 22:41:47 master2 etcd: added member 3d70d11f824a5d8f [https://192.168.217.19:2380] to cluster b459890bbabfc99f
Nov  1 22:41:47 master2 etcd: starting peer 3d70d11f824a5d8f...
Nov  1 22:41:47 master2 etcd: started HTTP pipelining with peer 3d70d11f824a5d8f
Nov  1 22:41:47 master2 etcd: started streaming with peer 3d70d11f824a5d8f (writer)
Nov  1 22:41:47 master2 etcd: started streaming with peer 3d70d11f824a5d8f (writer)
Nov  1 22:41:47 master2 etcd: started peer 3d70d11f824a5d8f
Nov  1 22:41:47 master2 etcd: added peer 3d70d11f824a5d8f
Nov  1 22:41:47 master2 etcd: started streaming with peer 3d70d11f824a5d8f (stream Message reader)
Nov  1 22:41:47 master2 etcd: started streaming with peer 3d70d11f824a5d8f (stream MsgApp v2 reader)
Nov  1 22:41:47 master2 etcd: peer 3d70d11f824a5d8f became active
Nov  1 22:41:47 master2 etcd: established a TCP streaming connection with peer 3d70d11f824a5d8f (stream Message reader)
Nov  1 22:41:47 master2 etcd: established a TCP streaming connection with peer 3d70d11f824a5d8f (stream MsgApp v2 reader)
Nov  1 22:41:47 master2 etcd: ef2fee107aafca91 initialized peer connection; fast-forwarding 8 ticks (election ticks 10) with 2 active peer(s)
Nov  1 22:41:47 master2 etcd: established a TCP streaming connection with peer 3d70d11f824a5d8f (stream Message writer)
Nov  1 22:41:47 master2 etcd: established a TCP streaming connection with peer 3d70d11f824a5d8f (stream MsgApp v2 writer)
Nov  1 22:41:47 master2 etcd: published {Name:etcd-2 ClientURLs:[https://192.168.217.20:2379]} to cluster b459890bbabfc99f
Nov  1 22:41:47 master2 etcd: ready to serve client requests
Nov  1 22:41:47 master2 systemd: Started Etcd Server.
Nov  1 22:42:26 master2 etcd: raft2022/11/01 22:42:26 INFO: ef2fee107aafca91 [term 639] received MsgTimeoutNow from 3d70d11f824a5d8f and starts an election to get leadership.
Nov  1 22:42:26 master2 etcd: raft2022/11/01 22:42:26 INFO: ef2fee107aafca91 became candidate at term 640
Nov  1 22:42:26 master2 etcd: raft2022/11/01 22:42:26 INFO: ef2fee107aafca91 received MsgVoteResp from ef2fee107aafca91 at term 640
Nov  1 22:42:26 master2 etcd: raft2022/11/01 22:42:26 INFO: ef2fee107aafca91 [logterm: 639, index: 492331] sent MsgVote request to 3d70d11f824a5d8f at term 640
Nov  1 22:42:26 master2 etcd: raft2022/11/01 22:42:26 INFO: ef2fee107aafca91 [logterm: 639, index: 492331] sent MsgVote request to f5b8cb45a0dcf520 at term 640
Nov  1 22:42:26 master2 etcd: raft2022/11/01 22:42:26 INFO: raft.node: ef2fee107aafca91 lost leader 3d70d11f824a5d8f at term 640
Nov  1 22:42:26 master2 etcd: raft2022/11/01 22:42:26 INFO: ef2fee107aafca91 received MsgVoteResp from 3d70d11f824a5d8f at term 640
Nov  1 22:42:26 master2 etcd: raft2022/11/01 22:42:26 INFO: ef2fee107aafca91 has received 2 MsgVoteResp votes and 0 vote rejections
Nov  1 22:42:26 master2 etcd: raft2022/11/01 22:42:26 INFO: ef2fee107aafca91 became leader at term 640
Nov  1 22:42:26 master2 etcd: raft2022/11/01 22:42:26 INFO: raft.node: ef2fee107aafca91 elected leader ef2fee107aafca91 at term 640
Nov  1 22:42:26 master2 etcd: lost the TCP streaming connection with peer 3d70d11f824a5d8f (stream MsgApp v2 reader)
Nov  1 22:42:26 master2 etcd: lost the TCP streaming connection with peer 3d70d11f824a5d8f (stream Message reader)

 

7. Testing the etcd cluster

Server 20 is now the etcd leader, which is also reflected in the log above.

[root@master2 ~]# etct_serch member list -w table
+------------------+---------+--------+-----------------------------+-----------------------------+------------+
|        ID        | STATUS  |  NAME  |         PEER ADDRS          |        CLIENT ADDRS         | IS LEARNER |
+------------------+---------+--------+-----------------------------+-----------------------------+------------+
| 3d70d11f824a5d8f | started | etcd-1 | https://192.168.217.19:2380 | https://192.168.217.19:2379 |      false |
| ef2fee107aafca91 | started | etcd-2 | https://192.168.217.20:2380 | https://192.168.217.20:2379 |      false |
| f5b8cb45a0dcf520 | started | etcd-3 | https://192.168.217.21:2380 | https://192.168.217.21:2379 |      false |
+------------------+---------+--------+-----------------------------+-----------------------------+------------+
[root@master2 ~]# etct_serch endpoint status  -w table
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|          ENDPOINT           |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://192.168.217.19:2379 | 3d70d11f824a5d8f |   3.4.9 |  4.0 MB |     false |      false |       640 |     494999 |             494999 |        |
| https://192.168.217.20:2379 | ef2fee107aafca91 |   3.4.9 |  4.1 MB |      true |      false |       640 |     495000 |             495000 |        |
| https://192.168.217.21:2379 | f5b8cb45a0dcf520 |   3.4.9 |  4.0 MB |     false |      false |       640 |     495000 |             495000 |        |
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
[root@master2 ~]# etct_serch endpoint health  -w table
+-----------------------------+--------+-------------+-------+
|          ENDPOINT           | HEALTH |    TOOK     | ERROR |
+-----------------------------+--------+-------------+-------+
| https://192.168.217.19:2379 |   true | 26.210272ms |       |
| https://192.168.217.20:2379 |   true | 26.710558ms |       |
| https://192.168.217.21:2379 |   true | 27.903774ms |       |
+-----------------------------+--------+-------------+-------+
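
As an extra sanity check, a quick write/read round-trip (an illustrative key and value, deleted afterwards) confirms that writes are accepted and replicated:

etct_serch put /test/recovery ok
etct_serch get /test/recovery
etct_serch del /test/recovery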

Part III: Recovering the Kubernetes nodes

Since master1 of the three masters is the one that died, haproxy and keepalived need to be reinstalled on it. Nothing special here: copy the config files from the healthy node 20 and adjust them (a sketch follows).
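
A minimal sketch of that step (run on the rebuilt 19; the paths are the standard package defaults, and the exact keepalived adjustments depend on the original deployment described in the linked article):

yum install -y haproxy keepalived
scp 192.168.217.20:/etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg
scp 192.168.217.20:/etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf
# adjust state/priority/interface in keepalived.conf for this node, then:
systemctl enable --now haproxy keepalived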

Likewise, reinstall kubelet, kubeadm and kubectl on 19 (see the sketch below).
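
A sketch of the reinstall, assuming the same yum repository as the other nodes and the cluster's Kubernetes version (1.22.2, per the init config below):

yum install -y kubelet-1.22.2 kubeadm-1.22.2 kubectl-1.22.2
systemctl enable kubelet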

Then reset all four nodes with kubeadm reset -f and re-initialize the cluster. The init config file is shown below (the link to the full deployment guide is at the beginning of this article, so those steps are not repeated here):

[root@master ~]# cat kubeadm-init-ha.yaml 
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: "0"
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.217.19
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  imagePullPolicy: IfNotPresent
  name: master1
  taints: null
---
controlPlaneEndpoint: "192.168.217.100"
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  external:
    endpoints:     # addresses of the external etcd cluster
    - https://192.168.217.19:2379
    - https://192.168.217.20:2379
    - https://192.168.217.21:2379
    caFile: /etc/kubernetes/pki/etcd/ca.pem
    certFile: /etc/kubernetes/pki/etcd/apiserver-etcd-client.pem
    keyFile: /etc/kubernetes/pki/etcd/apiserver-etcd-client-key.pem
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.22.2
networking:
  dnsDomain: cluster.local
  podSubnet: "10.244.0.0/16"
  serviceSubnet: "10.96.0.0/12"
scheduler: {}
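
One detail worth remembering: the etcd client certificate paths referenced above must already exist on the rebuilt 19 before kubeadm init runs. A minimal sketch of copying them from the healthy node 20 (paths exactly as in the config):

# run on 20
ssh 192.168.217.19 "mkdir -p /etc/kubernetes/pki/etcd"
scp /etc/kubernetes/pki/etcd/ca.pem \
    /etc/kubernetes/pki/etcd/apiserver-etcd-client.pem \
    /etc/kubernetes/pki/etcd/apiserver-etcd-client-key.pem \
    192.168.217.19:/etc/kubernetes/pki/etcd/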

The init command (run on server 19, since that is the node the init config describes):

kubeadm init --config=kubeadm-init-ha.yaml --upload-certs
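
After the init finishes, the other nodes rejoin using the join commands that kubeadm init prints. They have roughly the shape below (the hash and certificate key are placeholders — use the values printed by your own init run; the token comes from the config above, and the control-plane endpoint 192.168.217.100 defaults to port 6443):

# control-plane nodes 20 and 21
kubeadm join 192.168.217.100:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:<hash> \
    --control-plane --certificate-key <certificate-key>

# worker node 22
kubeadm join 192.168.217.100:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:<hash>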

Because the cluster uses an external etcd cluster, node recovery is fairly straightforward — reinstall the network plugin and the other add-ons and you are done. The result is shown below.

Note that some pods, such as kube-apiserver, now show a refreshed age, while pods like kube-proxy keep their original age — again because the cluster state lives in the external etcd cluster.

[root@master ~]# kubectl get po -A -owide
NAMESPACE     NAME                                       READY   STATUS    RESTARTS        AGE     IP               NODE      NOMINATED NODE   READINESS GATES
kube-system   calico-kube-controllers-796cc7f49d-k586w   1/1     Running   0               153m    10.244.166.131   node1     <none>           <none>
kube-system   calico-node-7x86d                          1/1     Running   0               153m    192.168.217.21   master3   <none>           <none>
kube-system   calico-node-dhxcq                          1/1     Running   0               153m    192.168.217.19   master1   <none>           <none>
kube-system   calico-node-jcq6p                          1/1     Running   0               153m    192.168.217.20   master2   <none>           <none>
kube-system   calico-node-vjtv6                          1/1     Running   0               153m    192.168.217.22   node1     <none>           <none>
kube-system   coredns-7f6cbbb7b8-7c85v                   1/1     Running   16              5d23h   10.244.166.129   node1     <none>           <none>
kube-system   coredns-7f6cbbb7b8-7xm62                   1/1     Running   0               152m    10.244.166.132   node1     <none>           <none>
kube-system   kube-apiserver-master1                     1/1     Running   0               107m    192.168.217.19   master1   <none>           <none>
kube-system   kube-apiserver-master2                     1/1     Running   0               108m    192.168.217.20   master2   <none>           <none>
kube-system   kube-apiserver-master3                     1/1     Running   1 (107m ago)    107m    192.168.217.21   master3   <none>           <none>
kube-system   kube-controller-manager-master1            1/1     Running   5 (108m ago)    3h26m   192.168.217.19   master1   <none>           <none>
kube-system   kube-controller-manager-master2            1/1     Running   15 (108m ago)   4d10h   192.168.217.20   master2   <none>           <none>
kube-system   kube-controller-manager-master3            1/1     Running   15              4d10h   192.168.217.21   master3   <none>           <none>
kube-system   kube-proxy-69w6c                           1/1     Running   2 (131m ago)    2d12h   192.168.217.19   master1   <none>           <none>
kube-system   kube-proxy-vtz99                           1/1     Running   5               2d12h   192.168.217.22   node1     <none>           <none>
kube-system   kube-proxy-wldcc                           1/1     Running   5               2d12h   192.168.217.21   master3   <none>           <none>
kube-system   kube-proxy-x6w6l                           1/1     Running   5               2d12h   192.168.217.20   master2   <none>           <none>
kube-system   kube-scheduler-master1                     1/1     Running   4 (108m ago)    3h26m   192.168.217.19   master1   <none>           <none>
kube-system   kube-scheduler-master2                     1/1     Running   14 (108m ago)   4d10h   192.168.217.20   master2   <none>           <none>
kube-system   kube-scheduler-master3                     1/1     Running   13 (46m ago)    4d10h   192.168.217.21   master3   <none>           <none>
kube-system   metrics-server-55b9b69769-j9c7j            1/1     Running   13 (127m ago)   23h     10.244.166.130   node1     <none>           <none>
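
As a final check, confirm that all four nodes have rejoined and report Ready (output omitted here):

kubectl get nodes -o wide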

