1 Checking and upgrading the Docker environment
PolarDB-X deployment requires a Docker version of at least 1.19.3. Check the current version:
[root@my_ob ~]# docker version
Client:
Version: 1.13.1
Server:
Version: 1.13.1
API version: 1.26 (minimum version 1.12)
Package version: docker-1.13.1-209.git7d71120.el7.centos.x86_64
Go version: go1.10.3
Git commit: 7d71120/1.13.1
Built: Wed Mar 2 15:25:43 2022
OS/Arch: linux/amd64
Experimental: false
The current version is 1.13.1, so an upgrade past 1.19.3 is needed. Upgrading Docker amounts to removing the old version and installing a new one. Run the following command to remove the old version:
[root@my_ob run]# yum remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-engine
Install with yum. First add the docker-ce yum repository:
[root@my_ob run]# yum-config-manager \
--add-repo \
http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
Loaded plugins: fastestmirror, langpacks, product-id, subscription-manager
This system is not registered with an entitlement server. You can use subscription-manager to register.
adding repo from: http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
grabbing file http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo to /etc/yum.repos.d/docker-ce.repo
repo saved to /etc/yum.repos.d/docker-ce.repo
With the yum repository configured, Docker can be installed. Run:
[root@my_ob run]# sudo yum -y install docker-ce docker-ce-cli containerd.io
Installed:
containerd.io.x86_64 0:1.6.6-3.1.el7 docker-ce.x86_64 3:20.10.17-3.el7 docker-ce-cli.x86_64 1:20.10.17-3.el7
Dependency Installed:
docker-ce-rootless-extras.x86_64 0:20.10.17-3.el7 docker-scan-plugin.x86_64 0:0.17.0-3.el7
After the installation succeeds, check the Docker version:
[root@my_ob run]# docker -v
Docker version 20.10.17, build 100c701
Docker has been upgraded to the latest packaged version.
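Because Docker's versioning scheme changed over the years (1.13 jumped to 17.x and later 20.x), a plain string comparison of version numbers is unreliable; `sort -V` compares them correctly. A minimal sketch of an automated check (the minimum here follows the requirement stated above; on a real host the installed version would come from `docker version --format '{{.Server.Version}}'`):

```shell
# Compare the installed Docker version against a required minimum using
# version-aware sorting (sort -V). If the smaller of the two values is
# the requirement itself, the installed version is new enough.
REQUIRED="1.19.3"
INSTALLED="20.10.17"   # on a real host: docker version --format '{{.Server.Version}}'
if [ "$(printf '%s\n' "$REQUIRED" "$INSTALLED" | sort -V | head -n1)" = "$REQUIRED" ]; then
  echo "Docker $INSTALLED meets the minimum $REQUIRED"
else
  echo "Docker $INSTALLED is too old, need >= $REQUIRED"
fi
```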
2 Installing Helm
First download the Helm binary package:
[root@my_ob ~]# wget https://get.helm.sh/helm-v3.7.1-linux-amd64.tar.gz
--2022-05-31 15:02:18-- https://get.helm.sh/helm-v3.7.1-linux-amd64.tar.gz
Resolving get.helm.sh (get.helm.sh)... 2606:2800:247:1cb7:261b:1f9c:2074:3c, 152.199.39.108
Connecting to get.helm.sh (get.helm.sh)|2606:2800:247:1cb7:261b:1f9c:2074:3c|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13862382 (13M) [application/x-tar]
Saving to: ‘helm-v3.7.1-linux-amd64.tar.gz’
100%[==============================================================================>] 13,862,382 379KB/s in 38s
2022-05-31 15:02:59 (358 KB/s) - ‘helm-v3.7.1-linux-amd64.tar.gz’ saved [13862382/13862382]
Extract the archive, enter the extracted directory, and move the helm binary to /usr/local/bin:
[root@my_ob ~]# tar -xvf helm-v3.7.1-linux-amd64.tar.gz -C /usr/local
[root@my_ob ~]# cd /usr/local
[root@my_ob local]# cd linux-amd64
[root@my_ob linux-amd64]# ls
[root@my_ob linux-amd64]# mv helm ../bin
Check the Helm version:
[root@my_ob local]# helm version
version.BuildInfo{Version:"v3.7.1", GitCommit:"1d11fcb5d3f3bf00dbe6fe31b8412839a96b3dc4", GitTreeState:"clean", GoVersion:"go1.16.9"}
3 Installing minikube and creating a k8s cluster
minikube must be run as a non-root user, so create a new user named galaxykube:
[root@my_ob ~]# useradd -ms /bin/bash galaxykube
The new user must be able to reach Docker through docker.sock, so it has to belong to the group that owns that socket. Once Docker is running, the socket can be found under /var/run:
[root@my_ob ~]# cd /var/run
[root@my_ob run]# ls -l docke*
-rw-r--r-- 1 root root 4 Jun 9 15:55 docker.pid
srw-rw---- 1 root docker 0 Jun 9 15:55 docker.sock
The socket's owning group is docker; add the new user to it:
[root@my_ob ~]# usermod -aG docker galaxykube
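Whether the user actually landed in the docker group can be verified with `id -nG galaxykube`. The hypothetical helper below (the function name is mine, not from any tool) wraps that check so it can be scripted:

```shell
# Hypothetical helper: succeeds if a group name appears in a
# space-separated group list such as the output of `id -nG USER`.
in_group() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *)        return 1 ;;
  esac
}

# On a real host you would feed it the user's actual groups:
#   in_group "$(id -nG galaxykube)" docker && echo "galaxykube can use docker.sock"
in_group "galaxykube wheel docker" docker && echo "docker: yes"
in_group "galaxykube wheel" docker || echo "docker: no"
```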
Download the minikube binary and move it to /usr/local/bin:
[root@my_ob ~]# curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 && chmod +x minikube && mv minikube /usr/local/bin/
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 69.2M 100 69.2M 0 0 3018k 0 0:00:23 0:00:23 --:--:-- 3732k
The command above downloads the minikube binary, makes it executable, and moves it to /usr/local/bin. Once it finishes, check the minikube version:
[root@my_ob ~]# minikube version
minikube version: v1.25.2
commit: 362d5fdc0a3dbee389b3d3f1034e8023e72bd3a7
Now a k8s cluster can be started. First switch to the newly created user:
[root@my_ob ~]# su - galaxykube
Start the k8s cluster with the command from the reference documentation:
minikube start --cpus 4 --memory 10240 --image-mirror-country cn --registry-mirror=https://docker.mirrors.ustc.edu.cn
After running it, the image pull made no progress. After some searching online, the following command, which sets the image repository and the base image explicitly, was much faster:
minikube start --image-mirror-country='cn' --image-repository='registry.cn-hangzhou.aliyuncs.com/google_containers' --base-image='registry.cn-hangzhou.aliyuncs.com/google_containers/kicbase:v0.0.28'
* minikube v1.25.2 on Centos 7.9.2009
* Using the docker driver based on existing profile
* Starting control plane node minikube in cluster minikube
* Pulling base image ...
> registry.cn-hangzhou.aliyun...: 355.77 MiB / 355.78 MiB 100.00% 1.70 MiB
* Creating docker container (CPUs=4, Memory=10240MB) ...
* Preparing Kubernetes v1.23.3 on Docker 20.10.8 ...
- kubelet.housekeeping-interval=5m
> kubectl.sha256: 64 B / 64 B [--------------------------] 100.00% ? p/s 0s
> kubelet.sha256: 64 B / 64 B [--------------------------] 100.00% ? p/s 0s
> kubeadm.sha256: 64 B / 64 B [--------------------------] 100.00% ? p/s 0s
> kubectl: 44.43 MiB / 44.43 MiB [---------------] 100.00% 2.95 MiB p/s 15s
> kubeadm: 43.12 MiB / 43.12 MiB [---------------] 100.00% 1.48 MiB p/s 29s
> kubelet: 118.75 MiB / 118.75 MiB [------------] 100.00% 1.96 MiB p/s 1m1s
- Generating certificates and keys ...
- Booting up control plane ...
- Configuring RBAC rules ...
* Verifying Kubernetes components...
- Using image registry.cn-hangzhou.aliyuncs.com/google_containers/storage-provisioner:v5
* Enabled addons: default-storageclass, storage-provisioner
! /usr/local/bin/kubectl is version 1.15.0, which may have incompatibilites with Kubernetes 1.23.3.
- Want kubectl v1.23.3? Try 'minikube kubectl -- get pods -A'
* Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
The warning near the end of the output flags a compatibility issue between kubectl 1.15.0 and Kubernetes 1.23.3; later attempts to switch the default namespace kept failing until kubectl was updated.
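One way to resolve the warning is to install a kubectl matching the cluster's server version (the output itself also suggests `minikube kubectl -- ...` as an alternative). A sketch, assuming network access, an x86_64 Linux host, and the standard upstream download location:

```shell
# Build the download URL for the kubectl release matching the cluster
# (Kubernetes v1.23.3 in the output above), then fetch and install it.
K8S_VERSION="v1.23.3"
URL="https://dl.k8s.io/release/${K8S_VERSION}/bin/linux/amd64/kubectl"
echo "$URL"
# Uncomment on a real host:
# curl -LO "$URL" && chmod +x kubectl && sudo mv kubectl /usr/local/bin/kubectl
```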
Check the k8s cluster info:
[galaxykube@my_ob ~]$ kubectl cluster-info
Kubernetes master is running at https://192.168.49.2:8443
CoreDNS is running at https://192.168.49.2:8443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
The k8s master node is running.
4 Creating a PolarDB-X cluster
1) Create a namespace for the PolarDB-X operator
[galaxykube@my_ob ~]$ kubectl create namespace polar-oper-system
namespace/polar-oper-system created
2) Install the PolarDB-X operator
The PolarDB-X Operator is installed from PolarDB-X's Helm chart repository. First add the repository to helm:
[galaxykube@my_ob ~]$ helm repo add polardbx https://polardbx-charts.oss-cn-beijing.aliyuncs.com
"polardbx" has been added to your repositories
With the repository added, install the PolarDB-X Operator:
[galaxykube@my_ob ~]$ helm install --namespace polar-oper-system polardbx-operator polardbx/polardbx-operator
NAME: polardbx-operator
LAST DEPLOYED: Fri Jun 10 09:08:35 2022
NAMESPACE: polar-oper-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
polardbx-operator is installed. Please check the status of components:
kubectl get pods --namespace polar-oper-system
Now have fun with your first PolarDB-X cluster.
Here is the manifest for quick start:
```yaml
apiVersion: polardbx.aliyun.com/v1
kind: PolarDBXCluster
metadata:
  name: quick-start
  annotations:
    polardbx/topology-mode-guide: quick-start
```
After the installation completes, check the operator's status:
[galaxykube@my_ob ~]$ kubectl get pods --namespace polar-oper-system
NAME READY STATUS RESTARTS AGE
polardbx-controller-manager-76758cc4f4-n4w7q 1/1 Running 0 78s
polardbx-hpfs-dxsvf 1/1 Running 0 78s
polardbx-tools-updater-l4sz9 1/1 Running 0 78s
Once all three pods above are in the Running state, the PolarDB-X Operator has started successfully and the PolarDB-X cluster can be created. Creating the cluster before that point fails with the following error:
Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling webhook "polardbxcluster-mutate.polardbx.aliyun.com": the server is currently unable to handle the request
Create the cluster with the script from the reference documentation:
[galaxykube@my_ob ~]$ echo "apiVersion: polardbx.aliyun.com/v1
> kind: PolarDBXCluster
> metadata:
>   name: quick-start
>   annotations:
>     polardbx/topology-mode-guide: quick-start" | kubectl apply -f -
polardbxcluster.polardbx.aliyun.com/quick-start created
The cluster resource was created successfully.
Watch the cluster creation progress:
[galaxykube@my_ob ~]$ kubectl get polardbxcluster -w
NAME GMS CN DN CDC PHASE DISK AGE
quick-start 0/1 0/1 0/1 0/1 Creating 7m21s
In this run, creation stalled at this stage: after seven minutes the status had still not changed, while the reference documentation finished creating the DN and CN in 93 seconds. Press Ctrl+C to exit the watch and check the pod status:
[galaxykube@my_ob ~]$ kubectl get pods
NAME READY STATUS RESTARTS AGE
quick-start-4dn6-dn-0-single-0 0/3 ErrImagePull 0 7m6s
quick-start-4dn6-gms-single-0 0/3 ErrImagePull 0 7m6s
The image pulls were failing. At first this looked like a registry problem and I was about to give up. The VM happened to get shut down; some time later, after restarting it and starting minikube again, the image pulls succeeded, and the cluster's pods looked like this:
[galaxykube@my_ob ~]$ kubectl get pods
NAME READY STATUS RESTARTS AGE
quick-start-4dn6-cdc-default-84f787f756-cpp7s 1/2 Running 1 (19m ago) 22m
quick-start-4dn6-cn-default-756d99d485-xl8hk 0/3 PodInitializing 0 (30s ago) 22m
quick-start-4dn6-dn-0-single-0 3/3 Running 3 (19m ago) 63m
quick-start-4dn6-gms-single-0 3/3 Running 3 (19m ago) 63m
Check the PolarDB-X cluster creation progress again:
[galaxykube@my_ob ~]$ kubectl get polardbxcluster -w
NAME GMS CN DN CDC PHASE DISK AGE
quick-start 1/1 0/1 1/1 0/1 Creating 68m
quick-start 1/1 0/1 1/1 1/1 Creating 68m
quick-start 1/1 1/1 1/1 1/1 Running 2.4 GiB 68m
The cluster was created successfully. It is named quick-start; look at the service that clients use to access it:
[galaxykube@my_ob ~]$ kubectl get svc quick-start
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
quick-start ClusterIP 10.106.113.148 <none> 3306/TCP,8081/TCP 70m
Forward the service's port 3306 to the local machine. The forwarding process must stay alive, so do not close this window:
[galaxykube@my_ob ~]$ kubectl port-forward svc/quick-start 3306
Forwarding from 127.0.0.1:3306 -> 3306
Forwarding from [::1]:3306 -> 3306
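Since `kubectl port-forward` takes a moment before it starts listening, a script that connects immediately can race it. A hypothetical bash helper (my own sketch, relying on bash's /dev/tcp pseudo-device) that polls until the port accepts connections:

```shell
# Hypothetical helper (assumes bash): poll a TCP port until it accepts
# connections, so the mysql client is only started once the port-forward
# is actually listening. bash opens /dev/tcp/HOST/PORT as a TCP connection.
wait_for_port() {
  local host="$1" port="$2" tries="${3:-30}"
  local i
  for i in $(seq "$tries"); do
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# Usage, in the session that will run the client:
#   wait_for_port 127.0.0.1 3306 && mysql -h127.0.0.1 -P3306 -upolardbx_root -p
```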
Open another session to the VM, switch to the galaxykube user, and look up the password of the polardbx_root account:
[galaxykube@my_ob ~]$ kubectl get secret quick-start -o jsonpath="{.data['polardbx_root']}" | base64 -d - | xargs echo "Password: "
Password: ********
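Kubernetes Secrets store values base64-encoded, which is why the command above pipes the jsonpath result through `base64 -d`. The decoding step in isolation, with a made-up sample value (not the real cluster password):

```shell
# The jsonpath query returns the base64-encoded secret value; decode it
# before use. "c2VjcmV0LXBhc3M=" is an illustrative sample, NOT the real
# password of the quick-start cluster.
ENCODED="c2VjcmV0LXBhc3M="
echo "$ENCODED" | base64 -d | xargs echo "Password: "
```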
Log in to the cluster with a local mysql client:
[galaxykube@my_ob ~]$ mysql -h127.0.0.1 -P3306 -upolardbx_root -p*******
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MySQL connection id is 169
Server version: 8.0.3 Tddl Server (ALIBABA)
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
List the existing databases:
MySQL [(none)]> show databases;
+--------------------+
| DATABASE |
+--------------------+
| information_schema |
+--------------------+
1 row in set (0.01 sec)
Show the compute nodes in the cluster:
MySQL [(none)]> show mpp;
+-------------+-----------------+------+--------+
| ID | NODE | ROLE | LEADER |
+-------------+-----------------+------+--------+
| quick-start | 172.17.0.7:3506 | W | Y |
+-------------+-----------------+------+--------+
1 row in set (0.01 sec)
Show the storage nodes in the cluster:
MySQL [(none)]> show storage;
+-----------------------+--------------------------------------+------------+-----------+----------+-------------+--------+-----------+-------+--------+
| STORAGE_INST_ID | LEADER_NODE | IS_HEALTHY | INST_KIND | DB_COUNT | GROUP_COUNT | STATUS | DELETABLE | DELAY | ACTIVE |
+-----------------------+--------------------------------------+------------+-----------+----------+-------------+--------+-----------+-------+--------+
| quick-start-4dn6-dn-0 | quick-start-4dn6-dn-0-single-0:14316 | true | MASTER | 1 | 2 | 0 | false | null | null |
| quick-start-4dn6-gms | quick-start-4dn6-gms-single-0:17276 | true | META_DB | 2 | 2 | 0 | false | null | null |
+-----------------------+--------------------------------------+------------+-----------+----------+-------------+--------+-----------+-------+--------+
2 rows in set (0.03 sec)
Both the GMS node and the DN node are listed as storage nodes.
5 Scaling the cluster
A PolarDB-X cluster can be scaled out or in online, adding or removing compute and storage nodes without affecting running workloads. This experiment only adds a storage node; edit the corresponding polardbxcluster object with kubectl:
kubectl edit polardbxcluster quick-start
Find the topology section below and change the replica count of the relevant node type to scale out or in; here the DN replica count is changed to 2. Save and exit.
topology:
  nodes:
    cdc:
      replicas: 1
      template:
        resources:
          limits:
            cpu: "1"
            memory: 1Gi
          requests:
            cpu: 100m
            memory: 500Mi
    cn:
      replicas: 1
      template:
        resources:
          limits:
            cpu: "1"
            memory: 2Gi
          requests:
            cpu: 100m
            memory: 1Gi
    dn:
      replicas: 2
      template:
        engine: galaxy
        hostNetwork: true
        resources:
          limits:
            cpu: "1"
            memory: 1Gi
          requests:
            cpu: 100m
            memory: 500Mi
serviceType: ClusterIP
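The same replica change can also be applied non-interactively with `kubectl patch` instead of `kubectl edit`. A sketch: the merge patch below mirrors the topology.nodes.dn.replicas field shown above, and is only validated locally here rather than run against a cluster.

```shell
# Merge patch that sets the DN replica count to 2, matching the edit above.
PATCH='{"spec":{"topology":{"nodes":{"dn":{"replicas":2}}}}}'

# Validate the JSON locally before sending it anywhere:
echo "$PATCH" | python3 -m json.tool > /dev/null && echo "patch JSON ok"

# On a live cluster:
# kubectl patch polardbxcluster quick-start --type merge -p "$PATCH"
```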
Watch the scaling progress:
[galaxykube@my_ob ~]$ kubectl get polardbxcluster quick-start -w
NAME GMS CN DN CDC PHASE DISK AGE
quick-start 1/1 1/1 1/2 1/1 Upgrading 2.5 GiB 106m
quick-start 1/1 1/1 1/2 1/1 Upgrading 2.5 GiB 106m
quick-start 1/1 1/1 1/2 1/1 Upgrading 2.5 GiB 107m
quick-start 1/1 1/1 1/2 1/1 Upgrading 2.5 GiB 109m
quick-start 1/1 1/1 1/2 1/1 Upgrading 2.5 GiB 110m
quick-start 1/1 1/1 2/2 1/1 Upgrading 2.5 GiB 110m
quick-start 1/1 1/1 2/2 1/1 Upgrading 2.5 GiB 110m
quick-start 1/1 1/1 2/2 1/1 Upgrading 2.5 GiB 110m
quick-start 1/1 1/1 2/2 1/1 Upgrading 2.5 GiB 111m
quick-start 1/1 1/1 2/2 1/1 Running 2.5 GiB 111m
quick-start 1/1 1/1 2/2 1/1 Running 3.7 GiB 111m
The cluster phase changes to Upgrading and the DN count grows to 2; once the upgrade finishes the phase returns to Running, and disk usage grows from 2.5 GiB to 3.7 GiB.
Log in to the database and show the cluster's DNs:
MySQL [(none)]> show storage;
+-----------------------+--------------------------------------+------------+-----------+----------+-------------+--------+-----------+-------+--------+
| STORAGE_INST_ID | LEADER_NODE | IS_HEALTHY | INST_KIND | DB_COUNT | GROUP_COUNT | STATUS | DELETABLE | DELAY | ACTIVE |
+-----------------------+--------------------------------------+------------+-----------+----------+-------------+--------+-----------+-------+--------+
| quick-start-4dn6-dn-0 | quick-start-4dn6-dn-0-single-0:14316 | true | MASTER | 1 | 2 | 0 | false | null | null |
| quick-start-4dn6-dn-1 | quick-start-4dn6-dn-1-single-0:14933 | true | MASTER | 1 | 1 | 0 | true | null | null |
| quick-start-4dn6-gms | quick-start-4dn6-gms-single-0:17276 | true | META_DB | 2 | 2 | 0 | false | null | null |
+-----------------------+--------------------------------------+------------+-----------+----------+-------------+--------+-----------+-------+--------+
3 rows in set (0.03 sec)
The cluster now has 2 DNs.