Manually Upgrading Ceph from Jewel to Luminous

Test environment

Node IP	Node roles
192.168.1.10	mon, osd, rgw
192.168.1.11	mon, osd, rgw
192.168.1.12	mon, osd, rgw

Preparation

1. Configure the yum repository for the Luminous upgrade

# cat ceph-luminous.repo 
[ceph]
name=x86_64
baseurl=https://mirrors.aliyun.com/ceph/rpm-luminous/el7/x86_64/
gpgcheck=0

[ceph-noarch]
name=noarch
baseurl=https://mirrors.aliyun.com/ceph/rpm-luminous/el7/noarch/
gpgcheck=0

[ceph-aarch64]
name=aarch64
baseurl=https://mirrors.aliyun.com/ceph/rpm-luminous/el7/aarch64/
gpgcheck=0

[ceph-SRPMS]
name=SRPMS
baseurl=https://mirrors.aliyun.com/ceph/rpm-luminous/el7/SRPMS/
gpgcheck=0

Copy the generated Luminous repo file to every node, and delete the original Jewel yum repo file

# ansible node -m copy -a 'src=ceph-luminous.repo dest=/etc/yum.repos.d/ceph-luminous.repo'

# ansible node -m file -a 'name=/etc/yum.repos.d/ceph-jewel.repo state=absent'
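Before upgrading, it can help to confirm that no repo file on a node still points at Jewel. The following is a minimal sketch, not part of the original procedure. It assumes that Jewel repo URLs contain `rpm-jewel`, by analogy with the `rpm-luminous` path above; the function name `stale_jewel_repos` is made up for illustration.

```shell
# List yum repo files under a directory that still reference a Jewel repository.
# Assumption: Jewel repo URLs contain "rpm-jewel", mirroring the "rpm-luminous"
# path used in ceph-luminous.repo.
stale_jewel_repos() {
    local repo_dir="${1:-/etc/yum.repos.d}"
    # grep -l prints only the names of matching files; no output means nothing stale.
    grep -l 'rpm-jewel' "$repo_dir"/*.repo 2>/dev/null
}
```

Run on a node, `stale_jewel_repos` prints nothing when the Jewel repo has been removed.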

2. Set sortbitwise

If this flag is not set, data loss may occur during the upgrade

# ceph osd set sortbitwise

3. Set noout

This prevents data rebalancing during the upgrade; unset it once the upgrade is complete

# ceph osd set noout

After the flags are set, the cluster status is as follows

# ceph -s
    cluster 0d5eced9-8baa-48be-83ef-64a7ef3a8301
     health HEALTH_WARN
            noout flag(s) set
     monmap e1: 3 mons at {node1=192.168.1.10:6789/0,node2=192.168.1.11:6789/0,node3=192.168.1.12:6789/0}
            election epoch 26, quorum 0,1,2 node1,node2,node3
     osdmap e87: 9 osds: 9 up, 9 in
            flags noout,sortbitwise,require_jewel_osds
      pgmap v267: 112 pgs, 7 pools, 3084 bytes data, 173 objects
            983 MB used, 133 GB / 134 GB avail
                 112 active+clean
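The flags can also be checked by script instead of by eye. The following is a rough sketch that reads `ceph -s` or `ceph osd dump` output on stdin and tests the `flags` line for one entry. The helper name `has_osd_flag` is hypothetical.

```shell
# Succeed if the named flag appears on the "flags" line of `ceph -s` or
# `ceph osd dump` output read from stdin.
has_osd_flag() {
    local flag="$1"
    # Split the comma-separated list after the word "flags" and look for an exact entry.
    awk -v want="$flag" '
        $1 == "flags" {
            n = split($2, parts, ",")
            for (i = 1; i <= n; i++) if (parts[i] == want) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

For example, `ceph osd dump | has_osd_flag sortbitwise || ceph osd set sortbitwise` would set the flag only when it is missing.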

4. In Luminous, pool deletion is disabled by default and must be allowed explicitly; add "mon allow pool delete = true" to the ceph configuration file on each mon node

# ansible node -m shell -a 'echo "mon allow pool delete = true" >> /etc/ceph/ceph.conf'
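Note that `echo ... >>` appends the line again on every run. Below is a minimal sketch of an idempotent variant, with a hypothetical helper name `ensure_conf_line`. Like the command above, it appends to the end of the file, so it assumes the last section of ceph.conf is one where the option takes effect (for example `[global]`).

```shell
# Append a line to a config file only if that exact line is not already present.
ensure_conf_line() {
    local conf_file="$1"
    local line="$2"
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if ! grep -qxF -- "$line" "$conf_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$conf_file"
    fi
}
```

Usage on a node: `ensure_conf_line /etc/ceph/ceph.conf "mon allow pool delete = true"`.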

Performing the upgrade

1. Check the ceph package versions currently installed in the cluster

# ansible node -m shell -a 'rpm -qa | grep ceph'
 [WARNING]: Consider using yum, dnf or zypper module rather than running rpm

node1 | SUCCESS | rc=0 >>
ceph-selinux-10.2.11-0.el7.x86_64
ceph-10.2.11-0.el7.x86_64
ceph-deploy-1.5.39-0.noarch
libcephfs1-10.2.11-0.el7.x86_64
python-cephfs-10.2.11-0.el7.x86_64
ceph-base-10.2.11-0.el7.x86_64
ceph-mon-10.2.11-0.el7.x86_64
ceph-osd-10.2.11-0.el7.x86_64
ceph-radosgw-10.2.11-0.el7.x86_64
ceph-common-10.2.11-0.el7.x86_64
ceph-mds-10.2.11-0.el7.x86_64

node3 | SUCCESS | rc=0 >>
ceph-mon-10.2.11-0.el7.x86_64
ceph-radosgw-10.2.11-0.el7.x86_64
ceph-common-10.2.11-0.el7.x86_64
libcephfs1-10.2.11-0.el7.x86_64
python-cephfs-10.2.11-0.el7.x86_64
ceph-selinux-10.2.11-0.el7.x86_64
ceph-mds-10.2.11-0.el7.x86_64
ceph-10.2.11-0.el7.x86_64
ceph-base-10.2.11-0.el7.x86_64
ceph-osd-10.2.11-0.el7.x86_64

node2 | SUCCESS | rc=0 >>
ceph-mds-10.2.11-0.el7.x86_64
python-cephfs-10.2.11-0.el7.x86_64
ceph-base-10.2.11-0.el7.x86_64
ceph-mon-10.2.11-0.el7.x86_64
ceph-osd-10.2.11-0.el7.x86_64
ceph-radosgw-10.2.11-0.el7.x86_64
ceph-common-10.2.11-0.el7.x86_64
ceph-selinux-10.2.11-0.el7.x86_64
ceph-10.2.11-0.el7.x86_64
libcephfs1-10.2.11-0.el7.x86_64

2. Check the ceph version the cluster is currently using

# ansible node -m shell -a 'for i in `ls /var/run/ceph/ | grep "ceph-mon.*asok"` ; do ceph --admin-daemon /var/run/ceph/$i --version ; done'

node1 | SUCCESS | rc=0 >>
ceph version 10.2.11 (e4b061b47f07f583c92a050d9e84b1813a35671e)

node2 | SUCCESS | rc=0 >>
ceph version 10.2.11 (e4b061b47f07f583c92a050d9e84b1813a35671e)

node3 | SUCCESS | rc=0 >>
ceph version 10.2.11 (e4b061b47f07f583c92a050d9e84b1813a35671e)

3. Upgrade the packages

# ansible node -m yum -a 'name=ceph state=latest'

4. After the upgrade completes, check the package versions installed on the cluster nodes

# ansible node -m shell -a 'rpm -qa | grep ceph'
 [WARNING]: Consider using yum, dnf or zypper module rather than running rpm

node2 | SUCCESS | rc=0 >>
ceph-base-12.2.10-0.el7.x86_64
ceph-osd-12.2.10-0.el7.x86_64
python-cephfs-12.2.10-0.el7.x86_64
ceph-common-12.2.10-0.el7.x86_64
ceph-selinux-12.2.10-0.el7.x86_64
ceph-mon-12.2.10-0.el7.x86_64
ceph-mds-12.2.10-0.el7.x86_64
ceph-radosgw-12.2.10-0.el7.x86_64
libcephfs2-12.2.10-0.el7.x86_64
ceph-mgr-12.2.10-0.el7.x86_64
ceph-12.2.10-0.el7.x86_64

node1 | SUCCESS | rc=0 >>
ceph-base-12.2.10-0.el7.x86_64
ceph-osd-12.2.10-0.el7.x86_64
ceph-deploy-1.5.39-0.noarch
python-cephfs-12.2.10-0.el7.x86_64
ceph-common-12.2.10-0.el7.x86_64
ceph-selinux-12.2.10-0.el7.x86_64
ceph-mon-12.2.10-0.el7.x86_64
ceph-mds-12.2.10-0.el7.x86_64
ceph-radosgw-12.2.10-0.el7.x86_64
libcephfs2-12.2.10-0.el7.x86_64
ceph-mgr-12.2.10-0.el7.x86_64
ceph-12.2.10-0.el7.x86_64

node3 | SUCCESS | rc=0 >>
python-cephfs-12.2.10-0.el7.x86_64
ceph-common-12.2.10-0.el7.x86_64
ceph-mon-12.2.10-0.el7.x86_64
ceph-radosgw-12.2.10-0.el7.x86_64
libcephfs2-12.2.10-0.el7.x86_64
ceph-base-12.2.10-0.el7.x86_64
ceph-mgr-12.2.10-0.el7.x86_64
ceph-osd-12.2.10-0.el7.x86_64
ceph-12.2.10-0.el7.x86_64
ceph-selinux-12.2.10-0.el7.x86_64
ceph-mds-12.2.10-0.el7.x86_64
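Comparing long package lists by eye is error-prone. The sketch below reduces `rpm -qa | grep ceph` output to its distinct versions, so a node that was only partly upgraded shows more than one line. The function name `ceph_pkg_versions` is hypothetical. ceph-deploy is skipped because it is versioned separately from the other packages.

```shell
# Read `rpm -qa | grep ceph` style lines from stdin and print each distinct
# package version, ignoring ceph-deploy, which carries its own version.
ceph_pkg_versions() {
    grep -v '^ceph-deploy-' |
        # Keep only the dotted version that sits just before the release field.
        sed -n -E 's/^.*-([0-9]+(\.[0-9]+)+)-[^-]+$/\1/p' |
        sort -u
}
```

On a fully upgraded node in this test environment, `rpm -qa | grep ceph | ceph_pkg_versions` should print only 12.2.10.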

5. Restart all mon, osd, and rgw daemons on each node in turn

On node1

# systemctl restart ceph-mon@node1

# systemctl restart ceph-osd@{0,1,2}

# systemctl restart ceph-radosgw@rgw.node1

On node2

# systemctl restart ceph-mon@node2

# systemctl restart ceph-osd@{3,4,5}

# systemctl restart ceph-radosgw@rgw.node2

On node3

# systemctl restart ceph-mon@node3

# systemctl restart ceph-osd@{6,7,8}

# systemctl restart ceph-radosgw@rgw.node3
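The OSD ids above are hard-coded for each node. A possible alternative is to derive them from the local OSD data directories. This sketch assumes the default cluster name `ceph` and the default data path `/var/lib/ceph/osd/ceph-<id>`; the helper name `local_osd_ids` is made up for illustration.

```shell
# Print the ids of the OSDs whose data directories exist on this node.
# Assumption: default layout /var/lib/ceph/osd/ceph-<id> (cluster name "ceph").
local_osd_ids() {
    local osd_root="${1:-/var/lib/ceph/osd}"
    local d
    for d in "$osd_root"/ceph-*; do
        [ -d "$d" ] || continue
        # Strip everything up to and including "ceph-" to leave the id.
        printf '%s\n' "${d##*/ceph-}"
    done | sort -n
}
```

A restart loop could then be written as `for id in $(local_osd_ids); do systemctl restart "ceph-osd@$id"; done`.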

6. Adjust require_osd_release

At this point the cluster status is as follows

# ceph -s
  cluster:
    id:     0d5eced9-8baa-48be-83ef-64a7ef3a8301
    health: HEALTH_WARN
            noout flag(s) set
            all OSDs are running luminous or later but require_osd_release < luminous
            no active mgr

  services:
    mon: 3 daemons, quorum node1,node2,node3
    mgr: no daemons active
    osd: 9 osds: 9 up, 9 in
         flags noout

  data:
    pools:   7 pools, 112 pgs
    objects: 189 objects, 3.01KiB
    usage:   986MiB used, 134GiB / 135GiB avail
    pgs:     112 active+clean

require_osd_release has to be adjusted manually

# ceph osd require-osd-release luminous
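If this step is scripted, it can be guarded so that it runs only while the health warning shown above is present. This is a rough sketch that matches the warning text from the `ceph -s` output; the helper name `needs_release_bump` is hypothetical.

```shell
# Succeed if the status text on stdin still carries the
# "require_osd_release < luminous" health warning.
needs_release_bump() {
    grep -q 'require_osd_release < luminous'
}
```

For example: `ceph -s | needs_release_bump && ceph osd require-osd-release luminous`.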

7. Unset noout

# ceph osd unset noout

Check the cluster status again

# ceph -s
  cluster:
    id:     0d5eced9-8baa-48be-83ef-64a7ef3a8301
    health: HEALTH_WARN
            no active mgr

  services:
    mon: 3 daemons, quorum node1,node2,node3
    mgr: no daemons active
    osd: 9 osds: 9 up, 9 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0B
    usage:   0B used, 0B / 0B avail
    pgs:

8. Configure mgr

1) Generate a key

# ceph auth get-or-create mgr.node1 mon 'allow *' osd 'allow *'

[mgr.node1]

        key = AQC0IA9c9X31IhAAdQRm3zR5r/nl3b7+WOwZjQ==

2) Create the data directory

# mkdir /var/lib/ceph/mgr/ceph-node1/

3) Add the key

# ceph auth get mgr.node1 -o /var/lib/ceph/mgr/ceph-node1/keyring

exported keyring for mgr.node1

4) Enable the service at boot

# systemctl enable ceph-mgr@node1

Created symlink from /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@node1.service to /usr/lib/systemd/system/ceph-mgr@.service.

5) Start mgr

# systemctl start ceph-mgr@node1

6) Configure mgr on the other mon nodes in the same way, then check the cluster status again

# ceph -s
  cluster:
    id:     0d5eced9-8baa-48be-83ef-64a7ef3a8301
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node1,node2,node3
    mgr: node1(active), standbys: node2, node3
    osd: 9 osds: 9 up, 9 in
    rgw: 3 daemons active

  data:
    pools:   7 pools, 112 pgs
    objects: 189 objects, 3.01KiB
    usage:   986MiB used, 134GiB / 135GiB avail
    pgs:     112 active+clean
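Since steps 1) to 5) are repeated on every mon node, they could be wrapped in a small function. This is a sketch under assumptions, not the article's method. The names `run` and `setup_mgr` and the `DRY_RUN` switch are hypothetical, and `mkdir -p` replaces the plain `mkdir` so that a rerun does not fail. It must be run on the node whose mgr is being configured.

```shell
# Run a command, or only print it when DRY_RUN=1 (hypothetical convenience switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Repeat mgr steps 1) to 5) for the node name given as the first argument.
setup_mgr() {
    local name="$1"
    local dir="/var/lib/ceph/mgr/ceph-${name}"
    run ceph auth get-or-create "mgr.${name}" mon 'allow *' osd 'allow *'
    run mkdir -p "$dir"                                  # -p so a rerun does not fail
    run ceph auth get "mgr.${name}" -o "${dir}/keyring"
    run systemctl enable "ceph-mgr@${name}"
    run systemctl start "ceph-mgr@${name}"
}
```

With `DRY_RUN=1` set, `setup_mgr node2` only prints the five commands, which allows a review before anything is changed.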

7) Enable the mgr dashboard module; the dashboard provides a web interface for monitoring cluster status

# ceph mgr module enable dashboard
# ceph mgr module ls
{
    "enabled_modules": [
        "balancer",
        "dashboard",
        "restful",
        "status"
    ],
    "disabled_modules": [
        "influx",
        "localpool",
        "prometheus",
        "selftest",
        "zabbix"
    ]
}
# ceph mgr services
{
    "dashboard": "http://node1:7000/"
}
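The dashboard address can be pulled out of this JSON for use in a script. This is a minimal sketch that uses sed rather than a JSON parser and assumes the one-key-per-line layout shown above. The function name `dashboard_url` is hypothetical.

```shell
# Extract the dashboard URL from `ceph mgr services` output on stdin.
dashboard_url() {
    sed -n 's/.*"dashboard"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

Usage: `ceph mgr services | dashboard_url`.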

8) Access the dashboard by opening the URL reported by ceph mgr services (here http://node1:7000/) in a browser

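Upgrading the cluster with ceph-deploy

If the ceph cluster was deployed with ceph-deploy, it can also be upgraded with ceph-deploy. The package upgrade commands are as follows; the remaining steps are similar and are not repeated here.

# ceph-deploy install --release luminous node1 node2 node3

# ceph-deploy --overwrite-conf mgr create node1 node2 node3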
