OpenStack-Cinder multi backend

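Supplementary notes:
1. A crushmap can be modified in two ways: online or offline.
2. To be safe, the offline approach is generally used: export the map, then edit it.
3. Export the current crushmap. The result is a binary file that cannot be read directly.
ceph osd getcrushmap -o {compiled-crushmap-filename}
4. Decompile the binary file into a readable text file.
crushtool -d {compiled-crushmap-filename} -o {decompiled-crushmap-filename}
5. After decompiling, keep the original binary file instead of deleting it. If the modified crushmap brings the Ceph cluster down, this file serves as a backup.
6. Edit the crushmap to match your environment, as described in the crushmap walkthrough below.
7. Compile the edited text file back into a binary file. Use a different output filename so the backup from step 3 is not overwritten.
crushtool -c {decompiled-crushmap-filename} -o {new-compiled-crushmap-filename}
8. Set the crushmap on the cluster, that is, put the newly compiled binary file into effect.
ceph osd setcrushmap -i {new-compiled-crushmap-filename}
9. Create the two pools, ssd and sata.
ceph osd pool create ssd 128
ceph osd pool create sata 128
10. After creating the ssd and sata pools, update the capabilities of the cinder key.
ceph auth caps client.cinder mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rx pool=images, allow rwx pool=ssd, allow rwx pool=sata'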
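
Steps 3 to 8 above can be strung together in a small script. The following is only a minimal sketch under stated assumptions: the run helper, the DRY_RUN switch, the function name and the file names inside the working directory are all hypothetical, and only the ceph and crushtool invocations come from the steps above.

```shell
#!/bin/sh
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around steps 3 to 8; the directory in $1 holds all files.
edit_crushmap_offline() {
  dir="$1"
  # Step 3: export the binary map. This file is the backup and is never overwritten.
  run ceph osd getcrushmap -o "$dir/crushmap.orig.bin" || return 1
  # Step 4: decompile the binary map to text.
  run crushtool -d "$dir/crushmap.orig.bin" -o "$dir/crushmap.txt" || return 1
  # Step 6: edit the text map by hand.
  run "${EDITOR:-vi}" "$dir/crushmap.txt" || return 1
  # Step 7: compile to a new file name so the backup stays intact.
  run crushtool -c "$dir/crushmap.txt" -o "$dir/crushmap.new.bin" || return 1
  # Step 8: put the new map into effect.
  run ceph osd setcrushmap -i "$dir/crushmap.new.bin"
}
```

Calling the function with DRY_RUN=1 exported only prints the five commands, which is a cheap way to review them first. If the new map misbehaves, the backup can be fed back with ceph osd setcrushmap -i crushmap.orig.bin.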

Cinder multi backend approach and steps:

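I. Create the pools you need on Ceph according to your environment, for example ssd, sata, and so on.

II. Write a crushmap that fits your environment. The crushmap below is taken from an article by sebastien-han on configuring ssd and sata.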

Its configuration is excerpted below:

##
# OSD SATA DECLARATION
##
host ceph-osd2-sata {
  id -2   # do not change unnecessarily
  # weight 0.000
  alg straw
  hash 0  # rjenkins1
  item osd.0 weight 1.000
  item osd.3 weight 1.000
}

host ceph-osd1-sata {
  id -3   # do not change unnecessarily
  # weight 0.000
  alg straw
  hash 0  # rjenkins1
  item osd.2 weight 1.000
  item osd.5 weight 1.000
}

host ceph-osd0-sata {
  id -4   # do not change unnecessarily
  # weight 0.000
  alg straw
  hash 0  # rjenkins1
  item osd.1 weight 1.000
  item osd.4 weight 1.000
}

##
# OSD SSD DECLARATION
##
host ceph-osd2-ssd {
  id -22    # do not change unnecessarily
  # weight 0.000
  alg straw
  hash 0  # rjenkins1
  item osd.6 weight 1.000
  item osd.9 weight 1.000
}

host ceph-osd1-ssd {
  id -23    # do not change unnecessarily
  # weight 0.000
  alg straw
  hash 0  # rjenkins1
  item osd.8 weight 1.000
  item osd.11 weight 1.000
}

host ceph-osd0-ssd {
  id -24    # do not change unnecessarily
  # weight 0.000
  alg straw
  hash 0  # rjenkins1
  item osd.7 weight 1.000
  item osd.10 weight 1.000
}

##
# SATA ROOT DECLARATION
##
root sata {
  id -1   # do not change unnecessarily
  # weight 0.000
  alg straw
  hash 0  # rjenkins1
  item ceph-osd2-sata weight 2.000
  item ceph-osd1-sata weight 2.000
  item ceph-osd0-sata weight 2.000
}

##
# SSD ROOT DECLARATION
##
root ssd {
  id -21    # do not change unnecessarily
  # weight 0.000
  alg straw
  hash 0  # rjenkins1
  item ceph-osd2-ssd weight 2.000
  item ceph-osd1-ssd weight 2.000
  item ceph-osd0-ssd weight 2.000
}

##
# SSD RULE DECLARATION
##
# rules
rule ssd {
  ruleset 0
  type replicated
  min_size 1
  max_size 10
  step take ssd
  step chooseleaf firstn 0 type host
  step emit
}

##
# SATA RULE DECLARATION
##
rule sata {
  ruleset 1
  type replicated
  min_size 1
  max_size 10
  step take sata
  step chooseleaf firstn 0 type host
  step emit
}

This crushmap splits the SSD and SATA disks across several logical hosts and then groups those logical hosts into buckets, with one rule per bucket.
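
To see this grouping in a decompiled map without reading it block by block, a few lines of awk can walk from a root bucket down to its OSDs. This is a minimal sketch: the function name osds_under_root is hypothetical, and it assumes the decompiled text uses the host, root and item layout shown above, with one declaration per line.

```shell
#!/bin/sh
# osds_under_root FILE ROOT: print the OSDs reachable from "root ROOT",
# one per line, by following root -> host -> osd items in a decompiled map.
osds_under_root() {
  awk -v root="$2" '
    # Remember which bucket block we are currently inside.
    $1 == "host" || $1 == "root" { kind = $1; name = $2; next }
    $1 == "}"                    { kind = ""; name = ""; next }
    # Inside a host block: collect its OSD items.
    $1 == "item" && kind == "host" { osds[name] = osds[name] " " $2; next }
    # Inside the requested root block: remember its child hosts in order.
    $1 == "item" && kind == "root" && name == root { hosts[++n] = $2 }
    END {
      for (i = 1; i <= n; i++) {
        m = split(osds[hosts[i]], parts, " ")
        for (j = 1; j <= m; j++) print parts[j]
      }
    }
  ' "$1"
}
```

Run against the map above, osds_under_root crushmap.txt ssd would list only the OSDs declared in the three -ssd hosts, which is the set that rule ssd can choose from.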

The default crushmap, by contrast, has a simpler layout: it places all hosts in a single bucket and defines one default rule for that bucket.

III. Set the CRUSH ruleset of each pool

ceph osd pool set [pool name] crush_ruleset 0 # Note: 0 is the ruleset number of the matching rule in your crushmap (rule ssd above)

ceph osd pool set [pool name] crush_ruleset 1 # Note: 1 is the ruleset number of the matching rule in your crushmap (rule sata above)
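
Because the number passed to crush_ruleset has to match the ruleset line of the intended rule, it can be safer to look it up than to type it from memory. The sketch below does that on a decompiled map. The function names ruleset_of and pool_rule_cmd are hypothetical, and the code only prints the command instead of running it. On newer Ceph releases (Luminous and later) the pool option is called crush_rule, so treat the printed command as specific to the version used here.

```shell
#!/bin/sh
# ruleset_of FILE RULE: print the ruleset number declared inside "rule RULE { ... }".
ruleset_of() {
  awk -v rule="$2" '
    $1 == "rule"               { in_rule = ($2 == rule) }
    $1 == "}"                  { in_rule = 0 }
    in_rule && $1 == "ruleset" { print $2; exit }
  ' "$1"
}

# pool_rule_cmd FILE POOL RULE: print the command that binds POOL to RULE.
pool_rule_cmd() {
  id=$(ruleset_of "$1" "$3")
  if [ -z "$id" ]; then
    echo "rule $3 not found in $1" >&2
    return 1
  fi
  echo "ceph osd pool set $2 crush_ruleset $id"
}
```

For the map in this article, pool_rule_cmd crushmap.txt ssd ssd and pool_rule_cmd crushmap.txt sata sata would print the two commands above with the pool names filled in.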

The Ceph side is now fully configured. Next, configure the Cinder side.

IV. Configure the cinder-volume node

vi /etc/cinder/cinder.conf and add the following:

enabled_backends=ssd,sata

[ssd]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = ssd
volume_backend_name = ssd
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_flatten_volume_from_snapshot = false
rbd_max_clone_depth = 5
rbd_store_chunk_size = 4
rados_connect_timeout = -1
glance_api_version = 2
rbd_user = cinder
rbd_secret_uuid = XXXXXXXXX

[sata]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = sata
volume_backend_name = sata
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_flatten_volume_from_snapshot = false
rbd_max_clone_depth = 5
rbd_store_chunk_size = 4
rados_connect_timeout = -1
glance_api_version = 2
rbd_user = cinder
rbd_secret_uuid = XXXXXXXXX
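
A common slip with multi backend is listing a name in enabled_backends without a matching section. The following minimal sketch checks for that. The function name check_backends is hypothetical, and it assumes enabled_backends appears once on a single line, as in the fragment above.

```shell
#!/bin/sh
# check_backends FILE: verify that every name listed in enabled_backends
# has its own [name] section in the given cinder.conf.
check_backends() {
  conf="$1"
  # Take the comma-separated value of the first enabled_backends line.
  list=$(sed -n 's/^[[:space:]]*enabled_backends[[:space:]]*=[[:space:]]*//p' "$conf" | head -n 1)
  status=0
  for name in $(echo "$list" | tr ',' ' '); do
    if grep -q "^\[$name\]" "$conf"; then
      echo "ok: [$name]"
    else
      echo "missing: [$name]"
      status=1
    fi
  done
  return $status
}
```

Running check_backends /etc/cinder/cinder.conf before restarting the services prints one line per backend and returns non-zero if a section is missing.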

V. Create two Cinder volume types

cinder type-create ssd

cinder type-create sata

root@controller:~# cinder type-list

+--------------------------------------+------+
|                  ID                  | Name |
+--------------------------------------+------+
| 707e887d-95e5-45ca-b7df-53a51fadf458 | ssd  |
| 82c32938-f1e5-4e22-a4b9-b0920c4543e7 | sata |
+--------------------------------------+------+

VI. Set the extra-spec key of each volume type

cinder type-key ssd set volume_backend_name=ssd

cinder type-key sata set volume_backend_name=sata

root@controller:~# cinder extra-specs-list

+--------------------------------------+------+-----------------------------------+
|                  ID                  | Name |            extra_specs            |
+--------------------------------------+------+-----------------------------------+
| 707e887d-95e5-45ca-b7df-53a51fadf458 | ssd  |  {u'volume_backend_name': u'ssd'} |
| 82c32938-f1e5-4e22-a4b9-b0920c4543e7 | sata | {u'volume_backend_name': u'sata'} |
+--------------------------------------+------+-----------------------------------+

VII. Finally, restart the services

sudo restart cinder-api ; sudo restart cinder-scheduler

On the cinder-volume node:

sudo restart cinder-volume

VIII. Verify that it works

Create a volume of each type and confirm that it shows up in the matching pool.
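
One way to run this check is to create a small volume per type and list the images in each pool. The sketch below only prints the commands, one set per backend. The helper name verify_cmds and the test- volume names are hypothetical. It assumes that the volume type and the pool share the same name, as in this setup, and that the cinder client accepts --display-name (newer clients use --name instead).

```shell
#!/bin/sh
# Hypothetical helper: print the commands that verify one backend.
# $1 is the volume type name, which is also the pool name in this setup.
verify_cmds() {
  vtype="$1"
  echo "cinder create --volume-type $vtype --display-name test-$vtype 1"
  echo "cinder list"
  echo "rbd ls -p $vtype"
}

# Print the checks for both backends configured above.
for t in ssd sata; do
  verify_cmds "$t"
done
```

If the backends are wired up correctly, the RBD image backing each new volume (named volume-<ID> with Cinder's default volume_name_template) should appear only in the pool of the chosen type.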

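Troubleshooting summary:

While modifying the Ceph crushmap, you will typically run into the following situations.

The first problem appeared because my pool size was set to 2, while the crushmap I had changed to used a single host per bucket, so the pool size had to be set to 1.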
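
This mismatch can be caught before a map is applied. A rule that uses step chooseleaf firstn 0 type host puts each replica on a different host, so a pool cannot place more replicas than there are hosts under the root the rule takes from. The sketch below compares the two numbers on a decompiled map. The function names are hypothetical, and it assumes the root lists its hosts directly as item lines, as in the crushmap above.

```shell
#!/bin/sh
# hosts_under_root FILE ROOT: count the items listed directly in "root ROOT { ... }".
hosts_under_root() {
  awk -v root="$2" '
    $1 == "root"            { in_root = ($2 == root) }
    $1 == "}"               { in_root = 0 }
    in_root && $1 == "item" { n++ }
    END                     { print n + 0 }
  ' "$1"
}

# check_pool_size FILE ROOT SIZE: fail when SIZE replicas cannot land on distinct hosts.
check_pool_size() {
  hosts=$(hosts_under_root "$1" "$2")
  if [ "$3" -gt "$hosts" ]; then
    echo "size $3 exceeds $hosts host(s) under root $2"
    return 1
  fi
  echo "size $3 fits $hosts host(s) under root $2"
}
```

With a single host under the root, check_pool_size crushmap.txt ssd 2 fails, which matches the situation described above. The fix is then either ceph osd pool set [pool name] size 1 or more hosts under that root.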

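Even after I changed the pool size there were still some problems. The PGs moved to the remapped state: they were being remapped normally, and the set of OSDs a PG was mapped to could differ from the set computed by CRUSH.

I then checked which OSDs the PGs in the remapped state were on.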
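
For a single PG, the two sets can be compared from the output of ceph pg map, which reports the up set (what CRUSH computes) and the acting set (the OSDs currently serving the PG). The following is a minimal sketch: the function name is_remapped is hypothetical, and it assumes the output line has the form "osdmap e120 pg 1.7e (1.7e) -> up [2] acting [2,0]", where the epoch and OSD numbers are made up for illustration.

```shell
#!/bin/sh
# is_remapped LINE: succeed when the up set and the acting set in one
# line of "ceph pg map" output differ, i.e. the PG is remapped.
is_remapped() {
  # Extract the contents of the brackets after "up" and after "acting".
  up=$(echo "$1" | sed -n 's/.*up \[\([^]]*\)\].*/\1/p')
  acting=$(echo "$1" | sed -n 's/.*acting \[\([^]]*\)\].*/\1/p')
  [ -n "$up" ] && [ -n "$acting" ] && [ "$up" != "$acting" ]
}
```

On a live cluster this could be used as is_remapped "$(ceph pg map 1.7e)" && echo "1.7e is remapped".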

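According to the pg dump output, although I had set the pool size to 1, some PGs still had a size of 2. The acting set of pg 1.7e, for example, contained two OSDs, 2 and 0. I therefore restarted osd.2 and osd.0 so that the PGs on these two OSDs would peer again.

The state gradually improved, and in the end I restarted all OSDs.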