【项目实践】多人姿态估计实践（代码+权重=一键运行）（一）-阿里云开发者社区

【项目实践】多人姿态估计实践（代码+权重=一键运行）（一）

2023-05-18 187

版权

本文内容由阿里云实名注册用户自发贡献，版权归原作者所有，阿里云开发者社区不拥有其著作权，亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容，填写侵权投诉表单进行举报，一经查实，本社区将立刻删除涉嫌侵权内容。

简介： 【项目实践】多人姿态估计实践（代码+权重=一键运行）（一）

1、姿态估计的简介

姿态估计问题就是确定某一三维目标物体的方位指向问题。姿态估计在机器人视觉、动作跟踪和单照相机定标等很多领域都有应用。在不同领域用于姿态估计的传感器是不一样的。

2、Realtime Multi-Person 2D Human Pose Estimation using Part Affinity Fields

2.1、模型结构

网络分为两路结构，一路是上面的卷积层，用来获得置信图；一路是下面的卷积层，用来获得PAFs。网络分为多个stage，每一个stage结束的时候都有中继监督。每一个stage结束之后，S以及L都和stage1中的F合并。上下两路的loss都是计算预测和理想值之间的L2 loss。

from keras.models import Model
from keras.layers.merge import Concatenate
from keras.layers import Activation, Input, Lambda
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D
from keras.layers.merge import Multiply
from keras.regularizers import l2
from keras.initializers import random_normal,constant
def relu(x): return Activation('relu')(x)
def conv(x, nf, ks, name, weight_decay):
    kernel_reg = l2(weight_decay[0]) if weight_decay else None
    bias_reg = l2(weight_decay[1]) if weight_decay else None
    x = Conv2D(nf, (ks, ks), padding='same', name=name,
               kernel_regularizer=kernel_reg,
               bias_regularizer=bias_reg,
               kernel_initializer=random_normal(stddev=0.01),
               bias_initializer=constant(0.0))(x)
    return x
def pooling(x, ks, st, name):
    x = MaxPooling2D((ks, ks), strides=(st, st), name=name)(x)
    return x
def vgg_block(x, weight_decay):
    # Block 1
    x = conv(x, 64, 3, "conv1_1", (weight_decay, 0))
    x = relu(x)
    x = conv(x, 64, 3, "conv1_2", (weight_decay, 0))
    x = relu(x)
    x = pooling(x, 2, 2, "pool1_1")
    # Block 2
    x = conv(x, 128, 3, "conv2_1", (weight_decay, 0))
    x = relu(x)
    x = conv(x, 128, 3, "conv2_2", (weight_decay, 0))
    x = relu(x)
    x = pooling(x, 2, 2, "pool2_1")
    # Block 3
    x = conv(x, 256, 3, "conv3_1", (weight_decay, 0))
    x = relu(x)
    x = conv(x, 256, 3, "conv3_2", (weight_decay, 0))
    x = relu(x)
    x = conv(x, 256, 3, "conv3_3", (weight_decay, 0))
    x = relu(x)
    x = conv(x, 256, 3, "conv3_4", (weight_decay, 0))
    x = relu(x)
    x = pooling(x, 2, 2, "pool3_1")
    # Block 4
    x = conv(x, 512, 3, "conv4_1", (weight_decay, 0))
    x = relu(x)
    x = conv(x, 512, 3, "conv4_2", (weight_decay, 0))
    x = relu(x)
    # Additional non vgg layers
    x = conv(x, 256, 3, "conv4_3_CPM", (weight_decay, 0))
    x = relu(x)
    x = conv(x, 128, 3, "conv4_4_CPM", (weight_decay, 0))
    x = relu(x)
    return x
def stage1_block(x, num_p, branch, weight_decay):
    # Block 1
    x = conv(x, 128, 3, "Mconv1_stage1_L%d" % branch, (weight_decay, 0))
    x = relu(x)
    x = conv(x, 128, 3, "Mconv2_stage1_L%d" % branch, (weight_decay, 0))
    x = relu(x)
    x = conv(x, 128, 3, "Mconv3_stage1_L%d" % branch, (weight_decay, 0))
    x = relu(x)
    x = conv(x, 512, 1, "Mconv4_stage1_L%d" % branch, (weight_decay, 0))
    x = relu(x)
    x = conv(x, num_p, 1, "Mconv5_stage1_L%d" % branch, (weight_decay, 0))
    return x
def stageT_block(x, num_p, stage, branch, weight_decay):
    # Block 1
    x = conv(x, 128, 7, "Mconv1_stage%d_L%d" % (stage, branch), (weight_decay, 0))
    x = relu(x)
    x = conv(x, 128, 7, "Mconv2_stage%d_L%d" % (stage, branch), (weight_decay, 0))
    x = relu(x)
    x = conv(x, 128, 7, "Mconv3_stage%d_L%d" % (stage, branch), (weight_decay, 0))
    x = relu(x)
    x = conv(x, 128, 7, "Mconv4_stage%d_L%d" % (stage, branch), (weight_decay, 0))
    x = relu(x)
    x = conv(x, 128, 7, "Mconv5_stage%d_L%d" % (stage, branch), (weight_decay, 0))
    x = relu(x)
    x = conv(x, 128, 1, "Mconv6_stage%d_L%d" % (stage, branch), (weight_decay, 0))
    x = relu(x)
    x = conv(x, num_p, 1, "Mconv7_stage%d_L%d" % (stage, branch), (weight_decay, 0))
    return x
def apply_mask(x, mask1, mask2, num_p, stage, branch):
    w_name = "weight_stage%d_L%d" % (stage, branch)
    if num_p == 38:
        w = Multiply(name=w_name)([x, mask1]) # vec_weight
    else:
        w = Multiply(name=w_name)([x, mask2])  # vec_heat
    return w
def get_training_model(weight_decay):
    stages = 6
    np_branch1 = 38
    np_branch2 = 19
    img_input_shape = (None, None, 3)
    vec_input_shape = (None, None, 38)
    heat_input_shape = (None, None, 19)
    inputs = []
    outputs = []
    img_input = Input(shape=img_input_shape)
    vec_weight_input = Input(shape=vec_input_shape)
    heat_weight_input = Input(shape=heat_input_shape)
    inputs.append(img_input)
    inputs.append(vec_weight_input)
    inputs.append(heat_weight_input)
    img_normalized = Lambda(lambda x: x / 256 - 0.5)(img_input) # [-0.5, 0.5]
    # VGG
    stage0_out = vgg_block(img_normalized, weight_decay)
    # stage 1 - branch 1 (PAF)
    stage1_branch1_out = stage1_block(stage0_out, np_branch1, 1, weight_decay)
    w1 = apply_mask(stage1_branch1_out, vec_weight_input, heat_weight_input, np_branch1, 1, 1)
    # stage 1 - branch 2 (confidence maps)
    stage1_branch2_out = stage1_block(stage0_out, np_branch2, 2, weight_decay)
    w2 = apply_mask(stage1_branch2_out, vec_weight_input, heat_weight_input, np_branch2, 1, 2)
    x = Concatenate()([stage1_branch1_out, stage1_branch2_out, stage0_out])
    outputs.append(w1)
    outputs.append(w2)
    # stage sn >= 2
    for sn in range(2, stages + 1):
        # stage SN - branch 1 (PAF)
        stageT_branch1_out = stageT_block(x, np_branch1, sn, 1, weight_decay)
        w1 = apply_mask(stageT_branch1_out, vec_weight_input, heat_weight_input, np_branch1, sn, 1)
        # stage SN - branch 2 (confidence maps)
        stageT_branch2_out = stageT_block(x, np_branch2, sn, 2, weight_decay)
        w2 = apply_mask(stageT_branch2_out, vec_weight_input, heat_weight_input, np_branch2, sn, 2)
        outputs.append(w1)
        outputs.append(w2)
        if (sn < stages):
            x = Concatenate()([stageT_branch1_out, stageT_branch2_out, stage0_out])
    model = Model(inputs=inputs, outputs=outputs)
    return model
def get_testing_model():
    stages = 6
    np_branch1 = 38
    np_branch2 = 19
    img_input_shape = (None, None, 3)
    img_input = Input(shape=img_input_shape)
    img_normalized = Lambda(lambda x: x / 256 - 0.5)(img_input) # [-0.5, 0.5]
    # VGG
    stage0_out = vgg_block(img_normalized, None)
    # stage 1 - branch 1 (PAF)
    stage1_branch1_out = stage1_block(stage0_out, np_branch1, 1, None)
    # stage 1 - branch 2 (confidence maps)
    stage1_branch2_out = stage1_block(stage0_out, np_branch2, 2, None)
    x = Concatenate()([stage1_branch1_out, stage1_branch2_out, stage0_out])
    # stage t >= 2
    stageT_branch1_out = None
    stageT_branch2_out = None
    for sn in range(2, stages + 1):
        stageT_branch1_out = stageT_block(x, np_branch1, sn, 1, None)
        stageT_branch2_out = stageT_block(x, np_branch2, sn, 2, None)
        if (sn < stages):
            x = Concatenate()([stageT_branch1_out, stageT_branch2_out, stage0_out])
    model = Model(inputs=[img_input], outputs=[stageT_branch1_out, stageT_branch2_out])
    return model

Loss方程中有一个空间上的加权，是因为有些数据集没有完全标注所有的人，用其提供的mask说明有些区域是可能包含没有标记的人。最终的loss是各个阶段的loss相加。

论文在MPII和COCO数据集上都取得了非常好的效果，制作的demo效果也非常好，只是对尺度比较小的人检测效果不如其他算法。

论文所提方法

1，使用置信图进行关节检测

每一个关节对应一个置信图，图像每一个像素点都有一个置信度，置信图中每点的值与ground truth的距离相关。关于多个人的检测，是将K个人的置信图合并取该点每个人的最大值。这里使用最大而不是平均是因为即使峰值很近也不会影响精度。测试阶段使用非极大值抑制来获得身体部分的候选。

2，使用PAF进行身体部分组合

对于多个人的问题，检测了不同人的部分，但是还需要将每个人的身体分别组合在一起形成full-body，使用的方法就是论文的精华PAF。这个方法的好处在于将位置和方向信息都包含了。每一种limb（肢）在关联的两个body part之间都有一个亲和区域，其中的每一个像素都有一个2D 向量的描述方向。亲和区map的维度是w*h*2 (因为向量是二维的)。若某个点有多人重叠，则将k个人的vector求和，再除以人数。

3，bottom-up方法

在得到了置信图和PAF之后，需要考虑如何利用这些信息找到两两body-part最优化的连接方式，这转换为图论问题。论文使用的是Hungarian algorithm。图中的节点就是body part中的检测候选，边就是这些候选最优的连接方式。每条边上的权值就是亲和区的聚合。因此这样的匹配问题就是找到一组连接使

得没有两条边是共享一个节点的，也就是找到权值最大的边连接方式。

【项目实践】多人姿态估计实践（代码+权重=一键运行）（一）

1、姿态估计的简介

2、Realtime Multi-Person 2D Human Pose Estimation using Part Affinity Fields

2.1、模型结构

热门文章

最新文章

相关电子书

探索云世界

热门

云计算

大数据

云原生

人工智能

数据库

开发与运维

活动广场

任务中心

训练营

直播

乘风者计划

下载

镜像站

技术资料

【项目实践】多人姿态估计实践（代码+权重=一键运行）（一）

1、姿态估计的简介

2、Realtime Multi-Person 2D Human Pose Estimation using Part Affinity Fields

2.1、模型结构

热门文章

最新文章

相关电子书