Reproducing the ResNetV2 Model (ECCV 2016)

Overview: different experiments with the residual structure

Reference paper: Identity Mappings in Deep Residual Networks

Authors: ==Kaiming He==, Xiangyu Zhang, Shaoqing Ren, Jian Sun

This paper is a follow-up that improves on the classic ResNet work, Deep Residual Learning for Image Recognition.

In it, Kaiming He and his co-authors propose a new residual unit. The ResNet structure described in this paper is commonly referred to as ResNet-V2.

1. Residual Unit Optimization

[Figure 1 from the paper: the original vs. the proposed residual unit, with ResNet-1001 training curves on CIFAR-10]

The gray arrows indicate the paths along which information propagates most easily. Right: training curves of 1001-layer ResNets on CIFAR-10. Solid lines denote test error (right y-axis), dashed lines denote training loss (left y-axis). The new residual unit makes ResNet-1001 easier to train.

In the figure, Iterations is the number of training iterations and Test Error is the error rate on the test set.

(a) original is the residual structure of the original ResNet; (b) proposed is the new one. The main difference is that structure (a) applies the convolution first, then BN and the activation, and runs a final ReLU after the addition, whereas ==structure (b) applies BN and the activation before the convolution and moves the post-addition ReLU inside the residual branch.==

The authors compared the two structures on CIFAR-10 with a 1001-layer ResNet. As the figure shows, (b) proposed reaches a clearly lower test error of 4.92%, while (a) original reaches 7.61%.

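To make the ordering difference concrete, below is a minimal Keras sketch of the two basic residual units (two 3x3 convolutions, no downsampling). The function names are my own, and both assume the input already has filters channels so the shortcut can stay a pure identity; this only illustrates the layer ordering, not the paper's exact CIFAR configuration.

import tensorflow.keras.layers as layers

def original_unit(x, filters):
    # (a) original: conv -> BN -> ReLU -> conv -> BN, add, then a ReLU AFTER the addition
    shortcut = x
    y = layers.Conv2D(filters, 3, padding='same')(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation('relu')(y)
    y = layers.Conv2D(filters, 3, padding='same')(y)
    y = layers.BatchNormalization()(y)
    y = layers.Add()([shortcut, y])
    return layers.Activation('relu')(y)   # the activation sits on the main path

def pre_activation_unit(x, filters):
    # (b) proposed: BN -> ReLU -> conv, twice; nothing touches the output of the addition
    shortcut = x
    y = layers.BatchNormalization()(x)
    y = layers.Activation('relu')(y)
    y = layers.Conv2D(filters, 3, padding='same')(y)
    y = layers.BatchNormalization()(y)
    y = layers.Activation('relu')(y)
    y = layers.Conv2D(filters, 3, padding='same')(y)
    return layers.Add()([shortcut, y])    # the identity path is left completely clean
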
2. Different Shortcut Connection Variants

[Figure 2 from the paper]

Figure 2. Various types of shortcut connections used in Table 1. The gray arrows indicate the paths along which information propagates most easily. The shortcut connections in (b)-(f) are impeded by different components. To simplify the illustrations, the BN layers are not drawn; all units here adopt BN right after the weight layers.

Panels (a)-(f) are the authors' different experiments with the shortcut part of the residual structure; the results of these experiments are listed in the table below.

[Table 1 from the paper]

Table 1. Classification error on the CIFAR-10 test set using ResNet-110 [1], with different types of shortcut connections applied to all residual units. "Fail" is reported when the test error is above 20%.

The authors evaluated ResNet-110 with these different shortcut structures on CIFAR-10 and found that the original structure (a) works best; in other words, the identity-mapping shortcut is the best choice.

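Table 1's message is that anything obstructing the shortcut hurts optimization. As one hedged example (my own sketch, not code from the paper or the TensorFlow source), the constant-scaling variant of Figure 2(b) multiplies the identity path by a fixed constant such as 0.5, so stacking many such units attenuates the identity signal exponentially:

import tensorflow.keras.layers as layers

def constant_scaling_unit(x, filters, lam=0.5):
    # residual branch: the usual two 3x3 convolutions of a basic (post-activation) unit
    y = layers.Conv2D(filters, 3, padding='same')(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation('relu')(y)
    y = layers.Conv2D(filters, 3, padding='same')(y)
    y = layers.BatchNormalization()(y)
    # the shortcut is no longer identity: it is scaled by a constant lam,
    # so after N units the original signal is damped by roughly lam ** N
    shortcut = layers.Lambda(lambda t: lam * t)(x)
    y = layers.Add()([shortcut, y])
    return layers.Activation('relu')(y)
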
3. Experiments with Activation Placement

[Figure 4 from the paper]

Figure 4. Various usages of activation, studied in Table 2. All these units consist of the same components; only the order differs.

[Table 2 from the paper]

Table 2. Classification error (%) on the CIFAR-10 test set for the different usages of activation.

The best result comes from (e) full pre-activation, followed by (a) original. The (a) original residual structure is the one used in the original ResNet; the (e) full pre-activation residual structure is the ResNet-V2 structure introduced earlier.

4. Model Reproduction

My implementation follows the official TensorFlow source code, with some simplifications of my own: the original source handles many more cases, so here I just do a straightforward reproduction.

Note: ResNet50V2, ResNet101V2 and ResNet152V2 are built in exactly the same way; the only difference is the number of stacked Residual Blocks, as summarized below.

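For quick reference, the per-stage block counts (taken from the stack2 calls later in this post) are all that distinguish the three variants; the dictionary below is just my own summary, not part of the implementation:

# number of residual blocks stacked in stages conv2 .. conv5
BLOCKS_PER_STAGE = {
    'ResNet50V2':  [3, 4, 6, 3],
    'ResNet101V2': [3, 4, 23, 3],
    'ResNet152V2': [3, 8, 36, 3],
}
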
import tensorflow as tf
import tensorflow.keras.layers as layers
from tensorflow.keras.models import Model

4.1 Residual Block

"""
残差块
  Arguments:
      x: 输入张量
      filters: integer, filters of the bottleneck layer.
      kernel_size: 默认是3, kernel size of the bottleneck layer.
      stride: default 1, stride of the first layer.
      conv_shortcut: default False, use convolution shortcut if True,
        otherwise identity shortcut.
      name: string, block label.
  Returns:
    Output tensor for the residual block.
"""
def block2(x, filters, kernel_size=3, stride=1, conv_shortcut=False, name=None):
    preact = layers.BatchNormalization(name=name + '_preact_bn')(x)
    preact = layers.Activation('relu', name=name + '_preact_relu')(preact)

    if conv_shortcut:
        # projection shortcut: a 1x1 convolution maps the input to 4 * filters channels
        shortcut = layers.Conv2D(4 * filters, 1, strides=stride, name=name + '_0_conv')(preact)
    else:
        # identity shortcut; when stride > 1, subsample with a parameter-free 1x1 max pooling
        shortcut = layers.MaxPooling2D(1, strides=stride)(x) if stride > 1 else x

    # bottleneck main branch: 1x1 reduce -> 3x3 (carries the stride) -> 1x1 expand
    x = layers.Conv2D(filters, 1, strides=1, use_bias=False, name=name + '_1_conv')(preact)
    x = layers.BatchNormalization(name=name + '_1_bn')(x)
    x = layers.Activation('relu', name=name + '_1_relu')(x)

    x = layers.ZeroPadding2D(padding=((1, 1), (1, 1)), name=name + '_2_pad')(x)
    x = layers.Conv2D(filters,
                      kernel_size,
                      strides=stride,
                      use_bias=False,
                      name=name + '_2_conv')(x)
    x = layers.BatchNormalization(name=name + '_2_bn')(x)
    x = layers.Activation('relu', name=name + '_2_relu')(x)

    x = layers.Conv2D(4 * filters, 1, name=name + '_3_conv')(x)
    # no BN or ReLU after the addition: the identity path stays clean (the V2 design)
    x = layers.Add(name=name + '_out')([shortcut, x])
    return x
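
As a quick sanity check (my own example, not part of the original code), feeding a 56x56x64 tensor through a strided block with a convolution shortcut should halve the spatial size and expand the channels to 4 * filters:

inputs = layers.Input(shape=(56, 56, 64))
outputs = block2(inputs, 64, stride=2, conv_shortcut=True, name='demo')
print(outputs.shape)  # expected: (None, 28, 28, 256)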

I briefly compared this with the ResNet50 source code and found a small difference in how the shortcut is handled.

In ResNet50 it looks like this:

if conv_shortcut:
    shortcut = layers.Conv2D(
        4 * filters, 1, strides=stride, name=name + '_0_conv')(x)
    shortcut = layers.BatchNormalization(
        axis=bn_axis, epsilon=1.001e-5, name=name + '_0_bn')(shortcut)
else:
    shortcut = x

While in ResNet50V2 it is:

if conv_shortcut:
    shortcut = layers.Conv2D(
        4 * filters, 1, strides=stride, name=name + '_0_conv')(preact)
else:
    shortcut = layers.MaxPooling2D(1, strides=stride)(x) if stride > 1 else x

Besides this, the ResNet50 source code has no pre-activation (the preact in the code above): V2 builds its projection shortcut from preact and drops the BatchNormalization on the shortcut. The differences are easy to spot, but I still find the original ResNet50 code more elegant than the ResNet50V2 version.

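The MaxPooling2D(1, strides=stride) trick deserves a note: with a 1x1 pooling window it simply picks every stride-th element, i.e. it subsamples the identity shortcut without adding any parameters. A tiny standalone illustration (my own, not from the post):

import tensorflow as tf

x = tf.reshape(tf.range(16, dtype=tf.float32), (1, 4, 4, 1))   # a 4x4 feature map
y = tf.keras.layers.MaxPooling2D(pool_size=1, strides=2)(x)
print(tf.squeeze(y).numpy())
# [[ 0.  2.]
#  [ 8. 10.]]  (every other row/column of the input; no weights involved)
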
4.2 Stacking Residual Blocks

def stack2(x, filters, blocks, stride1=2, name=None):
    # the first block uses a projection shortcut to expand the channels to 4 * filters
    x = block2(x, filters, conv_shortcut=True, name=name + '_block1')
    for i in range(2, blocks):
        x = block2(x, filters, name=name + '_block' + str(i))
    # unlike V1, the downsampling (stride 2 by default) happens in the last block of the stack
    x = block2(x, filters, stride=stride1, name=name + '_block' + str(blocks))
    return x

4.3 Reproducing the ResNet50V2 Architecture

def ResNet50V2(include_top=True,  # whether to include the fully-connected layer at the top of the network
               preact=True,  # whether to use pre-activation
               use_bias=True,  # whether to use a bias in the convolution layers
               weights='imagenet',
               input_tensor=None,  # optional Keras tensor to use as the image input of the model
               input_shape=None,
               pooling=None,
               classes=1000,  # optional number of classes to classify images into
               classifier_activation='softmax'):  # activation function of the classification layer
    img_input = layers.Input(shape=input_shape)
    x = layers.ZeroPadding2D(padding=((3, 3), (3, 3)), name='conv1_pad')(img_input)
    x = layers.Conv2D(64, 7, strides=2, use_bias=use_bias, name='conv1_conv')(x)

    if not preact:
        x = layers.BatchNormalization(name='conv1_bn')(x)
        x = layers.Activation('relu', name='conv1_relu')(x)

    x = layers.ZeroPadding2D(padding=((1, 1), (1, 1)), name='pool1_pad')(x)
    x = layers.MaxPooling2D(3, strides=2, name='pool1_pool')(x)

    x = stack2(x, 64, 3, name='conv2')
    x = stack2(x, 128, 4, name='conv3')
    x = stack2(x, 256, 6, name='conv4')
    x = stack2(x, 512, 3, stride1=1, name='conv5')

    if preact:
        x = layers.BatchNormalization(name='post_bn')(x)
        x = layers.Activation('relu', name='post_relu')(x)
    if include_top:
        x = layers.GlobalAveragePooling2D(name='avg_pool')(x)
        x = layers.Dense(classes, activation=classifier_activation, name='predictions')(x)
    else:
        if pooling == 'avg':
            x = layers.GlobalAveragePooling2D(name='avg_pool')(x)
        elif pooling == 'max':
            x = layers.GlobalMaxPooling2D(name='max_pool')(x)

    model = Model(img_input, x)
    return model
if __name__ == '__main__':
    model = ResNet50V2(input_shape=(224, 224, 3))
    model.summary()
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 224, 224, 3) 0                                            
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, 230, 230, 3)  0           input_1[0][0]                    
__________________________________________________________________________________________________
conv1_conv (Conv2D)             (None, 112, 112, 64) 9472        conv1_pad[0][0]                  
__________________________________________________________________________________________________
pool1_pad (ZeroPadding2D)       (None, 114, 114, 64) 0           conv1_conv[0][0]                 
__________________________________________________________________________________________________
pool1_pool (MaxPooling2D)       (None, 56, 56, 64)   0           pool1_pad[0][0]                  
__________________________________________________________________________________________________
conv2_block1_preact_bn (BatchNo (None, 56, 56, 64)   256         pool1_pool[0][0]                 
__________________________________________________________________________________________________
conv2_block1_preact_relu (Activ (None, 56, 56, 64)   0           conv2_block1_preact_bn[0][0]     
__________________________________________________________________________________________________
conv2_block1_1_conv (Conv2D)    (None, 56, 56, 64)   4096        conv2_block1_preact_relu[0][0]   
__________________________________________________________________________________________________
conv2_block1_1_bn (BatchNormali (None, 56, 56, 64)   256         conv2_block1_1_conv[0][0]        
__________________________________________________________________________________________________
conv2_block1_1_relu (Activation (None, 56, 56, 64)   0           conv2_block1_1_bn[0][0]          
__________________________________________________________________________________________________
conv2_block1_2_pad (ZeroPadding (None, 58, 58, 64)   0           conv2_block1_1_relu[0][0]        
__________________________________________________________________________________________________
conv2_block1_2_conv (Conv2D)    (None, 56, 56, 64)   36864       conv2_block1_2_pad[0][0]         
__________________________________________________________________________________________________
conv2_block1_2_bn (BatchNormali (None, 56, 56, 64)   256         conv2_block1_2_conv[0][0]        
__________________________________________________________________________________________________
conv2_block1_2_relu (Activation (None, 56, 56, 64)   0           conv2_block1_2_bn[0][0]          
__________________________________________________________________________________________________
conv2_block1_0_conv (Conv2D)    (None, 56, 56, 256)  16640       conv2_block1_preact_relu[0][0]   
__________________________________________________________________________________________________
conv2_block1_3_conv (Conv2D)    (None, 56, 56, 256)  16640       conv2_block1_2_relu[0][0]        
__________________________________________________________________________________________________
conv2_block1_out (Add)          (None, 56, 56, 256)  0           conv2_block1_0_conv[0][0]        
                                                                 conv2_block1_3_conv[0][0]        
__________________________________________________________________________________________________
conv2_block2_preact_bn (BatchNo (None, 56, 56, 256)  1024        conv2_block1_out[0][0]           
__________________________________________________________________________________________________
conv2_block2_preact_relu (Activ (None, 56, 56, 256)  0           conv2_block2_preact_bn[0][0]     
__________________________________________________________________________________________________
conv2_block2_1_conv (Conv2D)    (None, 56, 56, 64)   16384       conv2_block2_preact_relu[0][0]   
__________________________________________________________________________________________________
conv2_block2_1_bn (BatchNormali (None, 56, 56, 64)   256         conv2_block2_1_conv[0][0]        
__________________________________________________________________________________________________
conv2_block2_1_relu (Activation (None, 56, 56, 64)   0           conv2_block2_1_bn[0][0]          
__________________________________________________________________________________________________
conv2_block2_2_pad (ZeroPadding (None, 58, 58, 64)   0           conv2_block2_1_relu[0][0]        
__________________________________________________________________________________________________
conv2_block2_2_conv (Conv2D)    (None, 56, 56, 64)   36864       conv2_block2_2_pad[0][0]         
__________________________________________________________________________________________________
conv2_block2_2_bn (BatchNormali (None, 56, 56, 64)   256         conv2_block2_2_conv[0][0]        
__________________________________________________________________________________________________
conv2_block2_2_relu (Activation (None, 56, 56, 64)   0           conv2_block2_2_bn[0][0]          
__________________________________________________________________________________________________
conv2_block2_3_conv (Conv2D)    (None, 56, 56, 256)  16640       conv2_block2_2_relu[0][0]        
__________________________________________________________________________________________________
conv2_block2_out (Add)          (None, 56, 56, 256)  0           conv2_block1_out[0][0]           
                                                                 conv2_block2_3_conv[0][0]        
__________________________________________________________________________________________________
conv2_block3_preact_bn (BatchNo (None, 56, 56, 256)  1024        conv2_block2_out[0][0]           
__________________________________________________________________________________________________
conv2_block3_preact_relu (Activ (None, 56, 56, 256)  0           conv2_block3_preact_bn[0][0]     
__________________________________________________________________________________________________
conv2_block3_1_conv (Conv2D)    (None, 56, 56, 64)   16384       conv2_block3_preact_relu[0][0]   
__________________________________________________________________________________________________
conv2_block3_1_bn (BatchNormali (None, 56, 56, 64)   256         conv2_block3_1_conv[0][0]        
__________________________________________________________________________________________________
conv2_block3_1_relu (Activation (None, 56, 56, 64)   0           conv2_block3_1_bn[0][0]          
__________________________________________________________________________________________________
conv2_block3_2_pad (ZeroPadding (None, 58, 58, 64)   0           conv2_block3_1_relu[0][0]        
__________________________________________________________________________________________________
conv2_block3_2_conv (Conv2D)    (None, 28, 28, 64)   36864       conv2_block3_2_pad[0][0]         
__________________________________________________________________________________________________
conv2_block3_2_bn (BatchNormali (None, 28, 28, 64)   256         conv2_block3_2_conv[0][0]        
__________________________________________________________________________________________________
conv2_block3_2_relu (Activation (None, 28, 28, 64)   0           conv2_block3_2_bn[0][0]          
__________________________________________________________________________________________________
max_pooling2d (MaxPooling2D)    (None, 28, 28, 256)  0           conv2_block2_out[0][0]           
__________________________________________________________________________________________________
conv2_block3_3_conv (Conv2D)    (None, 28, 28, 256)  16640       conv2_block3_2_relu[0][0]        
__________________________________________________________________________________________________
conv2_block3_out (Add)          (None, 28, 28, 256)  0           max_pooling2d[0][0]              
                                                                 conv2_block3_3_conv[0][0]        
__________________________________________________________________________________________________
conv3_block1_preact_bn (BatchNo (None, 28, 28, 256)  1024        conv2_block3_out[0][0]           
__________________________________________________________________________________________________
conv3_block1_preact_relu (Activ (None, 28, 28, 256)  0           conv3_block1_preact_bn[0][0]     
__________________________________________________________________________________________________
conv3_block1_1_conv (Conv2D)    (None, 28, 28, 128)  32768       conv3_block1_preact_relu[0][0]   
__________________________________________________________________________________________________
conv3_block1_1_bn (BatchNormali (None, 28, 28, 128)  512         conv3_block1_1_conv[0][0]        
__________________________________________________________________________________________________
conv3_block1_1_relu (Activation (None, 28, 28, 128)  0           conv3_block1_1_bn[0][0]          
__________________________________________________________________________________________________
conv3_block1_2_pad (ZeroPadding (None, 30, 30, 128)  0           conv3_block1_1_relu[0][0]        
__________________________________________________________________________________________________
conv3_block1_2_conv (Conv2D)    (None, 28, 28, 128)  147456      conv3_block1_2_pad[0][0]         
__________________________________________________________________________________________________
conv3_block1_2_bn (BatchNormali (None, 28, 28, 128)  512         conv3_block1_2_conv[0][0]        
__________________________________________________________________________________________________
conv3_block1_2_relu (Activation (None, 28, 28, 128)  0           conv3_block1_2_bn[0][0]          
__________________________________________________________________________________________________
conv3_block1_0_conv (Conv2D)    (None, 28, 28, 512)  131584      conv3_block1_preact_relu[0][0]   
__________________________________________________________________________________________________
conv3_block1_3_conv (Conv2D)    (None, 28, 28, 512)  66048       conv3_block1_2_relu[0][0]        
__________________________________________________________________________________________________
conv3_block1_out (Add)          (None, 28, 28, 512)  0           conv3_block1_0_conv[0][0]        
                                                                 conv3_block1_3_conv[0][0]        
__________________________________________________________________________________________________
conv3_block2_preact_bn (BatchNo (None, 28, 28, 512)  2048        conv3_block1_out[0][0]           
__________________________________________________________________________________________________
conv3_block2_preact_relu (Activ (None, 28, 28, 512)  0           conv3_block2_preact_bn[0][0]     
__________________________________________________________________________________________________
conv3_block2_1_conv (Conv2D)    (None, 28, 28, 128)  65536       conv3_block2_preact_relu[0][0]   
__________________________________________________________________________________________________
conv3_block2_1_bn (BatchNormali (None, 28, 28, 128)  512         conv3_block2_1_conv[0][0]        
__________________________________________________________________________________________________
conv3_block2_1_relu (Activation (None, 28, 28, 128)  0           conv3_block2_1_bn[0][0]          
__________________________________________________________________________________________________
conv3_block2_2_pad (ZeroPadding (None, 30, 30, 128)  0           conv3_block2_1_relu[0][0]        
__________________________________________________________________________________________________
conv3_block2_2_conv (Conv2D)    (None, 28, 28, 128)  147456      conv3_block2_2_pad[0][0]         
__________________________________________________________________________________________________
conv3_block2_2_bn (BatchNormali (None, 28, 28, 128)  512         conv3_block2_2_conv[0][0]        
__________________________________________________________________________________________________
conv3_block2_2_relu (Activation (None, 28, 28, 128)  0           conv3_block2_2_bn[0][0]          
__________________________________________________________________________________________________
conv3_block2_3_conv (Conv2D)    (None, 28, 28, 512)  66048       conv3_block2_2_relu[0][0]        
__________________________________________________________________________________________________
conv3_block2_out (Add)          (None, 28, 28, 512)  0           conv3_block1_out[0][0]           
                                                                 conv3_block2_3_conv[0][0]        
__________________________________________________________________________________________________
conv3_block3_preact_bn (BatchNo (None, 28, 28, 512)  2048        conv3_block2_out[0][0]           
__________________________________________________________________________________________________
conv3_block3_preact_relu (Activ (None, 28, 28, 512)  0           conv3_block3_preact_bn[0][0]     
__________________________________________________________________________________________________
conv3_block3_1_conv (Conv2D)    (None, 28, 28, 128)  65536       conv3_block3_preact_relu[0][0]   
__________________________________________________________________________________________________
conv3_block3_1_bn (BatchNormali (None, 28, 28, 128)  512         conv3_block3_1_conv[0][0]        
__________________________________________________________________________________________________
conv3_block3_1_relu (Activation (None, 28, 28, 128)  0           conv3_block3_1_bn[0][0]          
__________________________________________________________________________________________________
conv3_block3_2_pad (ZeroPadding (None, 30, 30, 128)  0           conv3_block3_1_relu[0][0]        
__________________________________________________________________________________________________
conv3_block3_2_conv (Conv2D)    (None, 28, 28, 128)  147456      conv3_block3_2_pad[0][0]         
__________________________________________________________________________________________________
conv3_block3_2_bn (BatchNormali (None, 28, 28, 128)  512         conv3_block3_2_conv[0][0]        
__________________________________________________________________________________________________
conv3_block3_2_relu (Activation (None, 28, 28, 128)  0           conv3_block3_2_bn[0][0]          
__________________________________________________________________________________________________
conv3_block3_3_conv (Conv2D)    (None, 28, 28, 512)  66048       conv3_block3_2_relu[0][0]        
__________________________________________________________________________________________________
conv3_block3_out (Add)          (None, 28, 28, 512)  0           conv3_block2_out[0][0]           
                                                                 conv3_block3_3_conv[0][0]        
__________________________________________________________________________________________________
conv3_block4_preact_bn (BatchNo (None, 28, 28, 512)  2048        conv3_block3_out[0][0]           
__________________________________________________________________________________________________
conv3_block4_preact_relu (Activ (None, 28, 28, 512)  0           conv3_block4_preact_bn[0][0]     
__________________________________________________________________________________________________
conv3_block4_1_conv (Conv2D)    (None, 28, 28, 128)  65536       conv3_block4_preact_relu[0][0]   
__________________________________________________________________________________________________
conv3_block4_1_bn (BatchNormali (None, 28, 28, 128)  512         conv3_block4_1_conv[0][0]        
__________________________________________________________________________________________________
conv3_block4_1_relu (Activation (None, 28, 28, 128)  0           conv3_block4_1_bn[0][0]          
__________________________________________________________________________________________________
conv3_block4_2_pad (ZeroPadding (None, 30, 30, 128)  0           conv3_block4_1_relu[0][0]        
__________________________________________________________________________________________________
conv3_block4_2_conv (Conv2D)    (None, 14, 14, 128)  147456      conv3_block4_2_pad[0][0]         
__________________________________________________________________________________________________
conv3_block4_2_bn (BatchNormali (None, 14, 14, 128)  512         conv3_block4_2_conv[0][0]        
__________________________________________________________________________________________________
conv3_block4_2_relu (Activation (None, 14, 14, 128)  0           conv3_block4_2_bn[0][0]          
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)  (None, 14, 14, 512)  0           conv3_block3_out[0][0]           
__________________________________________________________________________________________________
conv3_block4_3_conv (Conv2D)    (None, 14, 14, 512)  66048       conv3_block4_2_relu[0][0]        
__________________________________________________________________________________________________
conv3_block4_out (Add)          (None, 14, 14, 512)  0           max_pooling2d_1[0][0]            
                                                                 conv3_block4_3_conv[0][0]        
__________________________________________________________________________________________________
conv4_block1_preact_bn (BatchNo (None, 14, 14, 512)  2048        conv3_block4_out[0][0]           
__________________________________________________________________________________________________
conv4_block1_preact_relu (Activ (None, 14, 14, 512)  0           conv4_block1_preact_bn[0][0]     
__________________________________________________________________________________________________
conv4_block1_1_conv (Conv2D)    (None, 14, 14, 256)  131072      conv4_block1_preact_relu[0][0]   
__________________________________________________________________________________________________
conv4_block1_1_bn (BatchNormali (None, 14, 14, 256)  1024        conv4_block1_1_conv[0][0]        
__________________________________________________________________________________________________
conv4_block1_1_relu (Activation (None, 14, 14, 256)  0           conv4_block1_1_bn[0][0]          
__________________________________________________________________________________________________
conv4_block1_2_pad (ZeroPadding (None, 16, 16, 256)  0           conv4_block1_1_relu[0][0]        
__________________________________________________________________________________________________
conv4_block1_2_conv (Conv2D)    (None, 14, 14, 256)  589824      conv4_block1_2_pad[0][0]         
__________________________________________________________________________________________________
conv4_block1_2_bn (BatchNormali (None, 14, 14, 256)  1024        conv4_block1_2_conv[0][0]        
__________________________________________________________________________________________________
conv4_block1_2_relu (Activation (None, 14, 14, 256)  0           conv4_block1_2_bn[0][0]          
__________________________________________________________________________________________________
conv4_block1_0_conv (Conv2D)    (None, 14, 14, 1024) 525312      conv4_block1_preact_relu[0][0]   
__________________________________________________________________________________________________
conv4_block1_3_conv (Conv2D)    (None, 14, 14, 1024) 263168      conv4_block1_2_relu[0][0]        
__________________________________________________________________________________________________
conv4_block1_out (Add)          (None, 14, 14, 1024) 0           conv4_block1_0_conv[0][0]        
                                                                 conv4_block1_3_conv[0][0]        
__________________________________________________________________________________________________
conv4_block2_preact_bn (BatchNo (None, 14, 14, 1024) 4096        conv4_block1_out[0][0]           
__________________________________________________________________________________________________
conv4_block2_preact_relu (Activ (None, 14, 14, 1024) 0           conv4_block2_preact_bn[0][0]     
__________________________________________________________________________________________________
conv4_block2_1_conv (Conv2D)    (None, 14, 14, 256)  262144      conv4_block2_preact_relu[0][0]   
__________________________________________________________________________________________________
conv4_block2_1_bn (BatchNormali (None, 14, 14, 256)  1024        conv4_block2_1_conv[0][0]        
__________________________________________________________________________________________________
conv4_block2_1_relu (Activation (None, 14, 14, 256)  0           conv4_block2_1_bn[0][0]          
__________________________________________________________________________________________________
conv4_block2_2_pad (ZeroPadding (None, 16, 16, 256)  0           conv4_block2_1_relu[0][0]        
__________________________________________________________________________________________________
conv4_block2_2_conv (Conv2D)    (None, 14, 14, 256)  589824      conv4_block2_2_pad[0][0]         
__________________________________________________________________________________________________
conv4_block2_2_bn (BatchNormali (None, 14, 14, 256)  1024        conv4_block2_2_conv[0][0]        
__________________________________________________________________________________________________
conv4_block2_2_relu (Activation (None, 14, 14, 256)  0           conv4_block2_2_bn[0][0]          
__________________________________________________________________________________________________
conv4_block2_3_conv (Conv2D)    (None, 14, 14, 1024) 263168      conv4_block2_2_relu[0][0]        
__________________________________________________________________________________________________
conv4_block2_out (Add)          (None, 14, 14, 1024) 0           conv4_block1_out[0][0]           
                                                                 conv4_block2_3_conv[0][0]        
__________________________________________________________________________________________________
conv4_block3_preact_bn (BatchNo (None, 14, 14, 1024) 4096        conv4_block2_out[0][0]           
__________________________________________________________________________________________________
conv4_block3_preact_relu (Activ (None, 14, 14, 1024) 0           conv4_block3_preact_bn[0][0]     
__________________________________________________________________________________________________
conv4_block3_1_conv (Conv2D)    (None, 14, 14, 256)  262144      conv4_block3_preact_relu[0][0]   
__________________________________________________________________________________________________
conv4_block3_1_bn (BatchNormali (None, 14, 14, 256)  1024        conv4_block3_1_conv[0][0]        
__________________________________________________________________________________________________
conv4_block3_1_relu (Activation (None, 14, 14, 256)  0           conv4_block3_1_bn[0][0]          
__________________________________________________________________________________________________
conv4_block3_2_pad (ZeroPadding (None, 16, 16, 256)  0           conv4_block3_1_relu[0][0]        
__________________________________________________________________________________________________
conv4_block3_2_conv (Conv2D)    (None, 14, 14, 256)  589824      conv4_block3_2_pad[0][0]         
__________________________________________________________________________________________________
conv4_block3_2_bn (BatchNormali (None, 14, 14, 256)  1024        conv4_block3_2_conv[0][0]        
__________________________________________________________________________________________________
conv4_block3_2_relu (Activation (None, 14, 14, 256)  0           conv4_block3_2_bn[0][0]          
__________________________________________________________________________________________________
conv4_block3_3_conv (Conv2D)    (None, 14, 14, 1024) 263168      conv4_block3_2_relu[0][0]        
__________________________________________________________________________________________________
conv4_block3_out (Add)          (None, 14, 14, 1024) 0           conv4_block2_out[0][0]           
                                                                 conv4_block3_3_conv[0][0]        
__________________________________________________________________________________________________
conv4_block4_preact_bn (BatchNo (None, 14, 14, 1024) 4096        conv4_block3_out[0][0]           
__________________________________________________________________________________________________
conv4_block4_preact_relu (Activ (None, 14, 14, 1024) 0           conv4_block4_preact_bn[0][0]     
__________________________________________________________________________________________________
conv4_block4_1_conv (Conv2D)    (None, 14, 14, 256)  262144      conv4_block4_preact_relu[0][0]   
__________________________________________________________________________________________________
conv4_block4_1_bn (BatchNormali (None, 14, 14, 256)  1024        conv4_block4_1_conv[0][0]        
__________________________________________________________________________________________________
conv4_block4_1_relu (Activation (None, 14, 14, 256)  0           conv4_block4_1_bn[0][0]          
__________________________________________________________________________________________________
conv4_block4_2_pad (ZeroPadding (None, 16, 16, 256)  0           conv4_block4_1_relu[0][0]        
__________________________________________________________________________________________________
conv4_block4_2_conv (Conv2D)    (None, 14, 14, 256)  589824      conv4_block4_2_pad[0][0]         
__________________________________________________________________________________________________
conv4_block4_2_bn (BatchNormali (None, 14, 14, 256)  1024        conv4_block4_2_conv[0][0]        
__________________________________________________________________________________________________
conv4_block4_2_relu (Activation (None, 14, 14, 256)  0           conv4_block4_2_bn[0][0]          
__________________________________________________________________________________________________
conv4_block4_3_conv (Conv2D)    (None, 14, 14, 1024) 263168      conv4_block4_2_relu[0][0]        
__________________________________________________________________________________________________
conv4_block4_out (Add)          (None, 14, 14, 1024) 0           conv4_block3_out[0][0]           
                                                                 conv4_block4_3_conv[0][0]        
__________________________________________________________________________________________________
conv4_block5_preact_bn (BatchNo (None, 14, 14, 1024) 4096        conv4_block4_out[0][0]           
__________________________________________________________________________________________________
conv4_block5_preact_relu (Activ (None, 14, 14, 1024) 0           conv4_block5_preact_bn[0][0]     
__________________________________________________________________________________________________
conv4_block5_1_conv (Conv2D)    (None, 14, 14, 256)  262144      conv4_block5_preact_relu[0][0]   
__________________________________________________________________________________________________
conv4_block5_1_bn (BatchNormali (None, 14, 14, 256)  1024        conv4_block5_1_conv[0][0]        
__________________________________________________________________________________________________
conv4_block5_1_relu (Activation (None, 14, 14, 256)  0           conv4_block5_1_bn[0][0]          
__________________________________________________________________________________________________
conv4_block5_2_pad (ZeroPadding (None, 16, 16, 256)  0           conv4_block5_1_relu[0][0]        
__________________________________________________________________________________________________
conv4_block5_2_conv (Conv2D)    (None, 14, 14, 256)  589824      conv4_block5_2_pad[0][0]         
__________________________________________________________________________________________________
conv4_block5_2_bn (BatchNormali (None, 14, 14, 256)  1024        conv4_block5_2_conv[0][0]        
__________________________________________________________________________________________________
conv4_block5_2_relu (Activation (None, 14, 14, 256)  0           conv4_block5_2_bn[0][0]          
__________________________________________________________________________________________________
conv4_block5_3_conv (Conv2D)    (None, 14, 14, 1024) 263168      conv4_block5_2_relu[0][0]        
__________________________________________________________________________________________________
conv4_block5_out (Add)          (None, 14, 14, 1024) 0           conv4_block4_out[0][0]           
                                                                 conv4_block5_3_conv[0][0]        
__________________________________________________________________________________________________
conv4_block6_preact_bn (BatchNo (None, 14, 14, 1024) 4096        conv4_block5_out[0][0]           
__________________________________________________________________________________________________
conv4_block6_preact_relu (Activ (None, 14, 14, 1024) 0           conv4_block6_preact_bn[0][0]     
__________________________________________________________________________________________________
conv4_block6_1_conv (Conv2D)    (None, 14, 14, 256)  262144      conv4_block6_preact_relu[0][0]   
__________________________________________________________________________________________________
conv4_block6_1_bn (BatchNormali (None, 14, 14, 256)  1024        conv4_block6_1_conv[0][0]        
__________________________________________________________________________________________________
conv4_block6_1_relu (Activation (None, 14, 14, 256)  0           conv4_block6_1_bn[0][0]          
__________________________________________________________________________________________________
conv4_block6_2_pad (ZeroPadding (None, 16, 16, 256)  0           conv4_block6_1_relu[0][0]        
__________________________________________________________________________________________________
conv4_block6_2_conv (Conv2D)    (None, 7, 7, 256)    589824      conv4_block6_2_pad[0][0]         
__________________________________________________________________________________________________
conv4_block6_2_bn (BatchNormali (None, 7, 7, 256)    1024        conv4_block6_2_conv[0][0]        
__________________________________________________________________________________________________
conv4_block6_2_relu (Activation (None, 7, 7, 256)    0           conv4_block6_2_bn[0][0]          
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D)  (None, 7, 7, 1024)   0           conv4_block5_out[0][0]           
__________________________________________________________________________________________________
conv4_block6_3_conv (Conv2D)    (None, 7, 7, 1024)   263168      conv4_block6_2_relu[0][0]        
__________________________________________________________________________________________________
conv4_block6_out (Add)          (None, 7, 7, 1024)   0           max_pooling2d_2[0][0]            
                                                                 conv4_block6_3_conv[0][0]        
__________________________________________________________________________________________________
conv5_block1_preact_bn (BatchNo (None, 7, 7, 1024)   4096        conv4_block6_out[0][0]           
__________________________________________________________________________________________________
conv5_block1_preact_relu (Activ (None, 7, 7, 1024)   0           conv5_block1_preact_bn[0][0]     
__________________________________________________________________________________________________
conv5_block1_1_conv (Conv2D)    (None, 7, 7, 512)    524288      conv5_block1_preact_relu[0][0]   
__________________________________________________________________________________________________
conv5_block1_1_bn (BatchNormali (None, 7, 7, 512)    2048        conv5_block1_1_conv[0][0]        
__________________________________________________________________________________________________
conv5_block1_1_relu (Activation (None, 7, 7, 512)    0           conv5_block1_1_bn[0][0]          
__________________________________________________________________________________________________
conv5_block1_2_pad (ZeroPadding (None, 9, 9, 512)    0           conv5_block1_1_relu[0][0]        
__________________________________________________________________________________________________
conv5_block1_2_conv (Conv2D)    (None, 7, 7, 512)    2359296     conv5_block1_2_pad[0][0]         
__________________________________________________________________________________________________
conv5_block1_2_bn (BatchNormali (None, 7, 7, 512)    2048        conv5_block1_2_conv[0][0]        
__________________________________________________________________________________________________
conv5_block1_2_relu (Activation (None, 7, 7, 512)    0           conv5_block1_2_bn[0][0]          
__________________________________________________________________________________________________
conv5_block1_0_conv (Conv2D)    (None, 7, 7, 2048)   2099200     conv5_block1_preact_relu[0][0]   
__________________________________________________________________________________________________
conv5_block1_3_conv (Conv2D)    (None, 7, 7, 2048)   1050624     conv5_block1_2_relu[0][0]        
__________________________________________________________________________________________________
conv5_block1_out (Add)          (None, 7, 7, 2048)   0           conv5_block1_0_conv[0][0]        
                                                                 conv5_block1_3_conv[0][0]        
__________________________________________________________________________________________________
conv5_block2_preact_bn (BatchNo (None, 7, 7, 2048)   8192        conv5_block1_out[0][0]           
__________________________________________________________________________________________________
conv5_block2_preact_relu (Activ (None, 7, 7, 2048)   0           conv5_block2_preact_bn[0][0]     
__________________________________________________________________________________________________
conv5_block2_1_conv (Conv2D)    (None, 7, 7, 512)    1048576     conv5_block2_preact_relu[0][0]   
__________________________________________________________________________________________________
conv5_block2_1_bn (BatchNormali (None, 7, 7, 512)    2048        conv5_block2_1_conv[0][0]        
__________________________________________________________________________________________________
conv5_block2_1_relu (Activation (None, 7, 7, 512)    0           conv5_block2_1_bn[0][0]          
__________________________________________________________________________________________________
conv5_block2_2_pad (ZeroPadding (None, 9, 9, 512)    0           conv5_block2_1_relu[0][0]        
__________________________________________________________________________________________________
conv5_block2_2_conv (Conv2D)    (None, 7, 7, 512)    2359296     conv5_block2_2_pad[0][0]         
__________________________________________________________________________________________________
conv5_block2_2_bn (BatchNormali (None, 7, 7, 512)    2048        conv5_block2_2_conv[0][0]        
__________________________________________________________________________________________________
conv5_block2_2_relu (Activation (None, 7, 7, 512)    0           conv5_block2_2_bn[0][0]          
__________________________________________________________________________________________________
conv5_block2_3_conv (Conv2D)    (None, 7, 7, 2048)   1050624     conv5_block2_2_relu[0][0]        
__________________________________________________________________________________________________
conv5_block2_out (Add)          (None, 7, 7, 2048)   0           conv5_block1_out[0][0]           
                                                                 conv5_block2_3_conv[0][0]        
__________________________________________________________________________________________________
conv5_block3_preact_bn (BatchNo (None, 7, 7, 2048)   8192        conv5_block2_out[0][0]           
__________________________________________________________________________________________________
conv5_block3_preact_relu (Activ (None, 7, 7, 2048)   0           conv5_block3_preact_bn[0][0]     
__________________________________________________________________________________________________
conv5_block3_1_conv (Conv2D)    (None, 7, 7, 512)    1048576     conv5_block3_preact_relu[0][0]   
__________________________________________________________________________________________________
conv5_block3_1_bn (BatchNormali (None, 7, 7, 512)    2048        conv5_block3_1_conv[0][0]        
__________________________________________________________________________________________________
conv5_block3_1_relu (Activation (None, 7, 7, 512)    0           conv5_block3_1_bn[0][0]          
__________________________________________________________________________________________________
conv5_block3_2_pad (ZeroPadding (None, 9, 9, 512)    0           conv5_block3_1_relu[0][0]        
__________________________________________________________________________________________________
conv5_block3_2_conv (Conv2D)    (None, 7, 7, 512)    2359296     conv5_block3_2_pad[0][0]         
__________________________________________________________________________________________________
conv5_block3_2_bn (BatchNormali (None, 7, 7, 512)    2048        conv5_block3_2_conv[0][0]        
__________________________________________________________________________________________________
conv5_block3_2_relu (Activation (None, 7, 7, 512)    0           conv5_block3_2_bn[0][0]          
__________________________________________________________________________________________________
conv5_block3_3_conv (Conv2D)    (None, 7, 7, 2048)   1050624     conv5_block3_2_relu[0][0]        
__________________________________________________________________________________________________
conv5_block3_out (Add)          (None, 7, 7, 2048)   0           conv5_block2_out[0][0]           
                                                                 conv5_block3_3_conv[0][0]        
__________________________________________________________________________________________________
post_bn (BatchNormalization)    (None, 7, 7, 2048)   8192        conv5_block3_out[0][0]           
__________________________________________________________________________________________________
post_relu (Activation)          (None, 7, 7, 2048)   0           post_bn[0][0]                    
__________________________________________________________________________________________________
avg_pool (GlobalAveragePooling2 (None, 2048)         0           post_relu[0][0]                  
__________________________________________________________________________________________________
predictions (Dense)             (None, 1000)         2049000     avg_pool[0][0]                   
==================================================================================================
Total params: 25,613,800
Trainable params: 25,568,360
Non-trainable params: 45,440

4.4 Reproducing the ResNet101V2 Architecture

It is built in exactly the same way as above, so there is really no need to write it out again, but I include it here for comparison.

def ResNet101V2(include_top=True,  # whether to include the fully-connected layer at the top of the network
                preact=True,  # whether to use pre-activation
                use_bias=True,  # whether to use a bias in the convolution layers
                weights='imagenet',
                input_tensor=None,  # optional Keras tensor to use as the image input of the model
                input_shape=None,
                pooling=None,
                classes=1000,  # optional number of classes to classify images into
                classifier_activation='softmax'):  # activation function of the classification layer
    img_input = layers.Input(shape=input_shape)
    x = layers.ZeroPadding2D(padding=((3, 3), (3, 3)), name='conv1_pad')(img_input)
    x = layers.Conv2D(64, 7, strides=2, use_bias=use_bias, name='conv1_conv')(x)

    if not preact:
        x = layers.BatchNormalization(name='conv1_bn')(x)
        x = layers.Activation('relu', name='conv1_relu')(x)

    x = layers.ZeroPadding2D(padding=((1, 1), (1, 1)), name='pool1_pad')(x)
    x = layers.MaxPooling2D(3, strides=2, name='pool1_pool')(x)

    x = stack2(x, 64, 3, name='conv2')
    x = stack2(x, 128, 4, name='conv3')
    x = stack2(x, 256, 23, name='conv4')
    x = stack2(x, 512, 3, stride1=1, name='conv5')

    if preact:
        x = layers.BatchNormalization(name='post_bn')(x)
        x = layers.Activation('relu', name='post_relu')(x)
    if include_top:
        x = layers.GlobalAveragePooling2D(name='avg_pool')(x)
        x = layers.Dense(classes, activation=classifier_activation, name='predictions')(x)
    else:
        if pooling == 'avg':
            x = layers.GlobalAveragePooling2D(name='avg_pool')(x)
        elif pooling == 'max':
            x = layers.GlobalMaxPooling2D(name='max_pool')(x)

    model = Model(img_input, x)
    return model

4.5 Reproducing the ResNet152V2 Model

Only the block-stacking part of the code needs to be changed, as follows; a parameterized builder covering all three variants is sketched after the snippet.

    x = stack2(x, 64, 3, name='conv2')
    x = stack2(x, 128, 8, name='conv3')
    x = stack2(x, 256, 36, name='conv4')
    x = stack2(x, 512, 3, stride1=1, name='conv5')
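
Since all three networks share the same skeleton, a small parameterized builder makes the relationship explicit. This wrapper is my own convenience sketch (it reuses block2 and stack2 defined above and mirrors the include_top=True, preact=True path), not part of the official Keras API:

def build_resnet_v2(blocks_per_stage, input_shape=(224, 224, 3), classes=1000):
    img_input = layers.Input(shape=input_shape)
    x = layers.ZeroPadding2D(padding=((3, 3), (3, 3)), name='conv1_pad')(img_input)
    x = layers.Conv2D(64, 7, strides=2, name='conv1_conv')(x)
    x = layers.ZeroPadding2D(padding=((1, 1), (1, 1)), name='pool1_pad')(x)
    x = layers.MaxPooling2D(3, strides=2, name='pool1_pool')(x)
    for i, (filters, blocks) in enumerate(zip([64, 128, 256, 512], blocks_per_stage)):
        stride1 = 1 if i == len(blocks_per_stage) - 1 else 2   # no downsampling after conv5
        x = stack2(x, filters, blocks, stride1=stride1, name='conv' + str(i + 2))
    x = layers.BatchNormalization(name='post_bn')(x)
    x = layers.Activation('relu', name='post_relu')(x)
    x = layers.GlobalAveragePooling2D(name='avg_pool')(x)
    x = layers.Dense(classes, activation='softmax', name='predictions')(x)
    return Model(img_input, x)

# e.g. build_resnet_v2([3, 8, 36, 3]) corresponds to the ResNet152V2 configuration above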

4.6 Full ResNet50V2 Model Structure Diagram

[ResNet50V2 model structure diagram]

4.7 Full ResNet101V2 Model Structure Diagram

[ResNet101V2 model structure diagram]

4.8 Full ResNet152V2 Model Structure Diagram

References

Identity Mappings in Deep Residual Networks

Deep Residual Learning for Image Recognition

https://github.com/tensorflow/tensorflow/blob/b36436b087bd8e8701ef51718179037cccdfc26e/tensorflow/python/keras/applications/resnet.py#L293
