Structure diagram of the yolov7 backbone (training phase)

Summary: notes

After analyzing the yaml file and exporting the training-phase yolov7 model to ONNX for visualization, I drew the backbone and SPPCSPC structure diagrams in Visio; the remaining parts will be drawn in detail later when I have time. [Note the emphasis on the training phase, not the inference phase: the inference-phase graph is different because it uses the re-parameterized structure. See my other article on the RepVGG-style re-parameterization.]


The backbone structure is drawn from yolov7-main\cfg\training\yolov7.yaml in the project. The comments below give the output-channel list obtained after each iteration of the model-parsing for loop (a short sketch of this channel bookkeeping follows the yaml block).

backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [32, 3, 1]],  # 0 conv1(3,32,3,s=1)
   [-1, 1, Conv, [64, 3, 2]],  # 1-P1/2 conv2(32,64,3,s=2)
   [-1, 1, Conv, [64, 3, 1]],  # conv3(64,64,3,1)
   [-1, 1, Conv, [128, 3, 2]],  # 3-P2/4  conv4(64,128,k=3,s=2)
   [-1, 1, Conv, [64, 1, 1]], # conv5(128,64,1,1) out_channels so far = [32,64,64,128,64]*
   [-2, 1, Conv, [64, 1, 1]], # conv6(128,64,1,1)* this conv is the 1*1 identity branch
   [-1, 1, Conv, [64, 3, 1]], # conv7(64,64,3,1)
   [-1, 1, Conv, [64, 3, 1]], # conv8(64,64,3,1)*
   [-1, 1, Conv, [64, 3, 1]], # conv9(64,64,3,1)
   [-1, 1, Conv, [64, 3, 1]], # conv10(64,64,3,1) output_channels so far = [32,64,64,128,64,64,64,64,64,64]*
   [[-1, -3, -5, -6], 1, Concat, [1]], # take the four 64-channel outputs, 256 channels after concat; output_channels = [32,64,64,128,64,64,64,64,64,64,256]
   [-1, 1, Conv, [256, 1, 1]],  # 11 conv11(256,256,1,1) output_channels = [32,64,64,128,64,64,64,64,64,64,256,256]
   [-1, 1, MP, []], # maxpooling(k=2,s=2), channel count stays 256
   [-1, 1, Conv, [128, 1, 1]], # conv12(256,128,1,1) output_channels = [32,64,64,128,64,64,64,64,64,64,256,256,256,128]*
   [-3, 1, Conv, [128, 1, 1]], # conv13(256,128,1,1) [32,64,64,128,64,64,64,64,64,64,256,256,256,128,128]
   [-1, 1, Conv, [128, 3, 2]], # conv14(128,128,3,2) [32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128]*
   [[-1, -3], 1, Concat, [1]],  # 16-P3/8  output 256 [32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256]
   [-1, 1, Conv, [128, 1, 1]], # conv15(256,128,1,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128]*
   [-2, 1, Conv, [128, 1, 1]], # conv16(256,128,1,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128]*
   [-1, 1, Conv, [128, 3, 1]], # conv17(128,128,3,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128]
   [-1, 1, Conv, [128, 3, 1]], # conv18(128,128,3,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128]*
   [-1, 1, Conv, [128, 3, 1]], # conv19(128,128,3,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128]
   [-1, 1, Conv, [128, 3, 1]], # conv20(128,128,3,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128]*
   [[-1, -3, -5, -6], 1, Concat, [1]],# 512 [32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512]
   [-1, 1, Conv, [512, 1, 1]],  # 24 conv21(512,512,1,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512]
   [-1, 1, MP, []],# [32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512]
   [-1, 1, Conv, [256, 1, 1]],# conv22(512,256,1,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256]*
   [-3, 1, Conv, [256, 1, 1]], # conv23(512,256,1,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256]
   [-1, 1, Conv, [256, 3, 2]], # conv24(256,256,3,2)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256]*
   [[-1, -3], 1, Concat, [1]],  # 29-P4/16 512[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512]
   [-1, 1, Conv, [256, 1, 1]],# conv25(512,256,1,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256]*
   [-2, 1, Conv, [256, 1, 1]],# conv26(512,256,1,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256]*
   [-1, 1, Conv, [256, 3, 1]],# conv27(256,256,3,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256,256]
   [-1, 1, Conv, [256, 3, 1]],# conv28(256,256,3,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256,256,256]*
   [-1, 1, Conv, [256, 3, 1]],# conv29(256,256,3,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256,256,256,256]
   [-1, 1, Conv, [256, 3, 1]],# conv30(256,256,3,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256,256,256,256,256]*
   [[-1, -3, -5, -6], 1, Concat, [1]],# 1024[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256,256,256,256,256,1024]
   [-1, 1, Conv, [1024, 1, 1]],  # 37 conv31(1024,1024,1,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256,256,256,256,256,1024,1024]
   [-1, 1, MP, []],# 1024 maxpooling(2,2)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256,256,256,256,256,1024,1024,1024]
   [-1, 1, Conv, [512, 1, 1]],# conv32(1024,512,1,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256,256,256,256,256,1024,1024,1024,512]*
   [-3, 1, Conv, [512, 1, 1]],# conv33(1024,512,1,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256,256,256,256,256,1024,1024,1024,512,512]
   [-1, 1, Conv, [512, 3, 2]],# conv34(512,512,3,2)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256,256,256,256,256,1024,1024,1024,512,512,512]*
   [[-1, -3], 1, Concat, [1]],  # 42-P5/32 1024 [32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256,256,256,256,256,1024,1024,1024,512,512,512,1024]
   [-1, 1, Conv, [256, 1, 1]],# conv35(1024,256,1,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256,256,256,256,256,1024,1024,1024,512,512,512,1024,256]*
   [-2, 1, Conv, [256, 1, 1]],# conv36(1024,256,1,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256,256,256,256,256,1024,1024,1024,512,512,512,1024,256,256]*
   [-1, 1, Conv, [256, 3, 1]],# conv37(256,256,3,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256,256,256,256,256,1024,1024,1024,512,512,512,1024,256,256,256]
   [-1, 1, Conv, [256, 3, 1]],# conv38(256,256,3,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256,256,256,256,256,1024,1024,1024,512,512,512,1024,256,256,256,256]*
   [-1, 1, Conv, [256, 3, 1]],# conv39(256,256,3,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256,256,256,256,256,1024,1024,1024,512,512,512,1024,256,256,256,256,256]
   [-1, 1, Conv, [256, 3, 1]],# conv40(256,256,3,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256,256,256,256,256,1024,1024,1024,512,512,512,1024,256,256,256,256,256,256]*
   [[-1, -3, -5, -6], 1, Concat, [1]],# 1024[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256,256,256,256,256,1024,1024,1024,512,512,512,1024,256,256,256,256,256,256,1024]
   [-1, 1, Conv, [1024, 1, 1]],  # 50 conv41(1024,1024,1,1)[32,64,64,128,64,64,64,64,64,64,256,256,256,128,128,128,256,128,128,128,128,128,128,512,512,512,256,256,256,512,256,256,256,256,256,256,1024,1024,1024,512,512,512,1024,256,256,256,256,256,256,1024,1024]
  ]
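
The per-layer lists in the comments above can be reproduced with a few lines of channel bookkeeping. Below is a minimal sketch (roughly what the model-parsing for loop in models/yolo.py does, not the repo's actual parse_model), assuming the same relative yaml path as the export snippet at the end of this post: a Conv's output channels are its first argument, MP keeps the channel count, and Concat sums the channels of the layers it pulls from.

import yaml

with open('../cfg/training/yolov7.yaml') as f:
    backbone = yaml.safe_load(f)['backbone']

ch = [3]                                # ch[0]: the 3-channel input image
out_channels = []
for frm, n, m, args in backbone:
    if m == 'Conv':
        c2 = args[0]                    # output channels = first arg
    elif m == 'MP':
        c2 = ch[frm]                    # max pooling keeps the channel count
    elif m == 'Concat':
        c2 = sum(ch[x] for x in frm)    # channel-wise concatenation
    ch.append(c2)
    out_channels.append(c2)
print(out_channels)                     # should reproduce the lists in the comments above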

The network structure diagram is shown below (I drew it myself, bit by bit, from the yaml file and the ONNX graph). In each stage there is a 1*1 identity branch, similar to ResNet, and four branches are concatenated along the channel dimension, so the channel count after concatenation is four times that of a single branch. After each concat there is another k=1, s=1 convolution, followed by two branches: one goes through MaxPooling (then a 1*1 conv), the other is a regular convolution path (a 1*1 conv followed by a 3*3 stride-2 conv), and the two branches are concatenated again. A PyTorch sketch of this pattern follows the figure.

30.png
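
To make the pattern concrete, here is a hypothetical PyTorch sketch of one such stage (layers 4-11 of the yaml: two 1*1 entry branches, four 3*3 convs, the 4-way Concat and the following 1*1 conv) plus the downsampling transition (layers 12-16: MaxPool + 1*1 conv on one branch, 1*1 conv + 3*3 stride-2 conv on the other, then Concat). The class names ElanStage and MPTransition are my own, not from the repo.

import torch
import torch.nn as nn

def conv_bn_silu(c1, c2, k, s):
    # yolov7's Conv block = Conv2d + BatchNorm2d + SiLU
    return nn.Sequential(
        nn.Conv2d(c1, c2, k, s, k // 2, bias=False),
        nn.BatchNorm2d(c2),
        nn.SiLU(),
    )

class ElanStage(nn.Module):
    # Layers 4-11 of the yaml: conv5/conv6 entry branches, conv7-conv10, Concat, conv11.
    def __init__(self, c_in=128, c_mid=64, c_out=256):
        super().__init__()
        self.cv1 = conv_bn_silu(c_in, c_mid, 1, 1)       # conv5
        self.cv2 = conv_bn_silu(c_in, c_mid, 1, 1)       # conv6: the 1*1 "identity" branch
        self.cv3 = conv_bn_silu(c_mid, c_mid, 3, 1)      # conv7
        self.cv4 = conv_bn_silu(c_mid, c_mid, 3, 1)      # conv8
        self.cv5 = conv_bn_silu(c_mid, c_mid, 3, 1)      # conv9
        self.cv6 = conv_bn_silu(c_mid, c_mid, 3, 1)      # conv10
        self.out = conv_bn_silu(4 * c_mid, c_out, 1, 1)  # conv11 after the concat

    def forward(self, x):
        b1 = self.cv1(x)                  # kept for the concat (-6)
        b2 = self.cv2(x)                  # kept for the concat (-5)
        y1 = self.cv4(self.cv3(b2))       # conv7 -> conv8, kept (-3)
        y2 = self.cv6(self.cv5(y1))       # conv9 -> conv10, kept (-1)
        return self.out(torch.cat([y2, y1, b2, b1], 1))  # Concat [-1,-3,-5,-6] -> 256 ch

class MPTransition(nn.Module):
    # Layers 12-16 of the yaml: MaxPool + 1*1 conv branch, 1*1 + 3*3(s=2) conv branch, Concat.
    def __init__(self, c_in=256, c_half=128):
        super().__init__()
        self.mp  = nn.MaxPool2d(2, 2)
        self.cv1 = conv_bn_silu(c_in, c_half, 1, 1)      # conv12 after the MaxPool
        self.cv2 = conv_bn_silu(c_in, c_half, 1, 1)      # conv13 on the other branch
        self.cv3 = conv_bn_silu(c_half, c_half, 3, 2)    # conv14, stride-2 downsampling

    def forward(self, x):
        y1 = self.cv1(self.mp(x))
        y2 = self.cv3(self.cv2(x))
        return torch.cat([y2, y1], 1)                    # Concat [-1, -3] -> 256 ch

x = torch.randn(1, 128, 160, 160)                        # e.g. the P2/4 feature map of a 640 input
print(MPTransition()(ElanStage()(x)).shape)              # torch.Size([1, 256, 80, 80])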

The SPPCSPC structure is shown below. It is quite similar to SPP: both have 1*1, 5*5, 9*9 and 13*13 pooling branches (the 1*1 branch is effectively an identity), and, as in v3 and v4, there are convolutions before and after the pooling. The difference is the extra 1*1 shortcut (residual) branch. A sketch follows the figure below.


31.png
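
For reference, a simplified PyTorch sketch of SPPCSPC matching the description above; the channel sizes (1024 in, 512 out) are illustrative defaults, and the repo's implementation in models/common.py may differ in minor details.

import torch
import torch.nn as nn

def conv(c1, c2, k=1, s=1):
    # Conv2d + BN + SiLU, the basic Conv block used throughout yolov7
    return nn.Sequential(nn.Conv2d(c1, c2, k, s, k // 2, bias=False),
                         nn.BatchNorm2d(c2), nn.SiLU())

class SPPCSPC(nn.Module):
    # An SPP branch (maxpools k=5/9/13, stride 1) wrapped in a CSP-style structure
    # with a 1*1 shortcut branch.
    def __init__(self, c1=1024, c2=512, k=(5, 9, 13)):
        super().__init__()
        c_ = c2                                     # hidden channels
        self.cv1 = conv(c1, c_, 1, 1)               # main branch: 1*1
        self.cv3 = conv(c_, c_, 3, 1)               #              3*3
        self.cv4 = conv(c_, c_, 1, 1)               #              1*1 before the pools
        self.m = nn.ModuleList([nn.MaxPool2d(x, 1, x // 2) for x in k])
        self.cv5 = conv(4 * c_, c_, 1, 1)           # fuse input + 3 pooled maps
        self.cv6 = conv(c_, c_, 3, 1)
        self.cv2 = conv(c1, c_, 1, 1)               # the 1*1 shortcut branch
        self.cv7 = conv(2 * c_, c2, 1, 1)           # final fusion of both branches

    def forward(self, x):
        x1 = self.cv4(self.cv3(self.cv1(x)))
        y1 = self.cv6(self.cv5(torch.cat([x1] + [m(x1) for m in self.m], 1)))
        y2 = self.cv2(x)
        return self.cv7(torch.cat((y1, y2), 1))

print(SPPCSPC()(torch.randn(1, 1024, 20, 20)).shape)     # torch.Size([1, 512, 20, 20])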

I also attach the ONNX graph of SPPCSPC here.

In the ONNX graph you can see a Sigmoid whose output is multiplied with the Conv output; this is simply the SiLU activation. Because the ONNX export (opset 11 here) has no single SiLU operator, the activation is decomposed into Sigmoid + Mul. Just read that part as SiLU.


32.png
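
A quick numerical check that the Sigmoid-and-Mul pair in the graph is exactly SiLU:

import torch

x = torch.randn(4)
print(torch.allclose(torch.nn.SiLU()(x), x * torch.sigmoid(x)))   # True: SiLU(x) = x * sigmoid(x)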

The ONNX export code is as follows:

import torch
from models.yolo import Model   # Model is defined in models/yolo.py of the yolov7 repo

yaml_file = '../cfg/training/yolov7.yaml'
model = Model(yaml_file)                                  # build the training-phase network from the yaml
x = torch.ones(1, 3, 640, 640)                            # dummy 640x640 input
torch.onnx.export(model, x, 'yolov7.onnx', verbose=False, opset_version=11)
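
Optionally (my addition, not part of the original snippet), the exported file can be sanity-checked with the onnx package before opening it in a visualizer:

import onnx

m = onnx.load('yolov7.onnx')
onnx.checker.check_model(m)     # raises an exception if the exported graph is malformed
print(len(m.graph.node))        # number of nodes in the exported graph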