Locate the C2f source and create a new C2f class
A global search for class C2f in the project directory locates the C2f source. Open that file and make the modifications described below. The source path is: ultralytics/nn/modules/block.py
In the same file, copy the C2f class source and rename the copy C2f_Attention, as shown below:
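For reference, the renamed copy might look like the following. This is based on the stock C2f implementation in recent Ultralytics releases; your local block.py is the authoritative source, as the class body can differ slightly between versions:

class C2f_Attention(nn.Module):
    """Copy of C2f; the attention module is inserted here in a later step."""

    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5):
        super().__init__()
        self.c = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, 2 * self.c, 1, 1)
        self.cv2 = Conv((2 + n) * self.c, c2, 1)
        self.m = nn.ModuleList(
            Bottleneck(self.c, self.c, shortcut, g, k=((3, 3), (3, 3)), e=1.0) for _ in range(n))

    def forward(self, x):
        """Forward pass: split after cv1, run the bottlenecks, concatenate, fuse with cv2."""
        y = list(self.cv1(x).chunk(2, 1))
        y.extend(m(y[-1]) for m in self.m)
        return self.cv2(torch.cat(y, 1))

    def forward_split(self, x):
        """Forward pass using split() instead of chunk()."""
        y = list(self.cv1(x).split((self.c, self.c), 1))
        y.extend(m(y[-1]) for m in self.m)
        return self.cv2(torch.cat(y, 1))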
Import the new C2f class in the relevant files
At the top of ultralytics/nn/modules/block.py, add the name of the class you just created, C2f_Attention, to __all__, for example:
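A minimal sketch; the existing names in __all__ vary by Ultralytics version, so just append the new one:

# ultralytics/nn/modules/block.py (top of file)
__all__ = (
    'DFL', 'HGBlock', 'HGStem', 'SPP', 'SPPF', 'C1', 'C2', 'C3', 'C2f',  # ...existing names...
    'C2f_Attention',  # newly added class
)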
Likewise, in ultralytics/nn/modules/__init__.py, import the newly created C2f_Attention class at the corresponding place, for example:
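A sketch of the change; in practice you extend the existing import list rather than adding a new statement:

# ultralytics/nn/modules/__init__.py
# add C2f_Attention to the existing `from .block import (...)` list
from .block import C2f, C2f_Attention

# and append 'C2f_Attention' to this file's __all__ as well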
The C2f_Attention class also needs to be imported in ultralytics/nn/tasks.py, for example:
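Again a sketch; the real import in tasks.py lists many more module names, and you simply add C2f_Attention to it:

# ultralytics/nn/tasks.py
# extend the existing `from ultralytics.nn.modules import (...)` with the new class
from ultralytics.nn.modules import C2f, C2f_Attention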
Add the new class in the parse_model parsing function
In the parse_model function in ultralytics/nn/tasks.py, which parses the network structure, add the C2f_Attention class, for example:
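A minimal sketch of the change. The exact tuple of module names differs across Ultralytics versions; the point is to add C2f_Attention next to C2f in both places, so its channel arguments and repeat count are handled exactly like C2f's:

# inside parse_model() in ultralytics/nn/tasks.py
if m in (Classify, Conv, ConvTranspose, GhostConv, Bottleneck, GhostBottleneck, SPP, SPPF,
         DWConv, Focus, BottleneckCSP, C1, C2, C2f, C2f_Attention, C3, C3TR, C3Ghost,
         nn.ConvTranspose2d, DWConvTranspose2d, C3x, RepC3):
    c1, c2 = ch[f], args[0]
    if c2 != nc:  # if c2 not equal to number of classes (i.e. for Classify() output)
        c2 = make_divisible(min(c2, max_channels) * width, 8)
    args = [c1, c2, *args[1:]]
    if m in (BottleneckCSP, C1, C2, C2f, C2f_Attention, C3, C3TR, C3Ghost, C3x, RepC3):
        args.insert(2, n)  # number of repeats
        n = 1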
Create a new configuration file c2f_att_yolov8.yaml
Create a new c2f_att_yolov8.yaml configuration file under the ultralytics/cfg/models/v8 directory, with the following content:
# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80  # number of classes
scales:  # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]  # YOLOv8n summary: 225 layers,  3157200 parameters,  3157184 gradients,   8.9 GFLOPs
  s: [0.33, 0.50, 1024]  # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients,  28.8 GFLOPs
  m: [0.67, 0.75, 768]   # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients,  79.3 GFLOPs
  l: [1.00, 1.00, 512]   # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512]   # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs

# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]  # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]  # 3-P3/8
  - [-1, 6, C2f_Attention, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 5-P4/16
  - [-1, 6, C2f_Attention, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f_Attention, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]]  # 9

# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 6], 1, Concat, [1]]  # cat backbone P4
  - [-1, 3, C2f, [512]]  # 12

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]  # cat backbone P3
  - [-1, 3, C2f, [256]]  # 15 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 12], 1, Concat, [1]]  # cat head P4
  - [-1, 3, C2f, [512]]  # 18 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 9], 1, Concat, [1]]  # cat head P5
  - [-1, 3, C2f, [1024]]  # 21 (P5/32-large)

  - [[15, 18, 21], 1, Detect, [nc]]  # Detect(P3, P4, P5)
Compared with the original yolov8.yaml, the new c2f_att_yolov8.yaml differs only in the backbone: the C2f blocks at P3, P4 and P5 (layers 4, 6 and 8) are replaced with C2f_Attention.
Adding attention inside C2f: ShuffleAttention
Note: for an attention mechanism that takes a channel-count parameter, the input channel count is the output channel count of the layer directly above it, so the correct value depends on where the attention module is inserted.
Under the path ultralytics/nn, create a new attention module file, ShuffleAttention.py, with the following content:
import numpy as np
import torch
from torch import nn
from torch.nn import init
from torch.nn.parameter import Parameter


class ShuffleAttention(nn.Module):

    def __init__(self, channel=512, reduction=16, G=8):
        super().__init__()
        self.G = G
        self.channel = channel
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.gn = nn.GroupNorm(channel // (2 * G), channel // (2 * G))
        self.cweight = Parameter(torch.zeros(1, channel // (2 * G), 1, 1))
        self.cbias = Parameter(torch.ones(1, channel // (2 * G), 1, 1))
        self.sweight = Parameter(torch.zeros(1, channel // (2 * G), 1, 1))
        self.sbias = Parameter(torch.ones(1, channel // (2 * G), 1, 1))
        self.sigmoid = nn.Sigmoid()

    def init_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                init.kaiming_normal_(m.weight, mode='fan_out')
                if m.bias is not None:
                    init.constant_(m.bias, 0)
            elif isinstance(m, nn.BatchNorm2d):
                init.constant_(m.weight, 1)
                init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                init.normal_(m.weight, std=0.001)
                if m.bias is not None:
                    init.constant_(m.bias, 0)

    @staticmethod
    def channel_shuffle(x, groups):
        b, c, h, w = x.shape
        x = x.reshape(b, groups, -1, h, w)
        x = x.permute(0, 2, 1, 3, 4)
        # flatten
        x = x.reshape(b, -1, h, w)
        return x

    def forward(self, x):
        b, c, h, w = x.size()
        # group into subfeatures
        x = x.view(b * self.G, -1, h, w)  # bs*G, c//G, h, w
        # channel split
        x_0, x_1 = x.chunk(2, dim=1)  # bs*G, c//(2*G), h, w
        # channel attention
        x_channel = self.avg_pool(x_0)  # bs*G, c//(2*G), 1, 1
        x_channel = self.cweight * x_channel + self.cbias  # bs*G, c//(2*G), 1, 1
        x_channel = x_0 * self.sigmoid(x_channel)
        # spatial attention
        x_spatial = self.gn(x_1)  # bs*G, c//(2*G), h, w
        x_spatial = self.sweight * x_spatial + self.sbias  # bs*G, c//(2*G), h, w
        x_spatial = x_1 * self.sigmoid(x_spatial)  # bs*G, c//(2*G), h, w
        # concatenate along channel axis
        out = torch.cat([x_channel, x_spatial], dim=1)  # bs*G, c//G, h, w
        out = out.contiguous().view(b, -1, h, w)
        # channel shuffle
        out = self.channel_shuffle(out, 2)
        return out
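A quick standalone sanity check of the module (hypothetical shapes, not part of the tutorial's files) confirms that the output shape matches the input; note the channel count must be divisible by 2*G:

if __name__ == '__main__':
    x = torch.randn(2, 512, 20, 20)           # dummy feature maps
    sa = ShuffleAttention(channel=512, G=8)   # 512 is divisible by 2*8
    print(sa(x).shape)                        # expected: torch.Size([2, 512, 20, 20])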
Import it in ultralytics/nn/tasks.py, and add parsing code to the parse_model function that parses the network structure, for example:
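A sketch of both changes. The import path assumes the file was created as ultralytics/nn/ShuffleAttention.py as above; the elif branch is only needed if you also want to place ShuffleAttention as a standalone layer in a YAML, and it mirrors how other single-input modules are parsed:

# at the top of ultralytics/nn/tasks.py
from ultralytics.nn.ShuffleAttention import ShuffleAttention

# inside parse_model(), next to the other `elif m is ...:` branches
elif m is ShuffleAttention:
    c1 = ch[f]          # input channels = output channels of the layer above
    c2 = c1             # ShuffleAttention keeps the channel count unchanged
    args = [c1, *args]  # prepend channels; any YAML args (reduction, G) follow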
Ways to add the attention at different positions
Add the attention mechanism at the appropriate places in the C2f_Attention class in ultralytics/nn/modules/block.py. There are three options:
1. Way 1: add the attention after self.cv1.
2. Way 2: add the attention after self.cv2.
3. Way 3: add the attention inside the bottlenecks of C2f. Copy the Bottleneck class, name the copy Bottleneck_Attention, add the attention after cv2 in Bottleneck_Attention, and then replace Bottleneck with Bottleneck_Attention inside the C2f_Attention class. See the combined sketch after this list.
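The following is a combined sketch of all three variants, based on the stock C2f and Bottleneck code; in practice you would enable only one of them. Note that block.py must also import ShuffleAttention, and that each attention's channel argument follows the earlier note: it equals the output channel count of the layer it follows.

# ultralytics/nn/modules/block.py
from ultralytics.nn.ShuffleAttention import ShuffleAttention


class Bottleneck_Attention(nn.Module):
    """Copy of Bottleneck with attention after cv2 (way 3)."""

    def __init__(self, c1, c2, shortcut=True, g=1, k=(3, 3), e=0.5):
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, k[0], 1)
        self.cv2 = Conv(c_, c2, k[1], 1, g=g)
        self.att = ShuffleAttention(channel=c2)  # cv2 outputs c2 channels
        self.add = shortcut and c1 == c2

    def forward(self, x):
        y = self.att(self.cv2(self.cv1(x)))
        return x + y if self.add else y


class C2f_Attention(nn.Module):
    """C2f with ShuffleAttention; pick one of the three insertion points."""

    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5):
        super().__init__()
        self.c = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, 2 * self.c, 1, 1)
        self.cv2 = Conv((2 + n) * self.c, c2, 1)
        # way 3: use the attention-augmented bottleneck instead of Bottleneck
        self.m = nn.ModuleList(
            Bottleneck_Attention(self.c, self.c, shortcut, g, k=((3, 3), (3, 3)), e=1.0) for _ in range(n))
        self.att1 = ShuffleAttention(channel=2 * self.c)  # way 1: cv1 outputs 2*self.c channels
        self.att2 = ShuffleAttention(channel=c2)          # way 2: cv2 outputs c2 channels

    def forward(self, x):
        y = list(self.att1(self.cv1(x)).chunk(2, 1))  # way 1: attention after cv1
        y.extend(m(y[-1]) for m in self.m)
        return self.att2(self.cv2(torch.cat(y, 1)))   # way 2: attention after cv2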
Load the configuration file and train
Load the c2f_att_yolov8.yaml configuration file and run the train.py training script:
# coding:utf-8
from ultralytics import YOLO

if __name__ == '__main__':
    model = YOLO('ultralytics/cfg/models/v8/c2f_att_yolov8.yaml')
    model.load('yolov8n.pt')  # loading pretrain weights
    model.train(data='datasets/TomatoData/data.yaml', epochs=150, batch=2)
Check the printed network structure carefully to confirm the modification took effect.
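For a quick check without launching training, building the model from the YAML prints the parsed layer table, where the C2f_Attention modules should appear at backbone layers 4, 6 and 8:

from ultralytics import YOLO

model = YOLO('ultralytics/cfg/models/v8/c2f_att_yolov8.yaml')  # prints the parsed structure
model.info()  # layer/parameter summary for an extra confirmation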