Hands-On Project | License Plate Text Recognition with YOLOv5 + Tesseract-OCR

1. Expected Result
The goal: feed in an image and have the license plate number printed to the console as text. It is still fairly bare-bones; a PyQt5 front end could be added later for richer functionality.

2. Overall Pipeline
First train a YOLOv5 license plate detector; then crop out the detected plate; preprocess the crop with OpenCV; finally recognize the plate with Tesseract-OCR and print the result to the console.

3. Preparing the Dataset
No manual labeling this time; an open-source dataset is used instead: 245 training images, 70 validation images and 35 test images. The dataset quality is only average.

4. Training the YOLOv5 Model

4.1 Download the source code
```bash
git clone https://github.com/ultralytics/yolov5
```

4.2 Install the environment
```bash
pip install -qr requirements.txt
```

4.3 Edit the configuration file license.yaml
```yaml
train: D:\Pycharm_Projects\datasets\License\train\images
val: D:\Pycharm_Projects\datasets\License\valid\images

nc: 2
names: ['license-plate', 'vehicle']
```

4.4 Train the model
The dataset is small, so yolov5s is sufficient.
```bash
python train.py --weights yolov5s.pt --cfg yolov5s.yaml --data license.yaml --epochs 100 --batch-size 16
```
After a quick 100-epoch run the results looked acceptable, so the model was used as-is.

4.5 Test the model
```bash
python detect.py --source D:\Pycharm_Projects\datasets\License\valid\images --weights runs\train\exp\weights\best.pt
```

5. Cropping the Plates
```bash
python detect.py --source D:\Pycharm_Projects\datasets\License\valid\images --weights runs\train\exp2\weights\best.pt --save-crop --classes 0
```
Because of the dataset quality, some photos are blurry and the cropped plates are not very clear either; the relatively clearer crops were selected here. At this point the crops could already be fed to Tesseract-OCR, but recognition without any preprocessing works poorly, so the plates are preprocessed first.

6. Morphological Processing
Strictly speaking this is not full morphological processing (no erosion or dilation is used), just a few basic OpenCV operations applied to the plate; you can compare the before/after results yourself. There is still plenty of room for improvement.
```python
def Corver_Gray(image_path):
    # Read the cropped plate image
    img = cv2.imread(image_path)
    # Convert to grayscale (this can also be done directly when reading)
    ref = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Binarize with an inverted threshold
    ref = cv2.threshold(ref, 60, 255, cv2.THRESH_BINARY_INV)[1]
    return ref
```

7. Installing Tesseract-OCR

7.1 Download Tesseract-OCR
Download page: Tesseract-OCR. I used the build at the bottom of the page; installation is straightforward, with no pitfalls.

7.2 Configure the environment variables
(The original post shows this step with screenshots.)

7.3 Call Tesseract-OCR
Install the pytesseract package before calling it:
```bash
pip install pytesseract
```
Then create a new .py file inside the YOLOv5 project:
```python
text = pytesseract.image_to_string(Image.open("test.png"))
print(text)
```
Pass in the image path and the recognized text is printed to the console.

7.4 Displaying Chinese Characters
To recognize the Chinese characters on the plate, one more file is needed: download chi_sim.traineddata from tessdata/chi_sim.traineddata and place it in the Tesseract tessdata directory. The code also needs a small change (typically passing a Chinese language option such as lang="chi_sim" to image_to_string).

8. Full Code
```python
import cv2
from PIL import Image
import pytesseract


def Corver_Gray(image_path):
    # Read the cropped plate image
    img = cv2.imread(image_path)
    # Convert to grayscale (this can also be done directly when reading)
    ref = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Binarize with an inverted threshold
    ref = cv2.threshold(ref, 60, 255, cv2.THRESH_BINARY_INV)[1]
    return ref


def Read_Img(img_path):
    # Preprocess the crop and save it for OCR
    image = Corver_Gray(img_path)
    image = cv2.imwrite("test.png", image)
    return image


Read_Img(r"D:\GitHub\Yolov5_Magic\number\1.png")

text = pytesseract.image_to_string(Image.open("test.png"))
print(text)
```

9. Dataset and Code Resources
Please give it a like~
Link: https://pan.baidu.com/s/1MKWPpb8dAcZwFQPqjCwTaA?pwd=csdn
Extraction code: csdn

10. More YOLOv5 Practice
For more hands-on YOLOv5 content, follow my column.
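To tie sections 5 to 8 together, here is a minimal sketch of the crop -> preprocess -> OCR path. It assumes the default --save-crop output location and that chi_sim.traineddata is installed; the crop path and the --psm value are illustrative assumptions, not part of the original code.

```python
import cv2
import pytesseract

# On Windows, if tesseract.exe is not on PATH, point pytesseract at it first, e.g.:
# pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def recognize_plate(crop_path):
    # Same preprocessing as Corver_Gray above: grayscale + inverted binarization
    img = cv2.imread(crop_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    binary = cv2.threshold(gray, 60, 255, cv2.THRESH_BINARY_INV)[1]
    # --psm 7 treats the plate as a single text line; chi_sim covers the province character
    return pytesseract.image_to_string(binary, lang="chi_sim", config="--psm 7").strip()


if __name__ == "__main__":
    # Hypothetical crop produced by detect.py --save-crop --classes 0
    print(recognize_plate(r"runs\detect\exp\crops\license-plate\1.jpg"))
```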
TPH-YOLOv5

Paper: https://arxiv.org/pdf/2108.11539.pdf
Code: https://github.com/cv516Buaa/tph-yolov5

Abstract (translated): Object detection in drone-captured scenes is a popular recent task. Because drones fly at different altitudes, object scales vary drastically, which burdens network optimization. Moreover, high-speed, low-altitude flight causes motion blur on densely packed objects, posing a great challenge for recognition. To address these two problems, the authors propose TPH-YOLOv5. On top of YOLOv5, an extra prediction head is added to detect objects at a different scale. The original prediction heads are then replaced with Transformer Prediction Heads (TPH) to exploit the prediction potential of self-attention. A Convolutional Block Attention Module (CBAM) is also integrated to locate attention regions in scenes with dense objects. To further improve TPH-YOLOv5, the authors add a bag of useful strategies such as data augmentation, multi-scale testing, multi-model ensembling and an extra classifier. Extensive experiments on VisDrone2021 show that TPH-YOLOv5 performs well and with impressive interpretability on drone-captured scenes. On the DET-test-challenge dataset, TPH-YOLOv5 reaches 39.18% AP, 1.81% better than the previous SOTA method (DPNetV3). In VisDrone Challenge 2021, TPH-YOLOv5 took 5th place and is close to the 1st-place model (39.43% AP). Compared with the baseline YOLOv5, TPH-YOLOv5 improves by about 7%, which is encouraging and competitive.

Problems Addressed
TPH-YOLOv5 targets two problems in drone imagery:
- Because drones fly at varying altitudes, object scales change drastically.
- High-speed, low-altitude flight causes motion blur on densely packed objects.

Main Improvements
TPH-YOLOv5 makes the following changes on top of YOLOv5:
- Adds an extra detection head for smaller-scale objects.
- Replaces the original prediction heads with Transformer Prediction Heads (TPH).
- Integrates CBAM into YOLOv5 to help the network find regions of interest in images with large coverage.
- A series of other small tricks.

The TPH-YOLOv5 network structure is shown in the figure of the original post.

TPH module
The authors use a Transformer encoder to replace some convolution and CSP blocks. Applying Transformers to vision is a current mainstream trend, and the self-attention mechanism gives better results than the original blocks.

CBAM module
I found that the code released by the authors differs from the figure, so I re-implemented the network myself following the figure above; apart from the detection heads it follows the paper exactly. You can use this paper's structure as a reference for improving your own model. Since these modules already exist in our common.py, only the configuration file needs to change (a sketch of the CBAM block is given after the config below):

```yaml
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# 迪菲赫尔曼 https://blog.csdn.net/weixin_43694096?spm=1000.2115.3001.5343

# Parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [19,27, 44,40, 38,94]  # P3/8
  - [96,68, 86,152, 180,137]  # P4/16
  - [140,301, 303,264, 238,542]  # P5/32
  - [436,615, 739,380, 925,792]  # P6/64

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 9, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [768, 3, 2]],  # 7-P5/32
   [-1, 1, SPP, [1024, [3, 5, 7]]],
   [-1, 3, C3TR, [1024, False]],  # 9
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [768, 1, 1]],  # 10
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],  # 11
   [[-1, 6], 1, Concat, [1]],  # 12 cat backbone P5
   [-1, 3, C3, [768, False]],  # 13
   [-1, 1, CBAM, [768]],  # 14

   [-1, 1, Conv, [512, 1, 1]],  # 15
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],  # 16
   [[-1, 4], 1, Concat, [1]],  # 17 cat backbone P4
   [-1, 3, C3, [512, False]],  # 18
   [-1, 1, CBAM, [512]],  # 19

   [-1, 1, Conv, [256, 1, 1]],  # 20
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],  # 21
   [[-1, 2], 1, Concat, [1]],  # 22 cat backbone P3
   [-1, 3, C3TR, [256, False]],  # 23 (P3/8-small)
   [-1, 1, CBAM, [256]],  # 24

   [-1, 1, Conv, [256, 3, 2]],  # 25
   [[-1, 20], 1, Concat, [1]],  # 26 cat head P4
   [-1, 3, C3TR, [512, False]],  # 27 (P4/16-medium)
   [-1, 1, CBAM, [512]],  # 28

   [-1, 1, Conv, [512, 3, 2]],  # 29
   [[-1, 15], 1, Concat, [1]],  # 30 cat head P5
   [-1, 3, C3TR, [768, False]],  # 31 (P5/32-large)
   [-1, 1, CBAM, [768]],  # 32

   [-1, 1, Conv, [768, 3, 2]],  # 33
   [[-1, 10], 1, Concat, [1]],  # 34 cat head P6
   [-1, 3, C3TR, [1024, False]],  # 35 (P6/64-xlarge)

   [[23, 27, 31, 35], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]
```

| Model | Parameters | GFLOPs |
| --- | --- | --- |
| TPH-YOLOv5 | 10009510 | 34.8 |
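The config above references a CBAM layer. For readers who do not yet have it in common.py, here is a minimal PyTorch sketch of CBAM (channel attention followed by spatial attention), written from the CBAM paper rather than copied from the repository, so treat it as an illustrative assumption, not the repository's exact code.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    def __init__(self, channels, ratio=16):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.max_pool = nn.AdaptiveMaxPool2d(1)
        # Shared MLP implemented with 1x1 convolutions
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // ratio, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // ratio, channels, 1, bias=False),
        )
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        return self.sigmoid(self.fc(self.avg_pool(x)) + self.fc(self.max_pool(x)))


class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        avg_out = torch.mean(x, dim=1, keepdim=True)
        max_out, _ = torch.max(x, dim=1, keepdim=True)
        return self.sigmoid(self.conv(torch.cat([avg_out, max_out], dim=1)))


class CBAM(nn.Module):
    # c2 is kept only for YOLOv5-style constructors; CBAM does not change the channel count
    def __init__(self, c1, c2, ratio=16, kernel_size=7):
        super().__init__()
        self.channel_attention = ChannelAttention(c1, ratio)
        self.spatial_attention = SpatialAttention(kernel_size)

    def forward(self, x):
        x = self.channel_attention(x) * x
        return self.spatial_attention(x) * x


if __name__ == "__main__":
    x = torch.randn(1, 768, 20, 20)
    print(CBAM(768, 768)(x).shape)  # torch.Size([1, 768, 20, 20])
```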
Background: why IoU Loss
The loss function of an object detection task generally consists of two parts, a classification loss and a bounding box regression loss, so better localization helps model accuracy. Before IoU Loss was proposed, box regression was mostly optimized with coordinate regression losses, but L1 Loss and L2 Loss have significant problems:
- L1 Loss: its derivative with respect to x is a constant, so late in training, when x is small, the loss oscillates around a stable value with a fixed learning rate and it is hard to converge to higher precision.
- L2 Loss: its derivative with respect to x is very large when x is large, which makes early training unstable.
Moreover, coordinate regression based on L1/L2 Loss is not scale-invariant and ignores the correlation between the four coordinates, so it describes the relative position of the two boxes poorly. For these reasons, an ACM MM 2016 paper (UnitBox) proposed IoU Loss, which treats the four coordinates as a whole and is scale-invariant (insensitive to scale). IoU Loss is defined as the negative log of the ratio between the intersection and the union of the predicted and ground-truth boxes, but in practice it is usually written as 1 - IoU. If the two boxes coincide, the IoU equals 1 and the loss is 0, indicating a very high overlap. The IoU value therefore lies in [0, 1].

What is IoU?
IoU stands for Intersection over Union, a concept used in object detection. It measures the overlap between the predicted box and the ground-truth box, i.e. the ratio of their intersection to their union. The ideal case is complete overlap, where the ratio is 1.

Evolution of IoU-based losses
Although IoU Loss solves the two big problems of Smooth L1 (independent variables and lack of scale invariance), it still has two issues:
- When the predicted box and the target box do not intersect, IoU(A, B) = 0, which says nothing about how far apart A and B are; the loss has no usable gradient there, so IoU Loss cannot optimize non-overlapping boxes.
- As in the three-box example in the original figure: if the sizes of the predicted and target boxes are fixed and the intersection area is the same (i.e. the IoU is identical), the IoU value cannot tell how the two boxes intersect.

GIoU (CVPR 2019)
To address the fact that IoU cannot tell how two boxes intersect, GIoU introduces the smallest enclosing rectangle of the predicted and ground-truth boxes (similar to a closure region in image processing) and measures how much of that closure the two boxes occupy. In this way GIoU attends not only to the overlapping region but also to the non-overlapping regions, and therefore reflects how the two boxes intersect inside the closure much better. From the formula, GIoU is a lower bound of IoU with range [-1, 1]: it reaches the maximum 1 when the boxes coincide and tends to the minimum -1 when they are disjoint and infinitely far apart. Compared with IoU, GIoU is therefore a better distance metric.

DIoU (AAAI 2020)
Although GIoU alleviates the measurement of relative position by introducing the closure, it still has two problems:
- The smallest enclosing rectangle has to be computed for every predicted/ground-truth pair, which limits computation and convergence speed.
- When the predicted box lies entirely inside the ground-truth box, GIoU degenerates to IoU and again cannot distinguish relative positions.
Considering these drawbacks, DIoU directly regresses the Euclidean distance between the two box centers on top of IoU, which speeds up convergence. Its penalty term is the ratio between the center distance and the diagonal of the enclosing box, which avoids the situation in GIoU where two distant boxes produce a large closure and a large, hard-to-optimize loss.

CIoU (AAAI 2020)
DIoU Loss eases the optimization of distant boxes through center-point regression, but it still degenerates to IoU Loss when the two centers coincide while the aspect ratios differ. To obtain more accurate boxes, CIoU adds an extra factor on top of DIoU that accounts for the consistency of the aspect ratio between the predicted and ground-truth boxes. For example, in the three cases in the original figure where the target box encloses the predicted box, DIoU would normally help, but the predicted centers are all in the same place, so DIoU gives the same value for all three. CIoU Loss considers overlap area, center distance and aspect ratio, but its v term only reflects the difference in aspect ratio rather than the real differences between the widths and heights and their confidences, which can sometimes hinder effective optimization.

EIoU (arXiv 2021)
EIoU splits the aspect ratio term of CIoU apart and explicitly measures the differences of three geometric factors: overlap area, center point and side lengths. It also introduces Focal loss to handle the imbalance between easy and hard samples.

αIoU (NeurIPS 2021)
αIoU generalizes the existing IoU-based losses with a power parameter, allowing it to significantly surpass them. By tuning α, the detector can more flexibly reach different levels of bbox regression accuracy, and αIoU is more robust to small datasets and label noise.

SIoU (arXiv 2022)
Traditional detection losses aggregate bbox regression metrics such as the distance, overlap area and aspect ratio of the predicted and ground-truth boxes (GIoU, CIoU, ICIoU, etc.). However, none of the methods proposed so far consider the direction of the mismatch between the desired ground-truth box and the predicted box. This deficiency leads to slower and less efficient convergence, because the predicted box may "wander around" during training and end up as a worse model. SIoU proposes a new loss that redefines the penalty metric by taking the vector angle between the desired regressions into account. The SIoU loss consists of four cost terms: angle cost, distance cost, shape cost and IoU cost. Applied to COCO-train/COCO-val, SIoU improves mAP@0.5:0.95 by +2.4% and mAP@0.5 by +3.6% compared with other loss functions.

Source code for each IoU

IoU
```python
import numpy as np


def Iou(box1, box2, wh=False):
    # Boxes are [xmin, ymin, xmax, ymax]; with wh=True they are [cx, cy, w, h]
    if wh == False:
        xmin1, ymin1, xmax1, ymax1 = box1
        xmin2, ymin2, xmax2, ymax2 = box2
    else:
        xmin1, ymin1 = int(box1[0] - box1[2] / 2.0), int(box1[1] - box1[3] / 2.0)
        xmax1, ymax1 = int(box1[0] + box1[2] / 2.0), int(box1[1] + box1[3] / 2.0)
        xmin2, ymin2 = int(box2[0] - box2[2] / 2.0), int(box2[1] - box2[3] / 2.0)
        xmax2, ymax2 = int(box2[0] + box2[2] / 2.0), int(box2[1] + box2[3] / 2.0)
    # Top-left and bottom-right corners of the intersection rectangle
    xx1 = np.max([xmin1, xmin2])
    yy1 = np.max([ymin1, ymin2])
    xx2 = np.min([xmax1, xmax2])
    yy2 = np.min([ymax1, ymax2])
    # Areas of the two boxes
    area1 = (xmax1 - xmin1) * (ymax1 - ymin1)
    area2 = (xmax2 - xmin2) * (ymax2 - ymin2)
    inter_area = (np.max([0, xx2 - xx1])) * (np.max([0, yy2 - yy1]))  # intersection area
    iou = inter_area / (area1 + area2 - inter_area + 1e-6)  # intersection over union
    return iou
```

GIoU (rewritten so that both boxes use the same [xmin, ymin, xmax, ymax] convention as Iou(); the original snippet mixed two coordinate orders)
```python
def Giou(rec1, rec2):
    # rec1, rec2: [xmin, ymin, xmax, ymax]
    xmin1, ymin1, xmax1, ymax1 = rec1
    xmin2, ymin2, xmax2, ymax2 = rec2
    iou = Iou(rec1, rec2)
    # Smallest enclosing box (the closure)
    area_C = (max(xmax1, xmax2) - min(xmin1, xmin2)) * (max(ymax1, ymax2) - min(ymin1, ymin2))
    area_1 = (xmax1 - xmin1) * (ymax1 - ymin1)
    area_2 = (xmax2 - xmin2) * (ymax2 - ymin2)
    inter_w = max(0, min(xmax1, xmax2) - max(xmin1, xmin2))   # width of the intersection
    inter_h = max(0, min(ymax1, ymax2) - max(ymin1, ymin2))   # height of the intersection
    inter_area = inter_w * inter_h
    union_area = area_1 + area_2 - inter_area                 # union of the two boxes
    # Fraction of the closure that belongs to neither box
    end_area = (area_C - union_area) / area_C
    giou = iou - end_area
    return giou
```

DIoU
```python
import torch


def Diou(bboxes1, bboxes2):
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]
    dious = torch.zeros((rows, cols))
    if rows * cols == 0:
        return dious
    exchange = False
    if bboxes1.shape[0] > bboxes2.shape[0]:
        bboxes1, bboxes2 = bboxes2, bboxes1
        dious = torch.zeros((cols, rows))
        exchange = True
    # Boxes are [xmin, ymin, xmax, ymax] -> columns [:, 0], [:, 1], [:, 2], [:, 3]
    w1 = bboxes1[:, 2] - bboxes1[:, 0]
    h1 = bboxes1[:, 3] - bboxes1[:, 1]
    w2 = bboxes2[:, 2] - bboxes2[:, 0]
    h2 = bboxes2[:, 3] - bboxes2[:, 1]
    area1 = w1 * h1
    area2 = w2 * h2
    center_x1 = (bboxes1[:, 2] + bboxes1[:, 0]) / 2
    center_y1 = (bboxes1[:, 3] + bboxes1[:, 1]) / 2
    center_x2 = (bboxes2[:, 2] + bboxes2[:, 0]) / 2
    center_y2 = (bboxes2[:, 3] + bboxes2[:, 1]) / 2
    inter_max_xy = torch.min(bboxes1[:, 2:], bboxes2[:, 2:])
    inter_min_xy = torch.max(bboxes1[:, :2], bboxes2[:, :2])
    out_max_xy = torch.max(bboxes1[:, 2:], bboxes2[:, 2:])
    out_min_xy = torch.min(bboxes1[:, :2], bboxes2[:, :2])
    inter = torch.clamp((inter_max_xy - inter_min_xy), min=0)
    inter_area = inter[:, 0] * inter[:, 1]
    inter_diag = (center_x2 - center_x1) ** 2 + (center_y2 - center_y1) ** 2
    outer = torch.clamp((out_max_xy - out_min_xy), min=0)
    outer_diag = (outer[:, 0] ** 2) + (outer[:, 1] ** 2)
    union = area1 + area2 - inter_area
    dious = inter_area / union - (inter_diag) / outer_diag
    dious = torch.clamp(dious, min=-1.0, max=1.0)
    if exchange:
        dious = dious.T
    return dious
```

CIoU
```python
import math

import torch


def bbox_overlaps_ciou(bboxes1, bboxes2):
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]
    cious = torch.zeros((rows, cols))
    if rows * cols == 0:
        return cious
    exchange = False
    if bboxes1.shape[0] > bboxes2.shape[0]:
        bboxes1, bboxes2 = bboxes2, bboxes1
        cious = torch.zeros((cols, rows))
        exchange = True
    w1 = bboxes1[:, 2] - bboxes1[:, 0]
    h1 = bboxes1[:, 3] - bboxes1[:, 1]
    w2 = bboxes2[:, 2] - bboxes2[:, 0]
    h2 = bboxes2[:, 3] - bboxes2[:, 1]
    area1 = w1 * h1
    area2 = w2 * h2
    center_x1 = (bboxes1[:, 2] + bboxes1[:, 0]) / 2
    center_y1 = (bboxes1[:, 3] + bboxes1[:, 1]) / 2
    center_x2 = (bboxes2[:, 2] + bboxes2[:, 0]) / 2
    center_y2 = (bboxes2[:, 3] + bboxes2[:, 1]) / 2
    inter_max_xy = torch.min(bboxes1[:, 2:], bboxes2[:, 2:])
    inter_min_xy = torch.max(bboxes1[:, :2], bboxes2[:, :2])
    out_max_xy = torch.max(bboxes1[:, 2:], bboxes2[:, 2:])
    out_min_xy = torch.min(bboxes1[:, :2], bboxes2[:, :2])
    inter = torch.clamp((inter_max_xy - inter_min_xy), min=0)
    inter_area = inter[:, 0] * inter[:, 1]
    inter_diag = (center_x2 - center_x1) ** 2 + (center_y2 - center_y1) ** 2
    outer = torch.clamp((out_max_xy - out_min_xy), min=0)
    outer_diag = (outer[:, 0] ** 2) + (outer[:, 1] ** 2)
    union = area1 + area2 - inter_area
    u = (inter_diag) / outer_diag
    iou = inter_area / union
    with torch.no_grad():
        arctan = torch.atan(w2 / h2) - torch.atan(w1 / h1)
        v = (4 / (math.pi ** 2)) * torch.pow((torch.atan(w2 / h2) - torch.atan(w1 / h1)), 2)
        S = 1 - iou
        alpha = v / (S + v)
        w_temp = 2 * w1
    ar = (8 / (math.pi ** 2)) * arctan * ((w1 - w_temp) * h1)
    cious = iou - (u + alpha * ar)
    cious = torch.clamp(cious, min=-1.0, max=1.0)
    if exchange:
        cious = cious.T
    return cious
```

References
- https://mp.weixin.qq.com/s/jLnde0Xms-99g4z16OE9VQ
- DIoU、CIoU、GIoU、IoU再理解结合代码 (blog post)
- IoU: "UnitBox: An Advanced Object Detection Network"
- GIoU: "Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression"
- DIoU/CIoU: "Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression"
- EIoU: "Focal and Efficient IOU Loss for Accurate Bounding Box Regression"
- αIoU: "Alpha-IoU: A Family of Power Intersection over Union Losses for Bounding Box Regression"
- SIoU: "SIoU Loss: More Powerful Learning for Bounding Box Regression"
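To close this section, here is a quick usage check of the Iou/Giou functions above. The box values are illustrative assumptions; both boxes use the [xmin, ymin, xmax, ymax] convention.

```python
# Two partially overlapping boxes in [xmin, ymin, xmax, ymax] format
box_a = [50, 50, 150, 150]
box_b = [100, 100, 200, 200]

iou = Iou(box_a, box_b)
print("IoU :", round(iou, 3))                 # ~0.143, so IoU loss = 1 - IoU ~ 0.857
print("GIoU:", round(Giou(box_a, box_b), 3))  # lower than IoU because of the enclosing-box penalty
```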
1.5 RFB (Receptive Field Block)
The RFB module was proposed in "ECCV 2018: Receptive Field Block Net for Accurate and Fast Object Detection". Its motivation is to mimic the receptive fields of human vision in order to strengthen the network's feature extraction. Structurally, RFB borrows the Inception idea and mainly adds dilated convolutions on top of it, which effectively enlarges the receptive field. The original figure shows the architectures of RFB and RFB-s; RFB-s mimics the smaller pRFs of shallow human retinotopic maps by using more branches with smaller kernels.
```python
import torch
import torch.nn as nn


class BasicConv(nn.Module):
    # Conv + optional BatchNorm + optional ReLU
    def __init__(self, in_planes, out_planes, kernel_size, stride=1, padding=0, dilation=1, groups=1, relu=True, bn=True):
        super(BasicConv, self).__init__()
        self.out_channels = out_planes
        if bn:
            self.conv = nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, padding=padding,
                                  dilation=dilation, groups=groups, bias=False)
            self.bn = nn.BatchNorm2d(out_planes, eps=1e-5, momentum=0.01, affine=True)
            self.relu = nn.ReLU(inplace=True) if relu else None
        else:
            self.conv = nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, padding=padding,
                                  dilation=dilation, groups=groups, bias=True)
            self.bn = None
            self.relu = nn.ReLU(inplace=True) if relu else None

    def forward(self, x):
        x = self.conv(x)
        if self.bn is not None:
            x = self.bn(x)
        if self.relu is not None:
            x = self.relu(x)
        return x


class BasicRFB(nn.Module):
    # Multi-branch block with dilated convolutions to enlarge the receptive field
    def __init__(self, in_planes, out_planes, stride=1, scale=0.1, map_reduce=8, vision=1, groups=1):
        super(BasicRFB, self).__init__()
        self.scale = scale
        self.out_channels = out_planes
        inter_planes = in_planes // map_reduce

        self.branch0 = nn.Sequential(
            BasicConv(in_planes, inter_planes, kernel_size=1, stride=1, groups=groups, relu=False),
            BasicConv(inter_planes, 2 * inter_planes, kernel_size=(3, 3), stride=stride, padding=(1, 1), groups=groups),
            BasicConv(2 * inter_planes, 2 * inter_planes, kernel_size=3, stride=1, padding=vision + 1,
                      dilation=vision + 1, relu=False, groups=groups)
        )
        self.branch1 = nn.Sequential(
            BasicConv(in_planes, inter_planes, kernel_size=1, stride=1, groups=groups, relu=False),
            BasicConv(inter_planes, 2 * inter_planes, kernel_size=(3, 3), stride=stride, padding=(1, 1), groups=groups),
            BasicConv(2 * inter_planes, 2 * inter_planes, kernel_size=3, stride=1, padding=vision + 2,
                      dilation=vision + 2, relu=False, groups=groups)
        )
        self.branch2 = nn.Sequential(
            BasicConv(in_planes, inter_planes, kernel_size=1, stride=1, groups=groups, relu=False),
            BasicConv(inter_planes, (inter_planes // 2) * 3, kernel_size=3, stride=1, padding=1, groups=groups),
            BasicConv((inter_planes // 2) * 3, 2 * inter_planes, kernel_size=3, stride=stride, padding=1, groups=groups),
            BasicConv(2 * inter_planes, 2 * inter_planes, kernel_size=3, stride=1, padding=vision + 4,
                      dilation=vision + 4, relu=False, groups=groups)
        )

        self.ConvLinear = BasicConv(6 * inter_planes, out_planes, kernel_size=1, stride=1, relu=False)
        self.shortcut = BasicConv(in_planes, out_planes, kernel_size=1, stride=stride, relu=False)
        self.relu = nn.ReLU(inplace=False)

    def forward(self, x):
        x0 = self.branch0(x)
        x1 = self.branch1(x)
        x2 = self.branch2(x)
        out = torch.cat((x0, x1, x2), 1)
        out = self.ConvLinear(out)
        short = self.shortcut(x)
        out = out * self.scale + short
        out = self.relu(out)
        return out
```

1.6 SPPCSPC
This module is the SPP structure used in YOLOv7. It performs better than SPPF, but the parameter count and computation increase considerably. (The Conv block below is the standard YOLOv5 Conv from models/common.py.)
```python
class SPPCSPC(nn.Module):
    # CSP https://github.com/WongKinYiu/CrossStagePartialNetworks
    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5, k=(5, 9, 13)):
        super(SPPCSPC, self).__init__()
        c_ = int(2 * c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)
        self.cv3 = Conv(c_, c_, 3, 1)
        self.cv4 = Conv(c_, c_, 1, 1)
        self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k])
        self.cv5 = Conv(4 * c_, c_, 1, 1)
        self.cv6 = Conv(c_, c_, 3, 1)
        self.cv7 = Conv(2 * c_, c2, 1, 1)

    def forward(self, x):
        x1 = self.cv4(self.cv3(self.cv1(x)))
        y1 = self.cv6(self.cv5(torch.cat([x1] + [m(x1) for m in self.m], 1)))
        y2 = self.cv2(x)
        return self.cv7(torch.cat((y1, y2), dim=1))


# Grouped SPPCSPC: after grouping, the parameter count and computation are close to the original;
# the effect on accuracy has not been tested
class SPPCSPC_group(nn.Module):
    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5, k=(5, 9, 13)):
        super(SPPCSPC_group, self).__init__()
        c_ = int(2 * c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1, g=4)
        self.cv2 = Conv(c1, c_, 1, 1, g=4)
        self.cv3 = Conv(c_, c_, 3, 1, g=4)
        self.cv4 = Conv(c_, c_, 1, 1, g=4)
        self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k])
        self.cv5 = Conv(4 * c_, c_, 1, 1, g=4)
        self.cv6 = Conv(c_, c_, 3, 1, g=4)
        self.cv7 = Conv(2 * c_, c2, 1, 1, g=4)

    def forward(self, x):
        x1 = self.cv4(self.cv3(self.cv1(x)))
        y1 = self.cv6(self.cv5(torch.cat([x1] + [m(x1) for m in self.m], 1)))
        y2 = self.cv2(x)
        return self.cv7(torch.cat((y1, y2), dim=1))
```

1.7 SPPFCSPC 🍀
Borrowing the idea behind SPPF, I optimized SPPCSPC into SPPFCSPC, which gains speed while keeping the receptive field unchanged (a sketch of the idea follows at the end of this section). I showed the module to the YOLOv7 author and it was not rejected; see Issue 4 for the full reply:

A: Max pooling uses very few computation, if you programming well, above one could run three max pool layers in parallel, while below one must process three max pool layers sequentially. By the way, you could replace SPPCSPC by SPPFCSPC at inference time if your hardware is friendly to SPPFCSPC.

Questions and corrections are welcome; if this helped, please give it a like 👍📖🌟
Reference: Enlarging the receptive field: SPP, ASPP, RFB, PPM
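Since the SPPFCSPC code itself is not shown above, here is a minimal sketch of the idea under the stated assumptions: a single 5x5 max-pool applied three times in sequence replaces the parallel 5/9/13 pools, which keeps the effective receptive field while reusing intermediate results (as in SPPF). `Conv` is again the standard YOLOv5 block from models/common.py; the author's released implementation may differ in details.

```python
import torch
import torch.nn as nn


class SPPFCSPC(nn.Module):
    # SPPCSPC with the three parallel max-pools replaced by one cascaded 5x5 max-pool
    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5, k=5):
        super().__init__()
        c_ = int(2 * c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)
        self.cv3 = Conv(c_, c_, 3, 1)
        self.cv4 = Conv(c_, c_, 1, 1)
        self.m = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
        self.cv5 = Conv(4 * c_, c_, 1, 1)
        self.cv6 = Conv(c_, c_, 3, 1)
        self.cv7 = Conv(2 * c_, c2, 1, 1)

    def forward(self, x):
        x1 = self.cv4(self.cv3(self.cv1(x)))
        y0 = self.m(x1)    # receptive field of one 5x5 pool
        y1 = self.m(y0)    # roughly equivalent to one 9x9 pool
        y2 = self.m(y1)    # roughly equivalent to one 13x13 pool
        y = self.cv6(self.cv5(torch.cat([x1, y0, y1, y2], 1)))
        return self.cv7(torch.cat((y, self.cv2(x)), dim=1))
```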
Project: Yolov5_Magic
This project shares a collection of tricks for improving YOLOv5. Results will of course differ across datasets, so try as many as your compute budget allows. For how to use the code, see my blog posts or the official documentation; here is a consolidated index:
- Hands-on YOLOv5 (v6.2) parameter tuning: inference 🌟 highly recommended
- Hands-on YOLOv5 (v6.2) parameter tuning: training 🚀
- Hands-on YOLOv5 (v6.2) parameter tuning: validation
- How to quickly train a YOLOv5 model on your own dataset
- Adding attention mechanisms to YOLOv5 (v6.2), part 1 (with diagrams of 30+ attention modules from top conferences) 🌟 highly recommended 🍀 8 new additions
- Adding attention mechanisms to YOLOv5 (v6.2), part 2 (inserting attention into the C3 module)
- How to change the activation function in YOLOv5
- How to switch YOLOv5 to BiFPN
- YOLOv5 (v6.2) data augmentation explained
- Changing the upsampling method in YOLOv5 (nearest / bilinear / bicubic / trilinear / transposed convolution)
- How to switch YOLOv5 to EIoU / alpha-IoU / SIoU
- Replacing the YOLOv5 backbone with ShuffleNetV2 (Megvii's lightweight CNN)
- Applying the lightweight universal upsampling operator CARAFE to YOLOv5
- Spatial pyramid pooling improvements: SPP / SPPF / SimSPPF / ASPP / RFB / SPPCSPC / SPPFCSPC 🚀
- SPD-Conv, a module for low-resolution images and small objects
- GSConv + Slim-neck: reduce model complexity while improving accuracy 🍀
- Decoupled head | adding the YOLOX decoupled head to YOLOv5 | an easy accuracy boost 🍀
- Stand-Alone Self-Attention | building a pure-attention FPN+PAN structure 🍀
- YOLOv5 model pruning in practice 🚀
- YOLOv5 knowledge distillation in practice 🚀
- YOLOv7 knowledge distillation in practice
Parameters and FLOPs (with yolov5s as the baseline) are compared in the original post for the attention modules, the SPP variants and the other modules, along with experimental results (for reference only). All configuration files used in the project are in my GitHub repository (project: Yolov5_Magic). Results for some other tricks are still being organized and will be added to GitHub later.
1. How to Add the Attention Modules: First Version

1.1 C3SE
Step 1: put the attention code into common.py. Taking C3SE as an example, paste the following into common.py. (common.py already imports torch and torch.nn and already defines Conv, C3 and Bottleneck, so no extra imports are needed there; ChannelAttention, SpatialAttention, SE, ECA, CoordAtt and h_swish referenced below are the attention blocks added in part 1 of this series.)
```python
class SEBottleneck(nn.Module):
    # Standard bottleneck with a squeeze-and-excitation block
    def __init__(self, c1, c2, shortcut=True, g=1, e=0.5, ratio=16):  # ch_in, ch_out, shortcut, groups, expansion
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c_, c2, 3, 1, g=g)
        self.add = shortcut and c1 == c2
        # self.se = SE(c1, c2, ratio)
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.l1 = nn.Linear(c1, c1 // ratio, bias=False)
        self.relu = nn.ReLU(inplace=True)
        self.l2 = nn.Linear(c1 // ratio, c1, bias=False)
        self.sig = nn.Sigmoid()

    def forward(self, x):
        x1 = self.cv2(self.cv1(x))
        b, c, _, _ = x.size()
        y = self.avgpool(x1).view(b, c)
        y = self.l1(y)
        y = self.relu(y)
        y = self.l2(y)
        y = self.sig(y)
        y = y.view(b, c, 1, 1)
        out = x1 * y.expand_as(x1)
        # out = self.se(x1) * x1
        return x + out if self.add else out


class C3SE(C3):
    # C3 module with SEBottleneck()
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
        super().__init__(c1, c2, n, shortcut, g, e)
        c_ = int(c2 * e)  # hidden channels
        self.m = nn.Sequential(*(SEBottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))
```
Step 2: find the parse_model function in models/yolo.py and add the new class names there (a quick sanity check for the finished setup is given at the end of this article).
Step 3: modify the model configuration file (using yolov5s.yaml as the example) and replace the C3 layers with the new C3SE layers.

yolov5s_C3SE.yaml
```yaml
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3SE, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3SE, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3SE, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3SE, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
```
The other attention modules are added in exactly the same way.

1.2 C3CA
```python
class CABottleneck(nn.Module):
    # Standard bottleneck with coordinate attention
    def __init__(self, c1, c2, shortcut=True, g=1, e=0.5, ratio=32):  # ch_in, ch_out, shortcut, groups, expansion
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c_, c2, 3, 1, g=g)
        self.add = shortcut and c1 == c2
        # self.ca = CoordAtt(c1, c2, ratio)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))
        mip = max(8, c1 // ratio)
        self.conv1 = nn.Conv2d(c1, mip, kernel_size=1, stride=1, padding=0)
        self.bn1 = nn.BatchNorm2d(mip)
        self.act = h_swish()  # h_swish comes with the CoordAtt code and must also be in common.py
        self.conv_h = nn.Conv2d(mip, c2, kernel_size=1, stride=1, padding=0)
        self.conv_w = nn.Conv2d(mip, c2, kernel_size=1, stride=1, padding=0)

    def forward(self, x):
        x1 = self.cv2(self.cv1(x))
        n, c, h, w = x.size()
        x_h = self.pool_h(x1)                      # c x h x 1
        x_w = self.pool_w(x1).permute(0, 1, 3, 2)  # c x w x 1
        y = torch.cat([x_h, x_w], dim=2)           # c x (h+w) x 1
        y = self.conv1(y)
        y = self.bn1(y)
        y = self.act(y)
        x_h, x_w = torch.split(y, [h, w], dim=2)
        x_w = x_w.permute(0, 1, 3, 2)
        a_h = self.conv_h(x_h).sigmoid()
        a_w = self.conv_w(x_w).sigmoid()
        out = x1 * a_w * a_h
        # out = self.ca(x1) * x1
        return x + out if self.add else out


class C3CA(C3):
    # C3 module with CABottleneck()
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
        super().__init__(c1, c2, n, shortcut, g, e)
        c_ = int(c2 * e)  # hidden channels
        self.m = nn.Sequential(*(CABottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))
```

1.3 C3CBAM
```python
class CBAMBottleneck(nn.Module):
    # Standard bottleneck with CBAM (channel + spatial attention)
    def __init__(self, c1, c2, shortcut=True, g=1, e=0.5, ratio=16, kernel_size=7):  # ch_in, ch_out, shortcut, groups, expansion
        super(CBAMBottleneck, self).__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c_, c2, 3, 1, g=g)
        self.add = shortcut and c1 == c2
        self.channel_attention = ChannelAttention(c2, ratio)
        self.spatial_attention = SpatialAttention(kernel_size)
        # self.cbam = CBAM(c1, c2, ratio, kernel_size)

    def forward(self, x):
        x1 = self.cv2(self.cv1(x))
        out = self.channel_attention(x1) * x1
        # print('outchannels:{}'.format(out.shape))
        out = self.spatial_attention(out) * out
        return x + out if self.add else out


class C3CBAM(C3):
    # C3 module with CBAMBottleneck()
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
        super().__init__(c1, c2, n, shortcut, g, e)
        c_ = int(c2 * e)  # hidden channels
        self.m = nn.Sequential(*(CBAMBottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))
```

1.4 C3ECA
```python
class ECABottleneck(nn.Module):
    # Standard bottleneck with efficient channel attention
    def __init__(self, c1, c2, shortcut=True, g=1, e=0.5, ratio=16, k_size=3):  # ch_in, ch_out, shortcut, groups, expansion
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c_, c2, 3, 1, g=g)
        self.add = shortcut and c1 == c2
        # self.eca = ECA(c1, c2)
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k_size, padding=(k_size - 1) // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x1 = self.cv2(self.cv1(x))
        # out = self.eca(x1) * x1
        y = self.avg_pool(x1)
        y = self.conv(y.squeeze(-1).transpose(-1, -2)).transpose(-1, -2).unsqueeze(-1)
        y = self.sigmoid(y)
        out = x1 * y.expand_as(x1)
        return x + out if self.add else out


class C3ECA(C3):
    # C3 module with ECABottleneck()
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
        super().__init__(c1, c2, n, shortcut, g, e)
        c_ = int(c2 * e)  # hidden channels
        self.m = nn.Sequential(*(ECABottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))
```

2. How to Add the Attention Modules: Second Version

2.1 C3_SE_Attention
```python
class C3_SE_Attention(nn.Module):
    # CSP Bottleneck with 3 convolutions, followed by an SE block
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ch_out, number, shortcut, groups, expansion
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)
        self.cv3 = Conv(2 * c_, c2, 1)  # optional act=FReLU(c2)
        self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))
        self._SE = SE(c2, c2)

    def forward(self, x):
        return self._SE(self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), 1)))
```

2.2 C3_ECA_Attention
```python
class C3_ECA_Attention(nn.Module):
    # CSP Bottleneck with 3 convolutions, followed by an ECA block
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ch_out, number, shortcut, groups, expansion
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)
        self.cv3 = Conv(2 * c_, c2, 1)  # optional act=FReLU(c2)
        self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))
        self._ECA = ECA(c2, c2)

    def forward(self, x):
        return self._ECA(self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), 1)))
```

2.3 C3_CBAM_Attention
```python
class C3_CBAM_Attention(nn.Module):
    # CSP Bottleneck with 3 convolutions, followed by a CBAM block
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ch_out, number, shortcut, groups, expansion
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)
        self.cv3 = Conv(2 * c_, c2, 1)  # optional act=FReLU(c2)
        self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))
        self._CBAM = CBAM(c2, c2)

    def forward(self, x):
        return self._CBAM(self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), 1)))
```

2.4 C3_CoordAtt_Attention
```python
class C3_CoordAtt_Attention(nn.Module):
    # CSP Bottleneck with 3 convolutions, followed by a coordinate attention block
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ch_out, number, shortcut, groups, expansion
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)
        self.cv3 = Conv(2 * c_, c2, 1)  # optional act=FReLU(c2)
        self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))
        self._CoordAtt = CoordAtt(c2, c2)

    def forward(self, x):
        return self._CoordAtt(self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), 1)))
```

Second-version configuration file:
```yaml
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3_CoordAtt_Attention, [128]],   # can be replaced with C3_SE_Attention / C3_ECA_Attention / C3_CBAM_Attention
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3_CoordAtt_Attention, [256]],   # can be replaced with C3_SE_Attention / C3_ECA_Attention / C3_CBAM_Attention
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3_CoordAtt_Attention, [512]],   # can be replaced with C3_SE_Attention / C3_ECA_Attention / C3_CBAM_Attention
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3_CoordAtt_Attention, [1024]],  # can be replaced with C3_SE_Attention / C3_ECA_Attention / C3_CBAM_Attention
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
```

Questions and corrections are welcome; if this helped, please give it a like 👍📖🌟
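To verify the whole procedure (step 1 through step 3), a quick sanity check like the following can help. It assumes the new classes are already in common.py and registered in parse_model, and uses the yolov5s_C3SE.yaml file from section 1.1; run it from the YOLOv5 repository root.

```python
import torch
from models.yolo import Model  # YOLOv5's models/ package must be importable

# Build the modified model from the new config and push a dummy batch through it
model = Model("models/yolov5s_C3SE.yaml", ch=3, nc=80)
x = torch.zeros(1, 3, 640, 640)
preds = model(x)  # in training mode: one output tensor per detection scale
for p in preds:
    print(p.shape)
```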
1.4"–hyp"指定超参数文件的路径;超参数里面包含了大量的参数信息,同样提供了5个1.5"–epochs"训练的轮数;默认为300轮,显示效果是0-2991.6"–batch-size"每批次的输入数据量;default=-1将时自动调节batchsize大小这里说一下 epoch、batchsize、iteration三者之间的联系1、batchsize是批次大小,假如取batchsize=24,则表示每次训练时在训练集中取24个训练样本进行训练。2、iteration是迭代次数,1个iteration就等于一次使用24(batchsize大小)个样本进行训练。3、epoch:1个epoch就等于使用训练集中全部样本训练1次。1.7 “–imgsz, --img, --img-size”训练集和测试集图片的像素大小;输入默认640*640,这个参数在你选择yolov5l那些大一点的权重的时候,要进行适当的调整,这样才能达到好的效果。1.8"–rect"所谓矩阵推理就是不再要求你训练的图片是正方形了;矩阵推理会加速模型的推理过程,减少一些冗余信息。下图分别是方形推理方式和矩阵推理方式1.9"–resume"断点续训:即是否在之前训练的一个模型基础上继续训练,default 值默认是 false;如果想采用断点续训的方式,这里我推荐一种写法,即首先将default=False 改为 default=True随后在终端中键入如下指令:python train.py --resume D:\Pycharm_Projects\yolov5-6.1-4_23\runs\train\exp19\weights\last.ptD:\Pycharm_Projects\yolov5-6.1-4_23\runs\train\exp19\weights\last.pt为你上一次中断时保存的pt文件路径输入指令后就可以看到模型是继续从上次结束时开始训练的1.10"–nosave"是否只保存最后一轮的pt文件;我们默认是保存best.pt和last.pt两个的1.11"–noval"只在最后一轮测试;正常情况下每个epoch都会计算mAP,但如果开启了这个参数,那么就只在最后一轮上进行测试,不建议开启1.12"–noautoanchor"是否禁用自动锚框;默认是开启的,自动锚点的好处是可以简化训练过程;yolov5中预先设定了一下锚定框,这些锚框是针对coco数据集的,其他目标检测也适用,可以在models/yolov5.文件中查看,例如如图所示,这些框针对的图片大小是640640。这是默认的anchor大小。需要注意的是在目标检测任务中,一般使用大特征图上去检测小目标,因为大特征图含有更多小目标信息,因此大特征图上的anchor数值通常设置为小数值,小特征图检测大目标,因此小特征图上anchor数值设置较大。在yolov5 中自动锚定框选项,训练开始前,会自动计算数据集标注信息针对默认锚定框的最佳召回率,当最佳召回率大于等于0.98时,则不需要更新锚定框;如果最佳召回率小于0.98,则需要重新计算符合此数据集的锚定框。在parse_opt设置了默认自动计算锚框选项,如果不想自动计算,可以设置这个,建议不要改动。1.13"–noplots"🍀开启这个参数后将不保存绘图文件1.14"–evolve"遗传超参数进化;yolov5使用遗传超参数进化,提供的默认参数是通过在COCO数据集上使用超参数进化得来的(也就是下图这些参数)。由于超参数进化会耗费大量的资源和时间,所以建议大家不要动这个参数。遗传算法是利用种群搜索技术将种群作为一组问题解,通过对当前种群施加类似生物遗传环境因素的选择、交叉、变异等一系列的遗传操作来产生新一代的种群,并逐步使种群优化到包含近似最优解的状态,遗传算法调优能够求出优化问题的全局最优解,优化结果与初始条件无关,算法独立于求解域,具有较强的鲁棒性,适合于求解复杂的优化问题,应用较为广泛。1.15"–bucket"谷歌云盘;通过这个参数可以下载谷歌云盘上的一些东西,但是现在没必要使用了1.16"–cache"是否提前缓存图片到内存,以加快训练速度,默认False;开启这个参数就会对图片进行缓存,从而更好的训练模型。1.17"–image-weights"是否启用加权图像策略,默认是不开启的;主要是为了解决样本不平衡问题;开启后会对于上一轮训练效果不好的图片,在下一轮中增加一些权重;1.18"–device"设备选择;这个参数就是指定硬件设备的,系统会自己判断的