LF-YOLO: A Lighter and Faster YOLO for Weld Defect Detection of X-ray Image

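Summary: The efficient feature extraction (EFE) module serves as the backbone unit. It extracts meaningful features with few parameters and low computation, learns representations efficiently, and greatly reduces the cost of feature extraction.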



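Motivation: The shapes and scales of different defect types vary greatly, which makes it challenging for a model to detect weld defects.

Proposed modules: RMF (multi-scale feature fusion) and EFE (reduced computation).

RMF is a novel multi-scale fusion module. It combines local and global cues of X-ray images by using parameter-based and parameter-free methods simultaneously.

EFE is an efficient feature extraction module used as the backbone unit. It extracts meaningful features with few parameters and low computation, which greatly reduces the cost of feature extraction.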


Abstract


X-ray images play an important role in quality assurance in the manufacturing industry, because they reflect the internal condition of the weld region. However, the shape and scale of different defect types vary greatly, which makes it challenging for a model to detect weld defects.


A reinforced multiscale feature (RMF) module is designed to implement both parameter-based and parameter-free multi-scale information extraction. RMF enables the extracted feature map to represent richer information, which is achieved by a superior hierarchical fusion structure.


To improve the performance of the detection network, we propose an efficient feature extraction (EFE) module. To further prove the ability of our method, we test it on the public dataset MS COCO, and the results show that our LF-YOLO has outstanding general-purpose detection performance.


I. INTRODUCTION


However, both manual and robotic welding inevitably produce weld defects, which are a potential hazard in daily production. People use X-ray technology to image the internal defects of a weld, as shown in Fig. 1, and detect them through experts or computer vision models.




The context of a weld image is complicated, with blurred boundaries and similar texture between defect and background. In addition, the scales and shapes of defects vary greatly among different classes, as can be seen in Fig. 2.



All of these factors bring great challenges to the detection model [3], which is required to capture abundant contextual information.


Local features help represent the boundary, shape, and geometric texture of a defect, while global features are vital for classification and for distinguishing foreground from background.


In this paper, we propose a reinforced multiscale feature (RMF) module, which combines parameter-based and parameter-free operations.


The RMF module first contains a basic parameter-free hierarchical structure, which generates multiple feature maps through maxpool operations of different sizes.

Furthermore, within each branch of the basic hierarchy, new features are produced by implicitly learning potential information, and this process is parameter-based.


Finally, the output data of each hierarchy are fused for finer estimation. Besides multi-scale feature utilization, the original feature extraction also determines the performance of the network.


To effectively extract features of weld defects, we carefully design an efficient feature extraction (EFE) module and build a superior backbone by stacking EFE repeatedly.


In summary, this work makes the following contributions.


A novel multi-scale fusion module named RMF is proposed. It combines local and global cues of X-ray images by using parameter-based and parameter-free methods simultaneously.


To learn representations efficiently, we design a novel EFE module as the unit of the backbone; it can extract meaningful features with few parameters and low computation.


The network can deal with multiple defect classes, and it is memory and computation friendly.


III. METHOD


Efficient feature extraction (EFE) module and reinforced multi-scale feature (RMF) module


A. EFE module


The feature extraction module is the basic block of a deep learning network.


[...] to better accomplish the corresponding tasks. In addition, feature extraction operations are the main source of parameters and computation. Therefore, the weight of the feature extraction module determines the weight of the whole network.


Inspired by the inverted residual block in MobileNetV2 [22], the EFE module maps the input data into a higher-dimensional space in the middle stage, because expanding the feature space helps obtain more meaningful representations.

MobileNetV2 [21] solves this problem by using depthwise separable convolutions. In this paper, we employ a wiser strategy.


Following the idea of [34], we design the middle expansion structure based on the “split-transform-merge” theory. After the first 1×1 Conv, the feature maps are split into two branches; the split ratio ra is set to 0.25 in this paper.


One of them is an identity branch, which applies no operation to the data. The other branch is a dense block from [35], which is used to further extract features.
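The split-transform-merge flow described above can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the function name `split_transform_merge`, the placeholder transform, and in particular the choice that the 0.25 share goes to the identity branch are assumptions made for illustration, since the excerpt does not say which branch receives which share.

```python
def split_transform_merge(channels, ratio, transform):
    """Split a list of channels, transform one part, and merge by concatenation.

    Assumption: `ratio` is the share of channels sent to the identity branch.
    """
    n_identity = int(len(channels) * ratio)
    identity_part = channels[:n_identity]    # identity branch: passed through untouched
    transform_part = channels[n_identity:]   # second branch: processed further
    return identity_part + transform(transform_part)


def placeholder_block(part):
    """Stand-in for the dense block; it only rescales each channel value."""
    return [10 * value for value in part]


if __name__ == "__main__":
    feature_channels = list(range(8))  # eight toy "channels"
    print(split_transform_merge(feature_channels, 0.25, placeholder_block))
```

A real dense block would also change the number of channels; the placeholder keeps the count fixed so that the split and merge steps stay easy to follow.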


To reduce complexity, the EFE module introduces Ghost Conv [24].
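To give a rough sense of why Ghost Conv is lighter, the sketch below counts weights for a standard convolution, a depthwise separable convolution, and a Ghost Conv. The counting follows the usual Ghost Conv formulation (a primary convolution produces a fraction of the output maps and cheap depthwise operations derive the rest); the channel sizes, the ratio `s`, and the cheap kernel size `d` are illustrative assumptions, not values taken from LF-YOLO.

```python
def conv_params(c_in, c_out, k):
    """Weights of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    return c_in * k * k + c_in * c_out


def ghost_conv_params(c_in, c_out, k, s=2, d=3):
    """Ghost Conv: a primary conv makes c_out // s intrinsic maps, then cheap
    d x d depthwise ops derive the remaining (s - 1) * (c_out // s) maps.
    Assumes c_out is divisible by s."""
    intrinsic = c_out // s
    primary = c_in * intrinsic * k * k
    cheap = (s - 1) * intrinsic * d * d
    return primary + cheap


if __name__ == "__main__":
    c_in, c_out, k = 64, 128, 3  # hypothetical layer sizes
    print("standard conv:      ", conv_params(c_in, c_out, k))
    print("depthwise separable:", depthwise_separable_params(c_in, c_out, k))
    print("ghost conv (s=2):   ", ghost_conv_params(c_in, c_out, k))
```

With `s = 2`, the Ghost Conv needs roughly half the weights of the standard convolution, because the cheap depthwise part is small compared with the primary convolution.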




At the tail of the EFE module, the second 1×1 Conv compresses the number of channels back to 2c/c. Finally, the input of the expansion operation and the output of the second 1×1 Conv are added element-wise through a residual branch.




Compared with the conventional residual block, our EFE module greatly decreases the cost of feature extraction.


B. RMF module


The scale problem is a classical research topic for CNNs, because they are not robust enough to the sizes of objects. Especially when object sizes vary greatly, a plain-topology model performs poorly.




Through a multi-scale strategy, we design an RMF module combining parameter-based and parameter-free methods.


The RMF module is a hierarchical structure for obtaining multi-scale contextual information. Its basic stage applies multiple maxpool operations of different sizes to the input feature map. No parameters are introduced in this stage, hence we regard it as parameter-free.


The parameter-free method makes the most of existing data but, in a sense, does not generate new information.
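The parameter-free stage can be pictured with a small pure-Python sketch: the same feature map is max-pooled with several window sizes, and no weights are involved. The stride of 1, the border handling, and the window sizes `(3, 5)` are assumptions chosen for illustration; the excerpt only states that maxpool operations of different sizes are used.

```python
def maxpool2d_same(grid, k):
    """Stride-1 max pooling with an odd window k; borders use only valid cells."""
    h, w = len(grid), len(grid[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                grid[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def multiscale_maxpool(grid, sizes=(3, 5)):
    """Return the input plus one pooled map per window size (no learned weights)."""
    return [grid] + [maxpool2d_same(grid, k) for k in sizes]


if __name__ == "__main__":
    # A 5 x 5 map with a single activation in the centre.
    feature_map = [[0] * 5 for _ in range(5)]
    feature_map[2][2] = 1
    for pooled in multiscale_maxpool(feature_map):
        print(pooled)
```

Larger windows spread the single activation over a wider area, which is how the stage exposes coarser context without producing any genuinely new information.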


Dilated convolution can enhance the ability to extract underlying information by changing the receptive field [5].


If we used dilated convolution directly at the tail of the backbone, it would be expensive in storage and computation.


To address this problem, GDConv implements dilation in a lighter form. Specifically, we retain the structure of the original Ghost Conv but use a dilated version of its depthwise Conv; the inner detail is shown in Fig. 5.




GDConv is the core ingredient with which the RMF module learns implicit information through the parameters of convolution kernels. Three GDConvs form the elements of a hierarchy group, and their dilation rates are set to 1, 5, and 9 respectively.

Note that when the dilation rate is 1, GDConv is equivalent to a normal Ghost Conv. The new features from the different dilation branches are concatenated.
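The effect of the three dilation rates on the receptive field can be checked with a short calculation. A k × k kernel with dilation d covers k + (k − 1)(d − 1) positions along each axis. The 3 × 3 kernel size below is an assumption for illustration, as the excerpt does not state the kernel size used inside GDConv.

```python
def effective_kernel_size(k, dilation):
    """Span covered along one axis by a k-tap kernel with the given dilation."""
    return k + (k - 1) * (dilation - 1)


def same_padding(k, dilation):
    """Padding that keeps the spatial size unchanged (stride 1, odd k)."""
    return dilation * (k - 1) // 2


if __name__ == "__main__":
    kernel = 3  # assumed kernel size
    for rate in (1, 5, 9):  # dilation rates given in the text
        print(rate, effective_kernel_size(kernel, rate), same_padding(kernel, rate))
```

Under this assumption, the three branches see spans of 3, 11, and 19 positions while using the same number of weights, which is what lets the concatenated output mix several receptive fields cheaply.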


The parameter-free method provides a multi-scale base by optimizing existing feature maps, and the parameter-based method exploits new multi-scale data based on the former. Hence, the base and the expansion pyramid of the hierarchy have a superposition effect and enhance the ability to develop effective representations.


C. The architecture of LF-YOLO


[Figure: architecture of LF-YOLO]


V. CONCLUSION


In this paper, we propose a highly effective EFE module as the basic feature extraction block; it can encode sufficient information of X-ray weld images at low cost.


The parameter-free stage contributes a basis containing existing multi-scale information, and the parameter-based stage further learns implicit features among different receptive fields.

