An Introduction to the FCN Algorithm (Paper Overview)
In Faster R-CNN, an RPN (Region Proposal Network) replaced Selective Search and similar region-proposal methods; the RPN is itself a fully convolutional network. FCN stands for Fully Convolutional Networks. The FCN paper applied CNN architectures to semantic image segmentation and achieved outstanding results; as a pioneering work in this area, it received the CVPR 2015 Best Paper Honorable Mention.
Abstract
Convolutional networks are powerful visual models that yield hierarchies of features. We show that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, improve on the previous best result in semantic segmentation. Our key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning. We define and detail the space of fully convolutional networks, explain their application to spatially dense prediction tasks, and draw connections to prior models. We adapt contemporary classification networks (AlexNet, the VGG net, and GoogLeNet) into fully convolutional networks and transfer their learned representations by fine-tuning to the segmentation task. We then define a skip architecture that combines semantic information from a deep, coarse layer with appearance information from a shallow, fine layer to produce accurate and detailed segmentations. Our fully convolutional network achieves improved segmentation of PASCAL VOC (30% relative improvement to 67.2% mean IU on 2012), NYUDv2, SIFT Flow, and PASCAL-Context, while inference takes one tenth of a second for a typical image.
Conclusion
Fully convolutional networks are a rich class of models that address many pixelwise tasks. FCNs for semantic segmentation dramatically improve accuracy by transferring pretrained classifier weights, fusing different layer representations, and learning end-to-end on whole images. End-to-end, pixel-to-pixel operation simultaneously simplifies and speeds up learning and inference. All code for this paper is open source in Caffe, and all models are freely available in the Caffe Model Zoo. Further works have demonstrated the generality of fully convolutional networks for a variety of image-to-image tasks.
Paper
Jonathan Long, Evan Shelhamer, Trevor Darrell. Fully Convolutional Networks for Semantic Segmentation. CVPR 2015.
https://arxiv.org/abs/1605.06211
0. Experimental Results
1. FCN Performance
For the evaluation metrics used in image segmentation, refer to the companion post: CV之IS: a detailed guide to image segmentation (Image Segmentation) algorithms in computer vision, covering an introduction, usage, and example applications.
The backbone CNN of an FCN can be any of the classic classification architectures, such as AlexNet, VGG16, or GoogLeNet.
In the paper's comparison of these backbones, the VGG16-based FCN achieves the highest mean IU, but its forward time is longer and the complexity of its conv. layers is higher.
Comparing inference (test) time with R-CNN and SDS, FCN-8s also achieves a higher mean IU than the other two networks.
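Since mean IU is the headline metric in these comparisons, the following is a minimal NumPy sketch of how mean IU is typically computed from a confusion matrix; it is a generic illustration rather than the paper's evaluation code, and the toy labels at the end are made up.

```python
import numpy as np

def confusion_matrix(gt, pred, num_classes):
    """Accumulate a num_classes x num_classes confusion matrix from
    flattened ground-truth and predicted pixel labels."""
    mask = (gt >= 0) & (gt < num_classes)      # ignore void / out-of-range labels
    return np.bincount(
        num_classes * gt[mask].astype(int) + pred[mask],
        minlength=num_classes ** 2,
    ).reshape(num_classes, num_classes)

def mean_iu(conf):
    """mean IU = (1/n_cl) * sum_i n_ii / (t_i + sum_j n_ji - n_ii)."""
    tp = np.diag(conf)                                   # n_ii: correct pixels per class
    denom = conf.sum(axis=1) + conf.sum(axis=0) - tp     # t_i + sum_j n_ji - n_ii
    with np.errstate(divide="ignore", invalid="ignore"):
        iu = tp / denom                                  # per-class IU (nan if class absent)
    return np.nanmean(iu)

# toy example with two classes
gt   = np.array([0, 0, 1, 1, 1, 0])
pred = np.array([0, 1, 1, 1, 0, 0])
print(mean_iu(confusion_matrix(gt, pred, num_classes=2)))   # 0.5
```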
2. Improvement from Skip Connections: Comparing With and Without Skip Layers
The first image shows an FCN without skip connections (no skips, stride 32, i.e. FCN-32s); its segmentation is fairly coarse. The second image uses one skip connection (stride 16, FCN-16s) and shows some improvement. The third image uses two skip connections (stride 8, FCN-8s) and produces an even better, more detailed result.
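To make the skip idea concrete, below is a simplified PyTorch sketch, not the authors' released Caffe model, of an FCN-8s-style head: 1×1 convolutions score the pool3 and pool4 feature maps, and learned transposed convolutions upsample and fuse them with the coarse final prediction. The layer names are illustrative, the channel counts assume a VGG16 backbone, and the input size is assumed to be a multiple of 32 so the maps align (the original implementation crops instead).

```python
import torch.nn as nn

class FCNSkipHead(nn.Module):
    """FCN-8s-style fusion of coarse and fine score maps (illustrative names)."""
    def __init__(self, num_classes=21):
        super().__init__()
        # 1x1 convs turn intermediate feature maps into per-class score maps
        self.score_pool4 = nn.Conv2d(512, num_classes, kernel_size=1)  # VGG16 pool4: 512 ch
        self.score_pool3 = nn.Conv2d(256, num_classes, kernel_size=1)  # VGG16 pool3: 256 ch
        # learned upsampling via transposed convolutions
        self.up2_a = nn.ConvTranspose2d(num_classes, num_classes, 4, stride=2, padding=1)
        self.up2_b = nn.ConvTranspose2d(num_classes, num_classes, 4, stride=2, padding=1)
        self.up8   = nn.ConvTranspose2d(num_classes, num_classes, 16, stride=8, padding=4)

    def forward(self, pool3, pool4, score_final):
        # FCN-32s would simply upsample score_final 32x: no skips, coarse output
        x = self.up2_a(score_final)          # 2x upsample the coarsest prediction
        x = x + self.score_pool4(pool4)      # first skip (pool4)  -> FCN-16s
        x = self.up2_b(x)                    # 2x upsample the fused map
        x = x + self.score_pool3(pool3)      # second skip (pool3) -> FCN-8s
        return self.up8(x)                   # 8x upsample to per-pixel class scores
```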
1. Characteristics, Limitations, and Drawbacks of Fully Convolutional Networks
1. Characteristics of FCN
Replace the fully connected layers with 1×1 convolutions, turning the CNN into an FCN (fully convolutional network) that accepts inputs of arbitrary size (see the sketch after this list).
Use skip (cross-layer) connections to bring in lower-level features that supplement the information recovered during upsampling.
……
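As an illustration of the first point above, here is a minimal PyTorch sketch (assuming torchvision >= 0.13; this is not the paper's exact Caffe recipe) of "convolutionalizing" VGG16: fc6 and fc7 are recast as 7×7 and 1×1 convolutions with their pretrained weights reshaped, and a final 1×1 convolution produces one score map per class, so the network accepts inputs of any size and outputs a spatial prediction.

```python
import torch.nn as nn
from torchvision import models

def convolutionalize_vgg16(num_classes=21):
    """Recast VGG16's fully connected classifier as convolutions (a sketch)."""
    vgg = models.vgg16(weights="IMAGENET1K_V1")      # pretrained classifier to transfer
    features = vgg.features                          # conv/pool layers, kept as-is

    # fc6: Linear(512*7*7 -> 4096) becomes a 7x7 convolution over the 512-channel map
    fc6 = nn.Conv2d(512, 4096, kernel_size=7)
    fc6.weight.data.copy_(vgg.classifier[0].weight.view(4096, 512, 7, 7))
    fc6.bias.data.copy_(vgg.classifier[0].bias)

    # fc7: Linear(4096 -> 4096) becomes a 1x1 convolution
    fc7 = nn.Conv2d(4096, 4096, kernel_size=1)
    fc7.weight.data.copy_(vgg.classifier[3].weight.view(4096, 4096, 1, 1))
    fc7.bias.data.copy_(vgg.classifier[3].bias)

    # scoring layer: a 1x1 convolution producing one score map per class
    score = nn.Conv2d(4096, num_classes, kernel_size=1)

    return nn.Sequential(features, fc6, nn.ReLU(inplace=True), nn.Dropout2d(),
                         fc7, nn.ReLU(inplace=True), nn.Dropout2d(), score)
```

On a 224×224 input this produces a 1×1 score map per class, just as the original classifier would; on larger inputs it produces a correspondingly larger score map, which the FCN then upsamples back to pixel resolution.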
2. Limitations of FCN
……
Detailed Explanation of the FCN Architecture
DL之FCN: Detailed explanation of the FCN architecture: https://yunyaniu.blog.csdn.net/article/details/100060860