An Introduction to the Faster R-CNN Algorithm (Paper Overview)
As the name suggests, Faster R-CNN is a large improvement over R-CNN, above all in speed.
Abstract
State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet [1] and Fast R-CNN [2] have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. We further merge RPN and Fast R-CNN into a single network by sharing their convolutional features—using the recently popular terminology of neural networks with “attention” mechanisms, the RPN component tells the unified network where to look. For the very deep VGG-16 model [3], our detection system has a frame rate of 5fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015 competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning entries in several tracks. Code has been made publicly available.
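The abstract notes that detection uses only 300 proposals per image. One plausible way to get from the RPN's dense, scored boxes to such a short list is greedy non-maximum suppression followed by a top-N cut. The following is a minimal pure-Python sketch of that idea, not the authors' implementation: the function names `iou` and `select_proposals` are hypothetical, the boxes and scores are made-up toy values, and the 0.7 IoU threshold follows the paper's description of proposal filtering rather than anything stated in the abstract.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_proposals(boxes, scores, top_n=300, nms_thresh=0.7):
    """Greedy NMS over score-sorted boxes; return indices of at most top_n survivors.

    top_n and nms_thresh are assumed defaults for illustration.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        # Keep a box only if it does not overlap too much with any higher-scoring kept box.
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in kept):
            kept.append(i)
            if len(kept) == top_n:
                break
    return kept


# Toy input: the first two boxes overlap heavily, the third is separate.
boxes = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 50, 260, 170)]
scores = [0.95, 0.90, 0.80]
kept = select_proposals(boxes, scores)
```

In this toy case the second box is suppressed by the first, so only the first and third boxes would be passed on to the Fast R-CNN detector.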
CONCLUSION
We have presented RPNs for efficient and accurate region proposal generation. By sharing convolutional features with the down-stream detection network, the region proposal step is nearly cost-free. Our method enables a unified, deep-learning-based object detection system to run at near real-time frame rates. The learned RPN also improves region proposal quality and thus the overall object detection accuracy.
Paper
Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun.
Faster R-CNN: Towards real-time object detection with region proposal networks. NIPS, 2015
https://arxiv.org/abs/1506.01497v3
1. Experimental Results
1. PASCAL VOC 2007
Example detections using RPN proposals on the PASCAL VOC 2007 test set. The proposed method detects objects across a wide range of scales and aspect ratios.
Detection results on PASCAL VOC 2007 test set
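SS denotes a network that takes its proposals from Selective Search instead of an RPN; unshared denotes a network in which the RPN and the detector do not share convolutional features.
RPN+VGG+shared gives the best results.
2. PASCAL VOC 2012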
Detection results on PASCAL VOC 2012 test set
RPN+VGG+shared again gives the best results.
Test-time speed: VGG + SS + Fast R-CNN runs at about 0.5 frames per second, i.e. roughly 2 seconds per image.
VGG + RPN + Fast R-CNN takes roughly 0.2 seconds per image.
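The timing figures above are two views of the same quantity, since seconds per image is the reciprocal of frames per second. A small sketch of that arithmetic follows. The helper name `seconds_per_image` is hypothetical, and the inputs are the rounded figures quoted in this post (0.5 fps for Selective Search, 5 fps for RPN as stated in the abstract), not new measurements.

```python
def seconds_per_image(fps):
    """Convert a frame rate (images per second) into latency per image in seconds."""
    return 1.0 / fps


ss_latency = seconds_per_image(0.5)   # VGG + SS + Fast R-CNN
rpn_latency = seconds_per_image(5.0)  # VGG + RPN + Fast R-CNN
speedup = ss_latency / rpn_latency    # how many times faster the RPN pipeline is

print(f"SS: {ss_latency:.1f} s/image, RPN: {rpn_latency:.1f} s/image, "
      f"speedup: {speedup:.0f}x")
```

With these rounded inputs, replacing Selective Search with an RPN makes the VGG pipeline about ten times faster per image.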
The ZF network is faster still, at 17 frames (images) per second,