Introduction to the GCN algorithm (paper overview)
In this paper, the authors emphasize the importance of large kernels.
Abstract
One of recent trends [30, 31, 14] in network architecture design is stacking small filters (e.g., 1x1 or 3x3) in the entire network because the stacked small filters is more efficient than a large kernel, given the same computational complexity. However, in the field of semantic segmentation, where we need to perform dense per-pixel prediction, we find that the large kernel (and effective receptive field) plays an important role when we have to perform the classification and localization tasks simultaneously. Following our design principle, we propose a Global Convolutional Network to address both the classification and localization issues for the semantic segmentation. We also suggest a residual-based boundary refinement to further refine the object boundaries. Our approach achieves state-of-art performance on two public benchmarks and significantly outperforms previous results, 82.2% (vs 80.2%) on PASCAL VOC 2012 dataset and 76.9% (vs 71.8%) on Cityscapes dataset.
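In short: a recent trend in network architecture design is to stack small filters (e.g., 1x1 or 3x3) throughout the network, because at equal computational cost a stack of small filters is more efficient than one large kernel. For semantic segmentation, which requires dense per-pixel prediction, the authors find that a large kernel (and with it a large effective receptive field) plays an important role when classification and localization must be performed simultaneously. Following this design principle, they propose a Global Convolutional Network (GCN) that addresses both tasks, and a residual-based boundary refinement block that further refines object boundaries. The method reaches state-of-the-art performance on two public benchmarks: 82.2% (vs. 80.2%) on PASCAL VOC 2012 and 76.9% (vs. 71.8%) on Cityscapes.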
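The efficiency argument for small filters can be checked with simple arithmetic. The sketch below compares a stack of 3x3 convolutions with a single large kernel that covers the same theoretical receptive field. The channel width, the target kernel size, and the helper names are illustrative assumptions, not values taken from the paper, and the theoretical receptive field computed here is only an upper bound on the effective receptive field the abstract refers to.

```python
def stacked_receptive_field(num_layers, kernel_size=3):
    """Theoretical receptive field of stride-1 k x k convolutions applied in sequence."""
    return 1 + num_layers * (kernel_size - 1)


def conv_weights(kernel_h, kernel_w, channels):
    """Weight count of one conv layer with `channels` inputs and outputs (bias ignored)."""
    return kernel_h * kernel_w * channels * channels


channels = 64  # illustrative width (assumption)
target = 15    # illustrative large-kernel size (assumption)

# Each extra 3x3 layer widens the receptive field by 2 pixels.
layers_needed = (target - 1) // 2
stacked = layers_needed * conv_weights(3, 3, channels)
single = conv_weights(target, target, channels)

print("3x3 layers needed:", layers_needed)
print("receptive field of the stack:", stacked_receptive_field(layers_needed))
print("weights, stacked 3x3:", stacked)
print("weights, single large kernel:", single)
```

Under these assumptions the stack needs far fewer weights than the single dense kernel for the same theoretical coverage, which is the trend the abstract describes. The paper's claim is that for segmentation this accounting is incomplete, because what matters is how much of the feature map each output position is effectively connected to.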
Conclusion
According to our analysis on classification and segmentation, we find that large kernels is crucial to relieve the contradiction between classification and localization. Following the principle of large-size kernels, we propose the Global Convolutional Network. The ablation experiments show that our proposed structures meet a good trade-off between valid receptive field and the number of parameters, while achieves good performance. To further refine the object boundaries, we present a novel Boundary Refinement block. Qualitatively, our Global Convolutional Network mainly improve the internal regions while Boundary Refinement increase performance near boundaries. Our best model achieves state-of-the-art on two public benchmarks: PASCAL VOC 2012 (82.2%) and Cityscapes (76.9%).
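In short: the analysis of classification and segmentation shows that large kernels are crucial for relieving the conflict between classification and localization. Following this large-kernel principle, the authors propose the Global Convolutional Network. Ablation experiments show that the proposed structure strikes a good balance between valid receptive field and number of parameters while still performing well. To further refine object boundaries, they introduce a new Boundary Refinement block. Qualitatively, the Global Convolutional Network mainly improves the interior regions of objects, while Boundary Refinement improves accuracy near boundaries. The best model achieves state-of-the-art results on two public benchmarks: PASCAL VOC 2012 (82.2%) and Cityscapes (76.9%).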
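The trade-off between receptive field and parameter count mentioned in the conclusion can be made concrete with a weight count. This sketch assumes the block structure described in the paper: two parallel branches, one applying a 1 x k convolution followed by a k x 1 convolution and the other applying them in the opposite order, with the two outputs summed. The function names and the channel counts (2048 input channels, 21 output classes) are illustrative assumptions.

```python
def dense_kxk_weights(k, c_in, c_out):
    """Weights of a single dense k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def gcn_block_weights(k, c_in, c_out):
    """Weights of two separable branches: (1 x k then k x 1) and (k x 1 then 1 x k).

    In each branch the first conv maps c_in -> c_out and the second c_out -> c_out.
    """
    branch = k * c_in * c_out + k * c_out * c_out
    return 2 * branch


c_in, c_out = 2048, 21  # illustrative: backbone feature width and class count (assumptions)

for k in (3, 7, 11, 15):
    dense = dense_kxk_weights(k, c_in, c_out)
    gcn = gcn_block_weights(k, c_in, c_out)
    print(f"k={k:2d}  dense={dense:>10,}  separable={gcn:>10,}  ratio={dense / gcn:.2f}")
```

The separable form grows linearly in k while the dense kernel grows quadratically, so the gap widens as the kernel gets larger. Both forms cover the same k x k window.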
Paper
Chao Peng, Xiangyu Zhang, Gang Yu, Guiming Luo, Jian Sun.
Large Kernel Matters -- Improve Semantic Segmentation by Global Convolutional Network. CVPR 2017.
https://arxiv.org/abs/1703.02719
0. Experimental results
1. PASCAL VOC 2012 validation set
2. PASCAL VOC 2012 and ImageNet
Standard benchmarks: PASCAL VOC 2012 and Cityscapes.
ResNet152 (pretrained on ImageNet) is used as the base model for fine-tuning.
3. Examples of semantic segmentation results on PASCAL VOC 2012
4. Examples of semantic segmentation results on Cityscapes
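The percentages quoted for both benchmarks are mean intersection-over-union (mIoU) scores, the standard segmentation metric for PASCAL VOC 2012 and Cityscapes. As a minimal sketch of how that metric works, the function below computes mIoU over flat lists of class labels. The function name and the toy labels are hypothetical, and real benchmark evaluation additionally accumulates counts over the whole dataset and skips pixels marked as ignored.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that appear in either pred or target.

    pred and target are equal-length flat sequences of integer class labels.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes (made-up labels).
target = [0, 0, 1, 1, 1, 2]
pred = [0, 1, 1, 1, 2, 2]

score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.3f}")
```

Because each class contributes equally to the mean, errors on small or rare classes weigh as much as errors on large ones.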
GCN architecture in detail
To be updated...
GCN example applications
To be updated...