CV: Translation and Commentary on the 2019 Paper "A Survey of the Recent Architectures of Deep Convolutional Neural Networks", Chapters 5 to 8 (Part 1)


5 Applications of CNN


CNNs have been successfully applied to a range of ML tasks, including object detection, recognition, classification, regression, and segmentation [169]–[171]. However, a CNN generally needs a large amount of data for learning. The areas in which CNNs have shown tremendous success all have relatively abundant labeled data, such as traffic sign recognition, segmentation of medical images, and the detection of faces, text, pedestrians, and humans in natural images. Some interesting applications of CNNs are discussed below.


5.1 Natural Language Processing


Natural Language Processing (NLP) converts language into a representation that a computer can readily exploit. CNNs have been used in NLP applications such as speech recognition, language modeling, and analysis. In particular, language modeling and sentence modeling changed direction after CNNs were introduced as a new representation-learning algorithm. Sentence modeling aims to capture the semantics of a sentence and thereby enable new applications tailored to customer requirements. Traditional information retrieval methods analyze data based on words or features but ignore the core meaning of the sentence. In [172], the authors use a dynamic CNN with dynamic k-max pooling during training. This approach finds relations between words without relying on any external source such as a parser or vocabulary. Similarly, Collobert et al. [173] proposed a CNN-based architecture that can perform several NLP tasks at the same time, such as chunking, language modeling, named-entity recognition, and semantic role labeling. In another work, Hu et al. proposed a generic CNN-based architecture that matches two sentences and can therefore be applied to different languages [174].
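The k-max pooling step mentioned above can be illustrated with a small sketch. It is not the implementation from [172]: the function names are hypothetical, and the rule used to choose k per layer (the larger of a fixed top-level k and a fraction of the sentence length that shrinks with depth) is an assumption about how the dynamic variant is usually described. The key property shown is that the k largest activations are kept in their original order, so relative word positions survive pooling.

```python
import math


def dynamic_k(layer, total_layers, sentence_len, k_top):
    """Assumed rule for the pooling width at conv layer `layer` (1-based)."""
    scaled = math.ceil((total_layers - layer) / total_layers * sentence_len)
    return max(k_top, scaled)


def k_max_pooling(sequence, k):
    """Keep the k largest values of `sequence`, preserving their original order."""
    k = min(k, len(sequence))
    # Indices of the k largest activations (ties keep the earlier position).
    top = sorted(range(len(sequence)), key=lambda i: sequence[i], reverse=True)[:k]
    # Re-sort the chosen indices so the output follows the input order.
    return [sequence[i] for i in sorted(top)]


# One row of a feature map computed over a six-word sentence (made-up values).
feature_row = [0.1, 0.9, 0.3, 0.7, 0.2, 0.8]
k = dynamic_k(layer=1, total_layers=2, sentence_len=len(feature_row), k_top=2)
pooled = k_max_pooling(feature_row, k)
print(k, pooled)
```

Because k depends on the sentence length, longer sentences keep more activations in the lower layers, while the last layer always reduces to the fixed `k_top` values.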


5.2 Computer Vision related Applications


Computer vision (CV) focuses on developing artificial systems that can process visual data, including images and videos, and can effectively understand and extract useful information from it. CV includes a number of areas such as face recognition, pose estimation, and activity recognition.

Face recognition is one of the difficult tasks in CV. Recent research on face recognition tries to cope with the large variations that appear between images of the same face even though the face itself has not changed. This variation is caused by illumination, changes in pose, and different facial expressions. Farfade et al. [175] proposed a deep CNN that detects faces in different poses and is also able to recognize occluded faces. In another work, Zhang et al. [176] performed face detection using a new type of multitask cascaded CNN. Zhang’s technique showed good results when compared against state-of-the-art techniques [177]–[179].

Human pose estimation is one of the challenging tasks in CV because of the high variability in body pose. Li et al. [180] proposed a heterogeneous deep CNN-based pose estimation technique. Empirical results for Li’s technique show that the hidden neurons are able to learn localized parts of the body. Similarly, Bulat et al. [181] proposed a cascaded CNN technique. In their cascaded architecture, heat maps are detected in the first phase, and regression is performed on the detected heat maps in the second phase.

Action recognition is one of the important areas of activity recognition. The difficulty in developing an action recognition system lies in handling the translations and distortions of features across different patterns that belong to the same action class. Earlier approaches involved the construction of motion history images, the use of Hidden Markov Models, action sketch generation, etc. Recently, Wang et al. [182] proposed a three-dimensional CNN architecture combined with LSTM for recognizing different actions from video frames. Experimental results show that Wang’s technique outperforms the latest activity recognition techniques [183]–[187]. Similarly, Ji et al. [188] proposed another three-dimensional CNN-based action recognition system. In Ji’s work, a three-dimensional CNN extracts features from multiple channels of input frames. The final action recognition model is built on the combined extracted feature space. The proposed three-dimensional CNN model is trained in a supervised way and is able to perform activity recognition in real-world applications.
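To make the idea of a three-dimensional convolution over video frames concrete, here is a minimal pure-Python sketch. It is not the architecture of [182] or [188]: the function name, the toy clip, and the hand-written temporal-difference kernel are all assumptions chosen for illustration. It only shows that a kernel spanning several frames can respond to change over time, which a purely two-dimensional kernel cannot do.

```python
def conv3d_valid(video, kernel):
    """Naive 'valid' 3D cross-correlation of a single-channel clip with one kernel.

    video:  nested list indexed [frame][row][col]
    kernel: nested list indexed [dt][dy][dx]
    """
    T, H, W = len(video), len(video[0]), len(video[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum over the temporal and both spatial extents of the kernel.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += video[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Toy clip: 4 frames of 3x3 pixels; the centre pixel switches on at frame 2.
clip = [[[0.0] * 3 for _ in range(3)] for _ in range(4)]
for t in (2, 3):
    clip[t][1][1] = 1.0

# Temporal-difference kernel: (next frame) minus (current frame), 1x1 in space.
kernel = [[[-1.0]], [[1.0]]]
response = conv3d_valid(clip, kernel)
print([plane[1][1] for plane in response])
```

The response at the centre pixel is non-zero only for the frame pair in which the pixel changes, so the kernel acts as a simple motion detector. In a trained three-dimensional CNN the kernel weights are learned rather than written by hand.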


5.3 Object Detection


Object detection focuses on identifying different objects in images. Recently, region-based CNN (R-CNN) has been widely used for object detection. Ren et al. (2015) proposed an improvement over R-CNN named Faster R-CNN for object detection [189]. In their work, a fully convolutional neural network is used to extract a feature space and can simultaneously predict the boundary and score of objects located at different positions. Similarly, Dai et al. (2016) proposed region-based object detection using a fully convolutional network [190]. In Dai’s work, results are reported on the PASCAL VOC image dataset. Another object detection technique is reported by Gidaris et al. [191]; it is based on a multi-region deep CNN that helps to learn semantic-aware features. In Gidaris’s approach, objects are detected with high accuracy on the PASCAL VOC 2007 and 2012 datasets.


5.4 Image Classification


CNNs have been widely used for image classification [192]–[194]. One major application of CNNs is in medical imaging, especially the diagnosis of cancer from histopathological images [195]. Recently, Spanhol et al. (2016) used a CNN for the diagnosis of breast cancer images and compared the results against a network trained on a dataset of handcrafted descriptors [196], [197]. Another recent CNN-based technique for breast cancer diagnosis was developed by Wahab et al. [198]. Wahab’s work involves two phases. In the first phase, hard non-mitosis examples are identified. In the second phase, data augmentation is performed to cope with the class skewness problem. Similarly, Ciresan et al. [96] used the German benchmark dataset for traffic signs. They designed a CNN-based architecture that performed traffic sign classification with a good recognition rate.
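Using data augmentation to reduce class skew, as in the second phase of the breast cancer work described above, can be sketched as follows. The text does not say which transformations [198] uses, so the horizontal flip and 90-degree rotation here are assumptions, as are the function names and the toy 2x2 "images". The sketch simply adds transformed copies of minority-class images until every class has as many examples as the largest one.

```python
import random


def flip_horizontal(image):
    """Mirror an image (list of rows) left to right."""
    return [row[::-1] for row in image]


def rotate_90(image):
    """Rotate an image 90 degrees clockwise: reverse the rows, then transpose."""
    return [list(col) for col in zip(*image[::-1])]


def balance_by_augmentation(images, labels, seed=0):
    """Append augmented copies of under-represented classes until all are equal in size."""
    rng = random.Random(seed)
    transforms = [flip_horizontal, rotate_90]
    by_class = {}
    for img, lab in zip(images, labels):
        by_class.setdefault(lab, []).append(img)
    target = max(len(members) for members in by_class.values())
    out_images, out_labels = list(images), list(labels)
    for lab, members in by_class.items():
        for _ in range(target - len(members)):
            source = rng.choice(members)        # pick an original image of this class
            transform = rng.choice(transforms)  # pick a label-preserving transform
            out_images.append(transform(source))
            out_labels.append(lab)
    return out_images, out_labels


# Toy data set: four non-mitosis patches and a single mitosis patch.
images = [[[0, 1], [2, 3]]] * 4 + [[[9, 8], [7, 6]]]
labels = ["non-mitosis"] * 4 + ["mitosis"]
aug_images, aug_labels = balance_by_augmentation(images, labels)
print(aug_labels.count("non-mitosis"), aug_labels.count("mitosis"))
```

Flips and rotations are a common choice for histopathology patches because tissue has no canonical orientation. For data such as traffic signs, where orientation carries meaning, the set of transformations would have to be chosen more carefully.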

 

