Applied Deep Learning Resources

A collection of research articles, blog posts, slides and code snippets about deep learning in applied settings, including trained models and simple methods that can be used out of the box. The main focus is on Convolutional Neural Networks (CNN), but Recurrent Neural Networks (RNN), deep Q-Networks (DQN) and other interesting architectures are also listed.

CNN

A recent overview of CNNs can be found in the paper "Deep learning for visual understanding: A review" [link, PDF]

Another decent overview in Nature by LeCun, Bengio and Hinton: "Deep learning" [link, PDF]

ImageNet

ImageNet is the most important image classification and localization competition. Results on other data sets are collected here: "Discover the current state of the art in objects classification." [link].

The prediction error in the ImageNet competition has been decreasing rapidly over the last 5 years (figure: imagenet-error).

Main network architectures on ImageNet

AlexNet

Original paper: "ImageNet Classification with Deep Convolutional Neural Networks" [PDF]

Properties: 8 weight layers (5 convolutional and 3 fully connected), 60 million parameters, Rectified Linear Units (ReLUs), Local Response Normalization, Dropout
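The layer and parameter counts above can be checked with a little arithmetic. The sketch below is only a sanity check: the layer shapes are assumptions taken from the usual description of the two-GPU AlexNet (where conv2, conv4 and conv5 each see half of the input channels), not something listed in this document.

```python
# Assumed AlexNet layer shapes (two-GPU variant).
# Conv entries: (name, kernel_h, kernel_w, in_channels_per_filter, out_channels).
CONV_LAYERS = [
    ("conv1", 11, 11, 3, 96),
    ("conv2", 5, 5, 48, 256),
    ("conv3", 3, 3, 256, 384),
    ("conv4", 3, 3, 192, 384),
    ("conv5", 3, 3, 192, 256),
]
# Fully connected entries: (name, inputs, outputs).
FC_LAYERS = [
    ("fc6", 6 * 6 * 256, 4096),
    ("fc7", 4096, 4096),
    ("fc8", 4096, 1000),
]


def count_parameters():
    """Return a dict of weights + biases per layer."""
    counts = {}
    for name, kh, kw, cin, cout in CONV_LAYERS:
        counts[name] = kh * kw * cin * cout + cout
    for name, nin, nout in FC_LAYERS:
        counts[name] = nin * nout + nout
    return counts


counts = count_parameters()
total = sum(counts.values())
for name, n in counts.items():
    print(f"{name}: {n:,}")
print(f"total: {total:,}")
```

Under these assumptions the total lands at roughly 61 million, and more than half of it sits in the first fully connected layer.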

VGG

Original paper: "Very Deep Convolutional Networks for Large-Scale Image Recognition" [arxiv]

Properties: 19 weight layers, 144m parameters, 3x3 convolution filters, L2 regularised, Dropout, No Local Response Normalization
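Why 3x3 filters everywhere? A stack of small stride-1 convolutions covers the same receptive field as one large filter while using fewer weights. The sketch below works this out; the channel count of 256 is a hypothetical value picked for illustration, and the helper names are my own.

```python
def stacked_receptive_field(kernel_size, num_layers):
    """Receptive field of num_layers stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel_size - 1)


def conv_weights(kernel_size, channels, num_layers=1):
    """Weights (biases ignored) of num_layers convs with `channels` in and out."""
    return num_layers * kernel_size * kernel_size * channels * channels


channels = 256  # hypothetical channel count, for illustration only

rf_stack = stacked_receptive_field(3, 3)   # three 3x3 layers
rf_single = stacked_receptive_field(7, 1)  # one 7x7 layer
stack = conv_weights(3, channels, num_layers=3)
single = conv_weights(7, channels, num_layers=1)

print(f"receptive field: stack={rf_stack}, single={rf_single}")
print(f"weights: stack={stack:,}, single={single:,}")
```

Three 3x3 layers need 27·C² weights against 49·C² for a single 7x7 layer, and they get two extra non-linearities in between.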

GoogLeNet

Original paper: "Going deeper with convolutions" [arxiv]

The latest upgrade to the model achieves even better scores, and trained models are available with a port to Torch: "Rethinking the Inception Architecture for Computer Vision" [arxiv], "Torch port of Inception V3" [github]

Properties: 22 layers, 7m parameters, Inception modules, 1x1 conv layers, ReLUs, Dropout, Mid-level outputs

Inception modules (figure: googlenet)
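The point of the 1x1 conv layers listed in the properties is to shrink the number of channels before an expensive larger convolution. A rough weight count illustrates this; the channel sizes (192 in, reduced to 16, then 32 out) are illustrative assumptions and the helper name is my own.

```python
def conv_weights(kernel_size, in_channels, out_channels):
    """Number of weights in one convolution layer (biases ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


# Illustrative channel sizes, assumed for this sketch.
in_channels, reduced, out_channels = 192, 16, 32

# 5x5 convolution applied directly to the input.
direct = conv_weights(5, in_channels, out_channels)

# 1x1 reduction first, then the 5x5 convolution on the reduced input.
bottleneck = (conv_weights(1, in_channels, reduced)
              + conv_weights(5, reduced, out_channels))

print(f"direct 5x5:     {direct:,} weights")
print(f"1x1 then 5x5:   {bottleneck:,} weights")
print(f"reduction:      {direct / bottleneck:.1f}x")
```

The same ratio applies to the multiply-adds per spatial position, which is how the network stays at around 7m parameters despite its depth.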

ResNet

Original paper: "Deep Residual Learning for Image Recognition" [arxiv]

Very nice slides: "Deep Residual Learning" [PDF]

Github: [github]

Properties: 152 layers, ReLUs, Batch Normalization (see "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift" [arxiv]), fewer hacks (no dropout), more stable (different numbers of layers work as well) and lower complexity than VGG.
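For readers new to Batch Normalization, here is a minimal sketch of its training-time forward pass for a single feature: normalise over the mini-batch, then scale and shift. It leaves out the running statistics used at inference and the backward pass; `gamma`, `beta` and `eps` follow the common naming, and their default values are assumptions.

```python
import math


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise one feature over a mini-batch, then scale and shift."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


batch = [1.0, 2.0, 3.0, 4.0]  # one feature across a mini-batch of four
normalised = batch_norm(batch)

out_mean = sum(normalised) / len(normalised)
out_var = sum((v - out_mean) ** 2 for v in normalised) / len(normalised)
print(normalised)
print(f"mean={out_mean:.6f}, variance={out_var:.6f}")
```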

Main building block of the network (figure: resnet)
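The idea of the building block is that the layers learn a residual F(x) which is added back to the input, so the output is relu(F(x) + x). The sketch below is a simplification under my own assumptions: it uses two small fully connected layers in place of the convolutions and leaves out batch normalization, and the function names are illustrative.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def matvec(w, v):
    """Multiply matrix w (list of rows) by vector v."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def residual_block(x, w1, w2):
    """Return relu(F(x) + x) with F(x) = w2 * relu(w1 * x)."""
    fx = matvec(w2, relu(matvec(w1, x)))
    return relu([f + xi for f, xi in zip(fx, x)])


x = [1.0, 2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]

# With all-zero weights F(x) is zero, so the block passes x straight through.
out = residual_block(x, zeros, zeros)
print(out)
```

With zero weights the block reduces to the identity (for non-negative inputs), which is one intuition for why adding more of these blocks does not make the network harder to optimise.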

The features are also very good and transfer well to (Faster) R-CNNs (see below; figure: resnet-features).

Other architectures

  • Deep Learning for 3D shapes: "3D ShapeNets: A Deep Representation for Volumetric Shapes" [PDF]

  • Code and a model for faces: "Free and open source face recognition with deep neural networks." [github]

  • Fast neural networks which can perform arbitrary filters for images: "Deep Edge-Aware Filters" [PDF]

  • Lots of different models in Caffe's "Model Zoo" [github]

Feature learning and object detection

  • "CNN Features off-the-shelf: an Astounding Baseline for Recognition" [arxiv]

  • First paper about R-CNN: "Rich feature hierarchies for accurate object detection and semantic segmentation" [PDF, slides]

  • "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks" [arxiv, github, slides]

  • "An Empirical Evaluation of Deep Learning on Highway Driving" [arxiv]

  • "Object Detectors Emerge in Deep Scene CNNs" [arxiv]

  • Faster and better features: "Efficient Deep Feature Learning and Extraction via StochasticNets" [arxiv]

Other

  • Code and models for automatic captions of images: "Deep Visual-Semantic Alignments for Generating Image Descriptions" [web poster, PDF, github]

  • Google Deep Dream or neural networks on LSD: "Inceptionism: Going Deeper into Neural Networks" [link, deepdreamer.io]

Deep dreaming from noise (figure: deepdream)

  • "Automatic Colorization", which includes a pre-trained model [link]

  • "Learning visual similarity for product design with convolutional neural networks" [PDF]

  • Using images and image descriptions to improve search results: "Images Don’t Lie: Transferring Deep Visual Semantic Features to Large-Scale Multimodal Learning to Rank" [arxiv]

  • "How Google Translate squeezes deep learning onto a phone" [post]

  • "What a Deep Neural Network thinks about your #selfie" [blog]

Top selfies according to the ConvNet (figure: topselfies)

  • "Recommending music on Spotify with deep learning" [github]

  • "DeepStereo: Learning to Predict New Views from the World's Imagery" [arxiv]

  • Classifying street signs: "The power of Spatial Transformer Networks" [blog] with "Spatial Transformer Networks" [arxiv]

  • "Pedestrian Detection with RCNN" [PDF]

DQN

  • Original paper: "Playing Atari with Deep Reinforcement Learning" [arxiv]

  • My popular science article about DQN: "Artificial General Intelligence that plays Atari video games: How did DeepMind do it?" [link]

  • DQN for RoboCup: "Deep Reinforcement Learning in Parameterized Action Space" [arxiv]
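As a quick orientation for the DQN papers above: the network is trained to regress towards a Q-learning target, the reward plus the discounted best Q-value of the next state. The sketch below shows only that target computation; the function names and the default `gamma` are my own assumptions, and the network, replay memory and optimiser are left out.

```python
def q_learning_target(reward, next_q_values, gamma=0.99, terminal=False):
    """Target y = r for terminal transitions, else r + gamma * max_a Q(s', a)."""
    if terminal:
        return reward
    return reward + gamma * max(next_q_values)


def td_error(q_value, target):
    """Temporal-difference error whose square is minimised during training."""
    return target - q_value


# Hypothetical transition: reward 1.0, two actions available in the next state.
target = q_learning_target(1.0, [0.5, 2.0], gamma=0.9)
print(target)
print(td_error(q_value=2.0, target=target))
```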

RNN

  • Original paper of the best-known RNN architecture (LSTM): "Long short-term memory" [PDF]

  • Very good tutorial-like introduction to RNNs by Andrej Karpathy: "The Unreasonable Effectiveness of Recurrent Neural Networks" [link]

  • "Visualizing and Understanding Recurrent Networks" [arxiv]

  • "Composing Music With Recurrent Neural Networks" [blog]
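To make the LSTM entries above a bit more concrete, here is a minimal sketch of a single LSTM time step in plain Python. Note the assumptions: this is the now-common variant with a forget gate, which is not the exact formulation of the original paper, and the `params` layout (one weight matrix and bias per gate, acting on the concatenation of the previous hidden state and the input) is my own choice for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(w, b, v):
    """Compute w * v + b for a matrix w (list of rows) and vectors b, v."""
    return [sum(wij * vj for wij, vj in zip(row, v)) + bi
            for row, bi in zip(w, b)]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step; params maps gate name -> (weights, bias)."""
    z = h_prev + x  # list concatenation [h_prev, x]
    i = [sigmoid(a) for a in affine(*params["input"], z)]
    f = [sigmoid(a) for a in affine(*params["forget"], z)]
    o = [sigmoid(a) for a in affine(*params["output"], z)]
    g = [math.tanh(a) for a in affine(*params["candidate"], z)]
    # New cell state: keep part of the old state, add part of the candidate.
    c = [fi * cp + ii * gi for fi, cp, ii, gi in zip(f, c_prev, i, g)]
    h = [oi * math.tanh(ci) for oi, ci in zip(o, c)]
    return h, c


hidden, inputs = 2, 1
zero_gate = ([[0.0] * (hidden + inputs) for _ in range(hidden)], [0.0] * hidden)
params = {name: zero_gate for name in ("input", "forget", "output", "candidate")}

# With zero weights every gate is 0.5 and the candidate is 0,
# so the cell state is simply halved.
h, c = lstm_step([1.0], [0.0, 0.0], [1.0, -2.0], params)
print(h, c)
```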

Other promising or useful architectures

  • HTMs by Jeff Hawkins: "Continuous online sequence learning with an unsupervised neural network model" [arxiv]

  • Word2vec: "Efficient Estimation of Word Representations in Vector Space" [arxiv, Google code]

  • "Feedforward Sequential Memory Networks: A New Structure to Learn Long-term Dependency" [arxiv]
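For the Word2vec entry above, the skip-gram model is trained on (center word, context word) pairs taken from a sliding window over the text. The sketch below only generates those pairs; it uses a fixed window for simplicity, the function name is my own, and the embedding training itself is left out.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token and its neighbours."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


pairs = skipgram_pairs(["deep", "learning", "is", "fun"], window=1)
for center, context in pairs:
    print(center, "->", context)
```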

Framework benchmarks

  • "Comparative Study of Caffe, Neon, Theano and Torch for deep learning" [arxiv]

Their summary: From our experiments, we observe that Theano and Torch are the most easily extensible frameworks. We observe that Torch is best suited for any deep architecture on CPU, followed by Theano. It also achieves the best performance on the GPU for large convolutional and fully connected networks, followed closely by Neon. Theano achieves the best performance on GPU for training and deployment of LSTM networks. Finally Caffe is the easiest for evaluating the performance of standard deep architectures.

  • Very good qualitative analysis: zer0n/deepframeworks [github]

  • Just performance comparison: soumith/convnet-benchmarks [github]

  • "Deep Learning Libraries by Language" [link]

Other resources

Credits

Most of the snippets have come to my attention via the internal mailing lists of the Computational Neuroscience Lab at the University of Tartu and the London-based visual search company Dream It Get It. I also read the weekly Data Elixir newsletter and check the research papers of the two main deep learning conferences: ICML and NIPS.

 