Paper: A Translation and Commentary on "Generating Sequences With Recurrent Neural Networks"


Table of Contents

Generating Sequences With Recurrent Neural Networks

Abstract

1 Introduction

2 Prediction Network

2.1 Long Short-Term Memory

3 Text Prediction

3.1 Penn Treebank Experiments

3.2 Wikipedia Experiments

4 Handwriting Prediction

4.1 Mixture Density Outputs

4.2 Experiments

4.3 Samples

5 Handwriting Synthesis

5.1 Synthesis Network

5.2 Experiments

5.3 Unbiased Sampling

5.4 Biased Sampling

5.5 Primed Sampling

6 Conclusions and Future Work

Acknowledgements

References

Generating Sequences With Recurrent Neural Networks

Original paper: Generating Sequences With Recurrent Neural Networks

Author: Alex Graves, Department of Computer Science, University of Toronto (graves@cs.toronto.edu)

Abstract


1 Introduction


2 Prediction Network

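The prediction network factorises the probability of a sequence into next-step predictions, Pr(x) = Π_t Pr(x_{t+1} | y_t), where y_t is produced by a stack of recurrent hidden layers with skip connections: every hidden layer sees the current input, and the output layer sees every hidden layer. The sketch below illustrates only that wiring; it uses plain tanh cells and random weights instead of the trained LSTM layers of the paper, and all sizes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(shapes):
    """Randomly initialised weights for one hidden layer (illustrative only)."""
    return {name: rng.normal(scale=0.1, size=shape) for name, shape in shapes.items()}

input_size, hidden_size, n_layers = 3, 8, 2

layers = [
    layer({"W_x": (hidden_size, input_size),      # skip connection from the input
           "W_h": (hidden_size, hidden_size),     # recurrent connection
           "W_below": (hidden_size, hidden_size), # from the layer below (unused in layer 0)
           "b": (hidden_size,)})
    for _ in range(n_layers)
]
W_out = rng.normal(scale=0.1, size=(input_size, n_layers * hidden_size))  # outputs see every hidden layer

def step(x_t, h_prev):
    """One time step of a deep RNN with input skip connections to every layer."""
    h_new, below = [], None
    for n, (p, h) in enumerate(zip(layers, h_prev)):
        pre = p["W_x"] @ x_t + p["W_h"] @ h + p["b"]
        if n > 0:
            pre = pre + p["W_below"] @ below
        below = np.tanh(pre)
        h_new.append(below)
    y_t = W_out @ np.concatenate(h_new)  # parameterises Pr(x_{t+1} | y_t)
    return y_t, h_new

h = [np.zeros(hidden_size) for _ in range(n_layers)]
for x_t in rng.normal(size=(5, input_size)):     # a toy input sequence
    y_t, h = step(x_t, h)
print(y_t.shape)                                 # (3,)
```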

2.1 Long Short-Term Memory

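Section 2.1 uses the LSTM variant with forget gates and peephole connections, where the gates also receive the cell state and the peephole weight matrices are diagonal. Below is a single-step NumPy sketch of that cell, with randomly initialised weights purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
D, H = 3, 8                        # input and hidden sizes (illustrative)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W_x = {g: rng.normal(scale=0.1, size=(H, D)) for g in "ifco"}   # input weights per gate
W_h = {g: rng.normal(scale=0.1, size=(H, H)) for g in "ifco"}   # recurrent weights per gate
# Peephole weights: diagonal matrices, so they act element-wise on the cell state.
w_c = {g: rng.normal(scale=0.1, size=H) for g in "ifo"}
b   = {g: np.zeros(H) for g in "ifco"}

def lstm_step(x, h, c):
    """Peephole LSTM step: input/forget gates see c_{t-1}, output gate sees the updated c_t."""
    i = sigmoid(W_x["i"] @ x + W_h["i"] @ h + w_c["i"] * c + b["i"])
    f = sigmoid(W_x["f"] @ x + W_h["f"] @ h + w_c["f"] * c + b["f"])
    c_new = f * c + i * np.tanh(W_x["c"] @ x + W_h["c"] @ h + b["c"])
    o = sigmoid(W_x["o"] @ x + W_h["o"] @ h + w_c["o"] * c_new + b["o"])
    h_new = o * np.tanh(c_new)
    return h_new, c_new

h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(4, D)):  # a toy input sequence of length 4
    h, c = lstm_step(x, h, c)
print(h.shape)                     # (8,)
```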

3 Text Prediction


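For text the data are discrete, so the output layer is a softmax over the K symbols of the vocabulary (characters or words), and generation draws one symbol at a time, feeding each sample back in as the next input. A minimal character-level sampling loop; the single-layer tanh recurrence and the toy vocabulary stand in for the paper's trained LSTM stack.

```python
import numpy as np

rng = np.random.default_rng(2)
vocab = list("abcdefghijklmnopqrstuvwxyz ")        # toy character vocabulary
K, H = len(vocab), 16

W_xh = rng.normal(scale=0.1, size=(H, K))          # one-hot input -> hidden
W_hh = rng.normal(scale=0.1, size=(H, H))          # hidden -> hidden (recurrence)
W_hy = rng.normal(scale=0.1, size=(K, H))          # hidden -> character logits

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sample_text(n_chars, start="h"):
    """Generate characters one at a time, feeding each sample back as the next input."""
    h = np.zeros(H)
    idx = vocab.index(start)
    out = [start]
    for _ in range(n_chars):
        x = np.zeros(K); x[idx] = 1.0              # one-hot encoding of the previous character
        h = np.tanh(W_xh @ x + W_hh @ h)           # stand-in for the paper's LSTM stack
        p = softmax(W_hy @ h)                      # Pr(x_{t+1} | y_t): softmax over the vocabulary
        idx = rng.choice(K, p=p)                   # sample the next character
        out.append(vocab[idx])
    return "".join(out)

print(sample_text(20))  # gibberish here: the weights are random, not trained
```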

3.1 Penn Treebank Experiments

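The Penn Treebank results are quoted both as bits-per-character (BPC) and as word-level perplexity. The two are linked because the total number of bits the model assigns to the text is the same whichever unit it is divided by; the snippet below shows that conversion with made-up numbers (the BPC value and average word length are illustrative, not the paper's figures).

```python
# total bits = bpc * characters = bits_per_word * words, so
# bits_per_word = bpc * (average characters per word, counting the separator)
# perplexity    = 2 ** bits_per_word
bpc = 1.26                   # example bits-per-character (illustrative)
avg_chars_per_word = 5.6     # assumed average word length plus separator (illustrative)

bits_per_word = bpc * avg_chars_per_word
perplexity = 2 ** bits_per_word
print(round(perplexity, 1))
```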

3.2 Wikipedia Experiments


4 Handwriting Prediction

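For online handwriting (taken from IAM-OnDB), each input is a real-valued pen offset plus a binary end-of-stroke flag, x_t = (Δx, Δy, e), with e = 1 when the pen leaves the paper after that point. The helper below shows one way such a representation can be built from absolute pen traces; the helper and the toy trace are mine, not the paper's preprocessing code.

```python
import numpy as np

def strokes_to_offsets(strokes):
    """Convert a list of strokes (arrays of absolute (x, y) pen positions)
    into the (dx, dy, end_of_stroke) points used for handwriting prediction."""
    points, prev = [], np.zeros(2)
    for stroke in strokes:
        for i, xy in enumerate(stroke):
            dx, dy = xy - prev
            eos = 1.0 if i == len(stroke) - 1 else 0.0  # pen lifted after the last point
            points.append((dx, dy, eos))
            prev = xy
    return np.array(points)

# A made-up two-stroke trace, just to show the shapes involved.
toy = [np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 0.4]]),
       np.array([[2.5, 1.0], [3.0, 1.2]])]
print(strokes_to_offsets(toy))
```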

4.1 Mixture Density Outputs

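The mixture density output layer maps the network output y_t to the parameters of a Bernoulli end-of-stroke probability and a mixture of M bivariate Gaussians over the next pen offset: the mixture weights go through a softmax, the standard deviations through an exponential, and the correlations through a tanh, so every parameter stays in its valid range. The sketch below applies those transformations and draws one sample; the output layout and names are my own choices made to mirror that description.

```python
import numpy as np

rng = np.random.default_rng(3)
M = 20                                      # number of mixture components (illustrative)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def mixture_params(y):
    """Map a raw output vector of length 1 + 6*M to mixture-density parameters."""
    e_hat, rest = y[0], y[1:].reshape(6, M)
    pi_hat, mu1, mu2, s1_hat, s2_hat, rho_hat = rest
    return (sigmoid(e_hat),                 # end-of-stroke probability in (0, 1)
            softmax(pi_hat),                # mixture weights sum to one
            mu1, mu2,
            np.exp(s1_hat), np.exp(s2_hat), # standard deviations are positive
            np.tanh(rho_hat))               # correlations lie in (-1, 1)

def sample_offset(y):
    """Draw one (dx, dy, eos) point from the predicted distribution."""
    e, pi, mu1, mu2, s1, s2, rho = mixture_params(y)
    j = rng.choice(M, p=pi)                 # pick a mixture component
    mean = [mu1[j], mu2[j]]
    cov = [[s1[j]**2,           rho[j]*s1[j]*s2[j]],
           [rho[j]*s1[j]*s2[j], s2[j]**2]]
    dx, dy = rng.multivariate_normal(mean, cov)
    eos = float(rng.random() < e)
    return dx, dy, eos

y = rng.normal(size=1 + 6*M)                # stand-in for one network output y_t
print(sample_offset(y))
```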

4.2 Experiments

4.3 Samples


5 Handwriting Synthesis


5.1 Synthesis Network

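To synthesise handwriting for a given character string c, the network adds a soft window: at every step it emits, for each of K window components, an importance α_k, a width β_k and a positive increment to a position κ_k, the positions move monotonically along the text, and the window vector is w_t = Σ_u φ(t, u) c_u with φ(t, u) = Σ_k α_k exp(−β_k (κ_k − u)²). The sketch below computes φ and w for a toy string; in the real model α, β and the κ increments come from the network (through exp, to keep them positive), whereas here they are random.

```python
import numpy as np

rng = np.random.default_rng(4)
alphabet = list("abcdefghijklmnopqrstuvwxyz ")
text = "hello world"
U, V = len(text), len(alphabet)

# One-hot encoding of the character string the handwriting should spell out.
c = np.zeros((U, V))
for u, ch in enumerate(text):
    c[u, alphabet.index(ch)] = 1.0

def window(alpha, beta, kappa):
    """Soft window over the text: phi(t, u) = sum_k alpha_k * exp(-beta_k * (kappa_k - u)^2)."""
    u = np.arange(U)                                 # character positions 0 .. U-1
    phi = (alpha[:, None] * np.exp(-beta[:, None] * (kappa[:, None] - u) ** 2)).sum(axis=0)
    w = phi @ c                                      # w_t = sum_u phi(t, u) * c_u
    return phi, w

K = 10                                               # number of window components (illustrative)
kappa = np.zeros(K)
for t in range(5):                                   # pretend the network ran for 5 steps
    alpha, beta = np.exp(rng.normal(size=K)), np.exp(rng.normal(size=K))
    kappa = kappa + np.exp(rng.normal(size=K))       # positions only ever move forward
    phi, w = window(alpha, beta, kappa)
    print(t, phi.argmax(), w.shape)                  # which character the window attends to
```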

5.2 Experiments


5.3 Unbiased Sampling


5.4 Biased Sampling

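Biased sampling trades variety for legibility through a probability bias b ≥ 0: as b grows, the Gaussian standard deviations are shrunk and the mixture weights are sharpened before sampling, and b = 0 recovers unbiased sampling. The sketch below shows one way to apply such a bias to the mixture parameters of Section 4.1; the exact placement of b in the exponents follows my reading of the scheme and should be checked against the paper.

```python
import numpy as np

def bias_mixture(pi_hat, s1_hat, s2_hat, b):
    """Apply a probability bias b >= 0 before sampling: larger b shrinks the
    standard deviations and peaks the mixture weights, trading variety for
    legibility. b = 0 leaves the distribution unbiased."""
    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()
    pi = softmax(pi_hat * (1.0 + b))   # sharpen the component weights
    s1 = np.exp(s1_hat - b)            # shrink the standard deviations
    s2 = np.exp(s2_hat - b)
    return pi, s1, s2

rng = np.random.default_rng(5)
pi_hat, s1_hat, s2_hat = rng.normal(size=(3, 20))
for b in (0.0, 1.0, 3.0):
    pi, s1, s2 = bias_mixture(pi_hat, s1_hat, s2_hat, b)
    print(b, round(pi.max(), 3), round(s1.mean(), 3))
```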

5.5 Primed Sampling

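Primed sampling makes the generated handwriting imitate a particular writer: a real sequence from that writer, together with its character string, is first fed through the network so that the hidden state and window are conditioned on the style, and only then is the new text generated from that state. A pseudocode-style sketch; `synthesis_step`, `sample_point` and `init_state` are placeholders for the trained synthesis network, the mixture sampling shown earlier and its initial state, and concatenating the priming text with the new text is one plausible way to handle the window, not necessarily the paper's exact procedure.

```python
def primed_sample(priming_strokes, priming_text, new_text,
                  synthesis_step, sample_point, init_state, max_len=700):
    """Run the network over a real (strokes, text) pair to prime its state,
    then keep generating new points from that same state."""
    state = init_state
    text = priming_text + new_text

    # 1. Priming pass: feed the real points; the outputs are ignored.
    for x_t in priming_strokes:
        _, state = synthesis_step(x_t, text, state)

    # 2. Generation pass: sample points one at a time and feed them back in.
    generated, x_t = [], priming_strokes[-1]
    for _ in range(max_len):
        y_t, state = synthesis_step(x_t, text, state)
        x_t = sample_point(y_t)
        generated.append(x_t)
    return generated
```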

6 Conclusions and Future Work


Acknowledgements




References

[1] Y. Bengio, P. Simard, and P. Frasconi. Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2):157–166, March 1994.

[2] C. Bishop. Mixture density networks. Technical report, 1994.

[3] C. Bishop. Neural Networks for Pattern Recognition. Oxford University Press, Inc., 1995.

[4] N. Boulanger-Lewandowski, Y. Bengio, and P. Vincent. Modeling temporal dependencies in high-dimensional sequences: Application to polyphonic music generation and transcription. In Proceedings of the Twenty-ninth International Conference on Machine Learning (ICML’12), 2012.

[5] J. G. Cleary and I. H. Witten. Data compression using adaptive coding and partial string matching. IEEE Transactions on Communications, 32:396–402, 1984.

[6] D. Eck and J. Schmidhuber. A first look at music composition using LSTM recurrent neural networks. Technical report, IDSIA / USI-SUPSI, Istituto Dalle Molle.

[7] F. Gers, N. Schraudolph, and J. Schmidhuber. Learning precise timing with LSTM recurrent networks. Journal of Machine Learning Research, 3:115–143, 2002.

[8] A. Graves. Practical variational inference for neural networks. In Advances in Neural Information Processing Systems, volume 24, pages 2348–2356. 2011.

[9] A. Graves. Sequence transduction with recurrent neural networks. In ICML Representation Learning Workshop, 2012.

[10] A. Graves, A. Mohamed, and G. Hinton. Speech recognition with deep recurrent neural networks. In Proc. ICASSP, 2013.

[11] A. Graves and J. Schmidhuber. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks, 18:602–610, 2005.

[12] A. Graves and J. Schmidhuber. Offline handwriting recognition with multidimensional recurrent neural networks. In Advances in Neural Information Processing Systems, volume 21, 2008.

[13] P. D. Grünwald. The Minimum Description Length Principle (Adaptive Computation and Machine Learning). The MIT Press, 2007.

[14] G. Hinton. A Practical Guide to Training Restricted Boltzmann Machines. Technical report, 2010.

[15] S. Hochreiter, Y. Bengio, P. Frasconi, and J. Schmidhuber. Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-term Dependencies. In S. C. Kremer and J. F. Kolen, editors, A Field Guide to Dynamical Recurrent Neural Networks. 2001.

[16] S. Hochreiter and J. Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8):1735–1780, 1997.

[17] M. Hutter. The Human Knowledge Compression Contest, 2012.

[18] K.-C. Jim, C. Giles, and B. Horne. An analysis of noise in recurrent neural networks: convergence and generalization. IEEE Transactions on Neural Networks, 7(6):1424–1438, 1996.

[19] S. Johansson, R. Atwell, R. Garside, and G. Leech. The tagged LOB corpus user’s manual. Norwegian Computing Centre for the Humanities, 1986.

[20] B. Knoll and N. de Freitas. A machine learning perspective on predictive coding with paq. CoRR, abs/1108.3298, 2011.

[21] M. Liwicki and H. Bunke. IAM-OnDB - an on-line English sentence database acquired from handwritten text on a whiteboard. In Proc. 8th Int. Conf. on Document Analysis and Recognition, volume 2, pages 956– 961, 2005.

[22] M. P. Marcus, B. Santorini, and M. A. Marcinkiewicz. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2):313–330, 1993.

[23] T. Mikolov. Statistical Language Models based on Neural Networks. PhD thesis, Brno University of Technology, 2012.

[24] T. Mikolov, I. Sutskever, A. Deoras, H. Le, S. Kombrink, and J. Cernocky. Subword language modeling with neural networks. Technical report, Unpublished Manuscript, 2012.

[25] A. Mnih and G. Hinton. A Scalable Hierarchical Distributed Language Model. In Advances in Neural Information Processing Systems, volume 21, 2008.

[26] A. Mnih and Y. W. Teh. A fast and simple algorithm for training neural probabilistic language models. In Proceedings of the 29th International Conference on Machine Learning, pages 1751–1758, 2012.

[27] T. N. Sainath, A. Mohamed, B. Kingsbury, and B. Ramabhadran. Low-rank matrix factorization for deep neural network training with high-dimensional output targets. In Proc. ICASSP, 2013.

[28] M. Schuster. Better generative models for sequential data problems: Bidirectional recurrent mixture density networks. pages 589–595. The MIT Press, 1999.

[29] I. Sutskever, G. E. Hinton, and G. W. Taylor. The recurrent temporal restricted Boltzmann machine. pages 1601–1608, 2008.

[30] I. Sutskever, J. Martens, and G. Hinton. Generating text with recurrent neural networks. In ICML, 2011.

[31] G. W. Taylor and G. E. Hinton. Factored conditional restricted Boltzmann machines for modeling motion style. In Proc. 26th Annual International Conference on Machine Learning, pages 1025–1032, 2009.

[32] T. Tieleman and G. Hinton. Lecture 6.5 - rmsprop: Divide the gradient by a running average of its recent magnitude, 2012.

[33] R. Williams and D. Zipser. Gradient-based learning algorithms for recurrent networks and their computational complexity. In Back-propagation: Theory, Architectures and Applications, pages 433–486. 1995.





