Paper: Translation and Notes on "Generating Sequences With Recurrent Neural Networks"


Contents

Generating Sequences With Recurrent Neural Networks
Abstract
1 Introduction
2 Prediction Network
2.1 Long Short-Term Memory
3 Text Prediction
3.1 Penn Treebank Experiments
3.2 Wikipedia Experiments
4 Handwriting Prediction
4.1 Mixture Density Outputs
4.2 Experiments
4.3 Samples
5 Handwriting Synthesis
5.1 Synthesis Network
5.2 Experiments
5.3 Unbiased Sampling
5.4 Biased Sampling
5.5 Primed Sampling
6 Conclusions and Future Work
Acknowledgements
References

Generating Sequences With Recurrent Neural Networks

Original paper: Generating Sequences With Recurrent Neural Networks

Author: Alex Graves, Department of Computer Science, University of Toronto (graves@cs.toronto.edu)

Abstract


1 Introduction


2 Prediction Network

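The prediction network reads one input at a time through a stack of recurrent hidden layers, with skip connections from the input to every layer and from every layer to the output, and emits at every step a distribution over the next input; sampling from that distribution and feeding the sample back in is what generates new sequences. A condensed reconstruction of the core equations, with notation lightly simplified from the paper:

```latex
% Stacked hidden layers with skip connections from the input to every
% layer and from every layer to the output:
h^1_t = \mathcal{H}\left(W_{i h^1} x_t + W_{h^1 h^1} h^1_{t-1} + b^1_h\right)
h^n_t = \mathcal{H}\left(W_{i h^n} x_t + W_{h^{n-1} h^n} h^{n-1}_t + W_{h^n h^n} h^n_{t-1} + b^n_h\right)
\hat{y}_t = b_y + \sum_{n=1}^{N} W_{h^n y} h^n_t, \qquad y_t = \mathcal{Y}(\hat{y}_t)

% Training maximises the probability of the next input at every step,
% so the sequence probability and loss factorise over time:
\Pr(\mathbf{x}) = \prod_{t=1}^{T} \Pr(x_{t+1} \mid y_t), \qquad
\mathcal{L}(\mathbf{x}) = -\sum_{t=1}^{T} \log \Pr(x_{t+1} \mid y_t)
```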

2.1 Long Short-Term Memory

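The hidden layers are LSTM cells, whose gated memory makes it easier to store information over long ranges than standard recurrent units. Below is a minimal NumPy sketch of one step of a single LSTM layer with peephole connections, the variant used in the paper; the weight dictionary layout and names are my own, not from any library:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W):
    """One time step of an LSTM layer with peephole connections."""
    i = sigmoid(W["xi"] @ x + W["hi"] @ h_prev + W["ci"] * c_prev + W["bi"])  # input gate
    f = sigmoid(W["xf"] @ x + W["hf"] @ h_prev + W["cf"] * c_prev + W["bf"])  # forget gate
    c = f * c_prev + i * np.tanh(W["xc"] @ x + W["hc"] @ h_prev + W["bc"])    # cell state
    o = sigmoid(W["xo"] @ x + W["ho"] @ h_prev + W["co"] * c + W["bo"])       # output gate
    h = o * np.tanh(c)                                                        # hidden output
    return h, c
```

The cell-to-gate (peephole) weights are applied element-wise, matching the diagonal peephole matrices described in the paper.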

3 Text Prediction


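For text, the inputs are one-hot character (or word) vectors, the output layer is a softmax over the vocabulary, and generation is repeated sampling from that softmax. A minimal sketch of the sampling loop; `step`, standing in for one forward pass of a trained prediction network, is a hypothetical placeholder:

```python
import numpy as np

def sample_text(step, h0, start_id, vocab, length=200):
    """Generate text by repeatedly sampling the next character from the
    network's softmax output and feeding it back in as the next input."""
    h, x, out = h0, start_id, []
    for _ in range(length):
        logits, h = step(x, h)                 # one forward step of the trained network
        p = np.exp(logits - logits.max())
        p /= p.sum()                           # softmax over the character vocabulary
        x = np.random.choice(len(vocab), p=p)  # sample Pr(x_{t+1} | y_t)
        out.append(vocab[x])
    return "".join(out)
```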

3.1 Penn Treebank Experiments


3.2 Wikipedia Experiments


4 Handwriting Prediction


4.1 Mixture Density Outputs

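Pen positions are real-valued, so instead of a softmax the output layer parameterises a mixture density: a Bernoulli end-of-stroke probability plus a mixture of bivariate Gaussians over the (Δx, Δy) offset. The sketch below turns raw output activations into distribution parameters and draws one sample, assuming the usual mixture-density transformations (sigmoid, softmax, exp and tanh); the flat `raw` layout is my own choice:

```python
import numpy as np

def sample_pen_offset(raw, num_mixtures, rng=np.random.default_rng()):
    """Convert raw output-layer activations into mixture-density parameters
    and draw one (dx, dy, end_of_stroke) sample."""
    m = num_mixtures
    e_hat, pi_hat = raw[0], raw[1:1 + m]
    mu1, mu2 = raw[1 + m:1 + 2 * m], raw[1 + 2 * m:1 + 3 * m]
    s1_hat, s2_hat = raw[1 + 3 * m:1 + 4 * m], raw[1 + 4 * m:1 + 5 * m]
    rho_hat = raw[1 + 5 * m:1 + 6 * m]

    e = 1.0 / (1.0 + np.exp(-e_hat))                    # end-of-stroke probability
    pi = np.exp(pi_hat - pi_hat.max()); pi /= pi.sum()  # mixture weights
    s1, s2 = np.exp(s1_hat), np.exp(s2_hat)             # standard deviations > 0
    rho = np.tanh(rho_hat)                              # correlations in (-1, 1)

    j = rng.choice(m, p=pi)                             # pick a mixture component
    cov = np.array([[s1[j] ** 2, rho[j] * s1[j] * s2[j]],
                    [rho[j] * s1[j] * s2[j], s2[j] ** 2]])
    dx, dy = rng.multivariate_normal([mu1[j], mu2[j]], cov)
    eos = rng.random() < e                              # Bernoulli end-of-stroke
    return dx, dy, eos
```

The covariance matrix is rebuilt from the two standard deviations and the correlation, so sampling a chosen component reduces to an ordinary bivariate normal draw.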

4.2 Experiments

4.3 Samples


5 Handwriting Synthesis


5.1 Synthesis Network

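The synthesis network conditions handwriting prediction on a character string through a soft attention window: at every point, the first hidden layer emits the parameters of a mixture of K Gaussians over character positions, and the window vector is the resulting weighted sum of one-hot character vectors. A sketch of that window computation; the variable names are mine:

```python
import numpy as np

def attention_window(kappa_prev, params, char_onehots):
    """Soft window over the text being written.

    params holds raw (alpha_hat, beta_hat, kappa_hat) for K Gaussians;
    char_onehots is a (U, V) array of one-hot character vectors."""
    alpha_hat, beta_hat, kappa_hat = params           # each of shape (K,)
    alpha = np.exp(alpha_hat)                         # importance of each Gaussian
    beta = np.exp(beta_hat)                           # (inverse) width of each Gaussian
    kappa = kappa_prev + np.exp(kappa_hat)            # window position only moves forward

    u = np.arange(char_onehots.shape[0])              # character indices 0..U-1
    # phi[u] = sum_k alpha_k * exp(-beta_k * (kappa_k - u)^2)
    phi = (alpha[:, None] * np.exp(-beta[:, None] * (kappa[:, None] - u[None, :]) ** 2)).sum(axis=0)
    w = phi @ char_onehots                            # window vector fed to the upper layers
    return w, kappa, phi
```

Because the position parameters κ only ever increase, the window slides monotonically through the text as the pen trace is generated.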

5.2 Experiments


5.3 Unbiased Sampling


5.4 Biased Sampling

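Biased sampling trades diversity for legibility by concentrating the predictive distribution before each sample is drawn. A sketch assuming the paper's formulation, in which a probability bias b ≥ 0 shrinks the Gaussian standard deviations (σ = exp(σ̂ − b)) and sharpens the mixture weights (logits scaled by 1 + b); b = 0 recovers unbiased sampling:

```python
import numpy as np

def biased_mixture_params(pi_hat, s1_hat, s2_hat, bias=0.0):
    """Apply a probability bias b >= 0 before sampling: smaller standard
    deviations and a more peaked mixture distribution as b grows."""
    s1 = np.exp(s1_hat - bias)      # shrink component spreads
    s2 = np.exp(s2_hat - bias)
    logits = pi_hat * (1.0 + bias)  # sharpen mixture weights
    pi = np.exp(logits - logits.max())
    pi /= pi.sum()
    return pi, s1, s2
```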

5.5 Primed Sampling

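Primed sampling makes the network imitate a particular writer: a real sequence from that writer is first fed through the network so that its hidden state is warmed up, and generation then continues from that state. A minimal sketch of the idea; `step` and `sample_output` are hypothetical placeholders for one forward pass and for drawing from the output distribution:

```python
def primed_sample(step, sample_output, h0, prime_seq, num_steps):
    """Run the network over a real priming sequence first, then keep
    sampling from where the priming data left off."""
    h = h0
    for x in prime_seq:          # priming: real data drives the state, outputs are ignored
        _, h = step(x, h)
    x = prime_seq[-1]
    generated = []
    for _ in range(num_steps):   # generation: samples are fed back in as inputs
        y, h = step(x, h)
        x = sample_output(y)
        generated.append(x)
    return generated
```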

6 Conclusions and Future Work


Acknowledgements




References

[1] Y. Bengio, P. Simard, and P. Frasconi. Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2):157–166, March 1994.

[2] C. Bishop. Mixture density networks. Technical report, 1994.

[3] C. Bishop. Neural Networks for Pattern Recognition. Oxford University Press, Inc., 1995.

[4] N. Boulanger-Lewandowski, Y. Bengio, and P. Vincent. Modeling temporal dependencies in high-dimensional sequences: Application to polyphonic music generation and transcription. In Proceedings of the Twenty-Ninth International Conference on Machine Learning (ICML'12), 2012.

[5] J. G. Cleary and I. H. Witten. Data compression using adaptive coding and partial string matching. IEEE Transactions on Communications, 32:396–402, 1984.

[6] D. Eck and J. Schmidhuber. A first look at music composition using LSTM recurrent neural networks. Technical report, IDSIA / USI-SUPSI, Istituto Dalle Molle.

[7] F. Gers, N. Schraudolph, and J. Schmidhuber. Learning precise timing with LSTM recurrent networks. Journal of Machine Learning Research, 3:115–143, 2002.

[8] A. Graves. Practical variational inference for neural networks. In Advances in Neural Information Processing Systems, volume 24, pages 2348–2356. 2011.

[9] A. Graves. Sequence transduction with recurrent neural networks. In ICML Representation Learning Workshop, 2012.

[10] A. Graves, A. Mohamed, and G. Hinton. Speech recognition with deep recurrent neural networks. In Proc. ICASSP, 2013.

[11] A. Graves and J. Schmidhuber. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks, 18:602–610, 2005.

[12] A. Graves and J. Schmidhuber. Offline handwriting recognition with multidimensional recurrent neural networks. In Advances in Neural Information Processing Systems, volume 21, 2008.

[13] P. D. Grünwald. The Minimum Description Length Principle (Adaptive Computation and Machine Learning). The MIT Press, 2007.

[14] G. Hinton. A Practical Guide to Training Restricted Boltzmann Machines. Technical report, 2010.

[15] S. Hochreiter, Y. Bengio, P. Frasconi, and J. Schmidhuber. Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-term Dependencies. In S. C. Kremer and J. F. Kolen, editors, A Field Guide to Dynamical Recurrent Neural Networks. 2001.

[16] S. Hochreiter and J. Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8):1735–1780, 1997.

[17] M. Hutter. The Human Knowledge Compression Contest, 2012.

[18] K.-C. Jim, C. Giles, and B. Horne. An analysis of noise in recurrent neural networks: convergence and generalization. IEEE Transactions on Neural Networks, 7(6):1424–1438, 1996.

[19] S. Johansson, R. Atwell, R. Garside, and G. Leech. The tagged LOB corpus user's manual. Norwegian Computing Centre for the Humanities, 1986.

[20] B. Knoll and N. de Freitas. A machine learning perspective on predictive coding with PAQ. CoRR, abs/1108.3298, 2011.

[21] M. Liwicki and H. Bunke. IAM-OnDB - an on-line English sentence database acquired from handwritten text on a whiteboard. In Proc. 8th Int. Conf. on Document Analysis and Recognition, volume 2, pages 956–961, 2005.

[22] M. P. Marcus, B. Santorini, and M. A. Marcinkiewicz. Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, 19(2):313–330, 1993.

[23] T. Mikolov. Statistical Language Models based on Neural Networks. PhD thesis, Brno University of Technology, 2012.

[24] T. Mikolov, I. Sutskever, A. Deoras, H. Le, S. Kombrink, and J. Cernocky. Subword language modeling with neural networks. Technical report, Unpublished Manuscript, 2012.

[25] A. Mnih and G. Hinton. A Scalable Hierarchical Distributed Language Model. In Advances in Neural Information Processing Systems, volume 21, 2008.

[26] A. Mnih and Y. W. Teh. A fast and simple algorithm for training neural probabilistic language models. In Proceedings of the 29th International Conference on Machine Learning, pages 1751–1758, 2012.

[27] T. N. Sainath, A. Mohamed, B. Kingsbury, and B. Ramabhadran. Low-rank matrix factorization for deep neural network training with high-dimensional output targets. In Proc. ICASSP, 2013.

[28] M. Schuster. Better generative models for sequential data problems: Bidirectional recurrent mixture density networks. pages 589–595. The MIT Press, 1999.

[29] I. Sutskever, G. E. Hinton, and G. W. Taylor. The recurrent temporal restricted Boltzmann machine. pages 1601–1608, 2008.

[30] I. Sutskever, J. Martens, and G. Hinton. Generating text with recurrent neural networks. In ICML, 2011.

[31] G. W. Taylor and G. E. Hinton. Factored conditional restricted Boltzmann machines for modeling motion style. In Proc. 26th Annual International Conference on Machine Learning, pages 1025–1032, 2009.

[32] T. Tieleman and G. Hinton. Lecture 6.5 - rmsprop: Divide the gradient by a running average of its recent magnitude, 2012.

[33] R. Williams and D. Zipser. Gradient-based learning algorithms for recurrent networks and their computational complexity. In Back-propagation: Theory, Architectures and Applications, pages 433–486. 1995.





