Paper:《Generating Sequences With Recurrent Neural Networks》的翻译和解读

简介: Paper:《Generating Sequences With Recurrent Neural Networks》的翻译和解读

目录

Generating Sequences With Recurrent Neural Networks

Abstract

1、Introduction

2 Prediction Network 预测网络

2.1 Long Short-Term Memory

3 Text Prediction  文本预测

3.1 Penn Treebank Experiments Penn Treebank实验

3.2 Wikipedia Experiments  维基百科的实验

4 Handwriting Prediction  笔迹的预测

4.1 Mixture Density Outputs  混合密度输出

4.2 Experiments

4.3 Samples  样品

5 Handwriting Synthesis  字合成

5.1 Synthesis Network  合成网络

5.2 Experiments  实验

5.3 Unbiased Sampling  公正的抽样

5.4 Biased Sampling  有偏见的抽样

5.5 Primed Sampling  启动采样

6 Conclusions and Future Work  结论与未来工作

Acknowledgements  致谢

References

Generating Sequences With Recurrent Neural Networks

利用递归神经网络生成序列


论文原文:Generating Sequences With Recurrent Neural Networks

作者:

Alex Graves   Department of Computer Science  

University of Toronto      graves@cs.toronto.edu

Abstract

image.png

1、Introduction 介绍

image.png

image.png

image.png

image.png

image.png

2 Prediction Network 预测网络

image.png

image.png

image.png

2.1 Long Short-Term Memory

image.png

image.png

3 Text Prediction  文本预测


image.png

image.png

3.1 Penn Treebank Experiments Penn Treebank实验

image.png

image.png

image.png

image.png

3.2 Wikipedia Experiments  维基百科的实验

image.png

image.png

image.png

image.png

image.png

4 Handwriting Prediction  笔迹的预测

image.png

image.png

image.png

4.1 Mixture Density Outputs  混合密度输出

image.png

image.png

image.png

image.png

image.png

image.png

image.png

image.png

4.3 Samples  样品

image.png

5 Handwriting Synthesis  字合成

image.png

image.png

image.png

image.png

5.1 Synthesis Network  合成网络

image.png

image.png

image.png

image.png

5.2 Experiments  实验

image.pngimage.png

5.3 Unbiased Sampling  公正的抽样

image.png

5.4 Biased Sampling  有偏见的抽样

image.png

image.png

image.png

5.5 Primed Sampling  启动采样

image.png

6 Conclusions and Future Work  结论与未来工作

image.png

Acknowledgements  致谢


image.png


Reference[1] Y. Bengio, P. Simard, and P. Frasconi. Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2):157–166, March 1994.

[2] C. Bishop. Mixture density networks. Technical report, 1994.

[3] C. Bishop. Neural Networks for Pattern Recognition. Oxford University Press, Inc., 1995.

[4] N. Boulanger-Lewandowski, Y. Bengio, and P. Vincent. Modeling temporal dependencies in high-dimensional sequences: Application to polyphonic music generation and transcription. In Proceedings of the Twenty-nine International Conference on Machine Learning (ICML’12), 2012.

[5] J. G. Cleary, Ian, and I. H. Witten. Data compression using adaptive coding and partial string matching. IEEE Transactions on Communications, 32:396–402, 1984.

[6] D. Eck and J. Schmidhuber. A first look at music composition using lstm recurrent neural networks. Technical report, IDSIA USI-SUPSI Instituto Dalle Molle.

[7] F. Gers, N. Schraudolph, and J. Schmidhuber. Learning precise timing with LSTM recurrent networks. Journal of Machine Learning Research, 3:115–143, 2002.

[8] A. Graves. Practical variational inference for neural networks. In Advances in Neural Information Processing Systems, volume 24, pages 2348–2356. 2011.

[9] A. Graves. Sequence transduction with recurrent neural networks. In ICML Representation Learning Worksop, 2012.

[10] A. Graves, A. Mohamed, and G. Hinton. Speech recognition with deep recurrent neural networks. In Proc. ICASSP, 2013.

[11] A. Graves and J. Schmidhuber. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks, 18:602–610, 2005.

[12] A. Graves and J. Schmidhuber. Offline handwriting recognition with multidimensional recurrent neural networks. In Advances in Neural Information Processing Systems, volume 21, 2008.

[13] P. D. Gr¨unwald. The Minimum Description Length Principle (Adaptive Computation and Machine Learning). The MIT Press, 2007.

[14] G. Hinton. A Practical Guide to Training Restricted Boltzmann Machines. Technical report, 2010.

[15] S. Hochreiter, Y. Bengio, P. Frasconi, and J. Schmidhuber. Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-term Dependencies. In S. C. Kremer and J. F. Kolen, editors, A Field Guide to Dynamical Recurrent Neural Networks. 2001.

[16] S. Hochreiter and J. Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8):1735–1780, 1997.

[17] M. Hutter. The Human Knowledge Compression Contest, 2012. [18] K.-C. Jim, C. Giles, and B. Horne. An analysis of noise in recurrent neural networks: convergence and generalization. Neural Networks, IEEE Transactions on, 7(6):1424 –1438, 1996. [19] S. Johansson, R. Atwell, R. Garside, and G. Leech. The tagged LOB corpus user’s manual; Norwegian Computing Centre for the Humanities, 1986.

[20] B. Knoll and N. de Freitas. A machine learning perspective on predictive coding with paq. CoRR, abs/1108.3298, 2011.

[21] M. Liwicki and H. Bunke. IAM-OnDB - an on-line English sentence database acquired from handwritten text on a whiteboard. In Proc. 8th Int. Conf. on Document Analysis and Recognition, volume 2, pages 956– 961, 2005.

[22] M. P. Marcus, B. Santorini, and M. A. Marcinkiewicz. Building a large annotated corpus of english: The penn treebank. COMPUTATIONAL LINGUISTICS, 19(2):313–330, 1993.

[23] T. Mikolov. Statistical Language Models based on Neural Networks. PhD thesis, Brno University of Technology, 2012.

[24] T. Mikolov, I. Sutskever, A. Deoras, H. Le, S. Kombrink, and J. Cernocky. Subword language modeling with neural networks. Technical report, Unpublished Manuscript, 2012.

[25] A. Mnih and G. Hinton. A Scalable Hierarchical Distributed Language Model. In Advances in Neural Information Processing Systems, volume 21, 2008.

[26] A. Mnih and Y. W. Teh. A fast and simple algorithm for training neural probabilistic language models. In Proceedings of the 29th International Conference on Machine Learning, pages 1751–1758, 2012.

[27] T. N. Sainath, A. Mohamed, B. Kingsbury, and B. Ramabhadran. Lowrank matrix factorization for deep neural network training with highdimensional output targets. In Proc. ICASSP, 2013.

[28] M. Schuster. Better generative models for sequential data problems: Bidirectional recurrent mixture density networks. pages 589–595. The MIT Press, 1999.

[29] I. Sutskever, G. E. Hinton, and G. W. Taylor. The recurrent temporal restricted boltzmann machine. pages 1601–1608, 2008.

[30] I. Sutskever, J. Martens, and G. Hinton. Generating text with recurrent neural networks. In ICML, 2011.

[31] G. W. Taylor and G. E. Hinton. Factored conditional restricted boltzmann machines for modeling motion style. In Proc. 26th Annual International Conference on Machine Learning, pages 1025–1032, 2009.

[32] T. Tieleman and G. Hinton. Lecture 6.5 - rmsprop: Divide the gradient by a running average of its recent magnitude, 2012.

[33] R. Williams and D. Zipser. Gradient-based learning algorithms for recurrent networks and their computational complexity. In Back-propagation: Theory, Architectures and Applications, pages 433–486. 1995.






目录
打赏
0
0
0
0
1045
分享
相关文章
dataframe获取指定列
dataframe获取指定列
984 0
采用zookeeper的EPHEMERAL节点机制实现服务集群的陷阱
在集群管理中使用Zookeeper的EPHEMERAL节点机制存在很多的陷阱,毛估估,第一次使用zk来实现集群管理的人应该有80%以上会掉坑,有些坑比较隐蔽,在网络问题或者异常的场景时才会出现,可能很长一段时间才会暴露出来。
14715 1
Docker【部署 07】镜像内安装tensorflow-gpu及调用GPU多个问题处理Could not find cuda drivers+unable to find libcuda.so...
Docker【部署 07】镜像内安装tensorflow-gpu及调用GPU多个问题处理Could not find cuda drivers+unable to find libcuda.so...
1090 0
智能化运维:提升IT服务效率的新引擎###
本文深入浅出地探讨了智能化运维(AIOps)如何革新传统IT运维模式,通过大数据、机器学习与自动化技术,实现故障预警、快速定位与处理,从而显著提升IT服务的稳定性和效率。不同于传统运维依赖人工响应,AIOps强调预测性维护与自动化流程,为企业数字化转型提供强有力的支撑。 ###
Opencv学习笔记(十一):opencv通过mp4保存为H.264视频
本文介绍了如何在OpenCV中通过使用cisco开源的openh264库来解决不支持H.264编码的问题,并提供了完整的代码示例。
688 0
Opencv学习笔记(十一):opencv通过mp4保存为H.264视频
|
11月前
Anaconda——添加清华源
Anaconda——添加清华源
420 0
Dask与Pandas:无缝迁移至分布式数据框架
【8月更文第29天】Pandas 是 Python 社区中最受欢迎的数据分析库之一,它提供了高效且易于使用的数据结构,如 DataFrame 和 Series,以及大量的数据分析功能。然而,随着数据集规模的增大,单机上的 Pandas 开始显现出性能瓶颈。这时,Dask 就成为了一个很好的解决方案,它能够利用多核 CPU 和多台机器进行分布式计算,从而有效地处理大规模数据集。
610 1
自定义 DataLoader 设计:满足特定需求的实现方案
【8月更文第29天】在深度学习中,数据加载和预处理是训练模型前的重要步骤。PyTorch 提供了 `DataLoader` 类来帮助用户高效地从数据集中加载数据。然而,在某些情况下,标准的 `DataLoader` 无法满足特定的需求,例如处理非结构化数据、进行复杂的预处理操作或是支持特定的数据格式等。这时就需要我们根据自己的需求来自定义 DataLoader。
222 1
AI助理
登录插画

登录以查看您的控制台资源

管理云资源
状态一览
快捷访问

你好,我是AI助理

可以解答问题、推荐解决方案等