Tutorials on training the Skip-thoughts vectors for features extraction of sentence.-阿里云开发者社区

Tutorials on training the Skip-thoughts vectors for features extraction of sentence.

2017-08-02 1816

版权

本文内容由阿里云实名注册用户自发贡献，版权归原作者所有，阿里云开发者社区不拥有其著作权，亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容，填写侵权投诉表单进行举报，一经查实，本社区将立刻删除涉嫌侵权内容。

简介： Tutorials on training the Skip-thoughts vectors for features extraction of sentence. 　　1. Send emails and download the training dataset.

Tutorials on training the Skip-thoughts vectors for features extraction of sentence.

　　1. Send emails and download the training dataset.

　　　　the dataset used in skip_thoughts vectors is from [BookCorpus]: http://yknzhu.wixsite.com/mbweb

　　　　first, you should send a email to the auther of this paper and ask for the link of this dataset. Then you will download the following files:

　　　　unzip these files in the current folders.

　　2. Open and download the tensorflow version code. 　　

　　　　Do as the following links: https://github.com/tensorflow/models/tree/master/skip_thoughts

　　　　Then, you will see the processing as follows:

　　　　[Attention] when you install the bazel, you need to install this software, but do not update it. Or, it may shown you some errors in the following operations.

　　3. Encoding Sentences :

　　　　(1). First, open a terminal and input "ipython" :

　　　　(2). input the following code to the terminal:

ipython  # Launch iPython.

In [0]:

# Imports.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import os.path
import scipy.spatial.distance as sd
from skip_thoughts import configuration
from skip_thoughts import encoder_manager

In [1]:
# Set paths to the model.
VOCAB_FILE = "/path/to/vocab.txt"
EMBEDDING_MATRIX_FILE = "/path/to/embeddings.npy"
CHECKPOINT_PATH = "/path/to/model.ckpt-9999"
# The following directory should contain files rt-polarity.neg and
# rt-polarity.pos.

　　　　For this moment, you already defined the environment, then, you need also do the followings:

In [2]:
# Set up the encoder. Here we are using a single unidirectional model.
# To use a bidirectional model as well, call load_model() again with
# configuration.model_config(bidirectional_encoder=True) and paths to the
# bidirectional model's files. The encoder will use the concatenation of
# all loaded models.
encoder = encoder_manager.EncoderManager()
encoder.load_model(configuration.model_config(),
                   vocabulary_file=VOCAB_FILE,
                   embedding_matrix_file=EMBEDDING_MATRIX_FILE,
                   checkpoint_path=CHECKPOINT_PATH)

In [3]:
# Load the movie review dataset.
data = [' This is my first attempt  to the tensorflow version skip_thought_vectors ... ']

　　The, it's time to get the 2400# features now.

In [4]:
# Generate Skip-Thought Vectors for each sentence in the dataset.
encodings = encoder.encode(data)
print(encodings)
print(encodings[0])

　　You can see the results of the algorithm as followings:

　　Now that, you have obtain the features of the input sentence. you can now load your texts to obtain the results. Come on ...

Tutorials on training the Skip-thoughts vectors for features extraction of sentence.

热门文章

最新文章

相关电子书

探索云世界

热门

云计算

大数据

云原生

人工智能

数据库

开发与运维

活动广场

任务中心

开发者评测

高校计划

乘风者计划

训练营

阿里云MVP

话题

直播

下载

镜像站

技术资料

插件

Tutorials on training the Skip-thoughts vectors for features extraction of sentence.

热门文章

最新文章

相关电子书