textblob 使用中问题

2017-02-17 4337

版权

本文内容由阿里云实名注册用户自发贡献，版权归原作者所有，阿里云开发者社区不拥有其著作权，亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容，填写侵权投诉表单进行举报，一经查实，本社区将立刻删除涉嫌侵权内容。

简介： （1）找不到数据文件错误 Errors more Resource u'tokenizers/punkt/english.pickle' not found. Please use the NLTK Downloader to obtain the resource: >>> nltk.download() Searched in:

（1）找不到数据文件错误

Errors more 
Resource u'tokenizers/punkt/english.pickle' not found.  Please
    use the NLTK Downloader to obtain the resource:  >>>
    nltk.download()
    Searched in:
    - '/var/www/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
    - u'



Traceback (most recent call last):
  File "/var/www/CSCE-470-Anime-Recommender/py/app.py", line 40, in <module>
    cl = NaiveBayesClassifier(Functions.classify(UserData))
  File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 192, in __init__
    self.train_features = [(self.extract_features(d), c) for d, c in self.train_set]
  File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 169, in extract_features
    return self.feature_extractor(text, self.train_set)
  File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 81, in basic_extractor
    word_features = _get_words_from_dataset(train_set)
  File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 63, in _get_words_from_dataset
    return set(all_words)
  File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 62, in <genexpr>
    all_words = chain.from_iterable(tokenize(words) for words, _ in dataset)
  File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 59, in tokenize
    return word_tokenize(words, include_punc=False)
  File "/usr/local/lib/python2.7/dist-packages/textblob/tokenizers.py", line 72, in word_tokenize
    for sentence in sent_tokenize(text))
  File "/usr/local/lib/python2.7/dist-packages/textblob/base.py", line 64, in itokenize
    return (t for t in self.tokenize(text, *args, **kwargs))
  File "/usr/local/lib/python2.7/dist-packages/textblob/decorators.py", line 38, in decorated
    raise MissingCorpusError()
MissingCorpusError: 
Looks like you are missing some required data for this feature.

To download the necessary data, simply run

    python -m textblob.download_corpora

or use the NLTK downloader to download the missing data: http://nltk.org/data.html
If this doesn't fix the problem, file an issue at https://github.com/sloria/TextBlob/issues.

我本地没有taggers/averaged_perceptron_tagger/averaged_perceptron_tagger.pickle这个文件，打开本地nltk_data,发现还真是，只有下载了

解决方法：使用nltk下载

nltk.download()

下载过程中会有个弹窗，要自己选择下载的文件，在Models里第一个averaged_perceptron_tagger,然后点击下载，如果网络环境比较好的话，很快就可以下载完成了。

（2）翻译问题

textblob 的翻译程序在 /usr/lib/python2.7/site-packages/textblob/translate.py

他主要是使用了google的翻译，代码中的链接为

url = "http://translate.google.com/translate_a/t"

所以，国内是访问不料这个网址的，所以就翻译不了

文章标签：

Python

textblob 使用中问题

（1）找不到数据文件错误

（2）翻译问题

热门文章

最新文章

相关电子书

探索云世界

热门

云计算

大数据

云原生

人工智能

数据库

开发与运维

活动广场

任务中心

训练营

直播

乘风者计划

下载

镜像站

技术资料

textblob 使用中问题

（1）找不到数据文件错误

（2）翻译问题

热门文章

最新文章

相关电子书