Playing with Chinese Word Segmentation in TensorFlow (Alibaba Cloud Developer Community)

2022-10-25

Summary: a quick, informal experiment with TensorFlow and AI.

Install TensorFlow

pip3 install --upgrade tensorflow
# The install fails: the environment has no C++ compiler, so gcc-c++ needs to be installed
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-xi2jayjc/grpcio/setup.py", line 263, in <module>
        if check_linker_need_libatomic():
      File "/tmp/pip-build-xi2jayjc/grpcio/setup.py", line 213, in check_linker_need_libatomic
        stderr=PIPE)
      File "/usr/lib64/python3.6/subprocess.py", line 729, in __init__
        restore_signals, start_new_session)
      File "/usr/lib64/python3.6/subprocess.py", line 1364, in _execute_child
        raise child_exception_type(errno_num, err_msg, err_filename)
    FileNotFoundError: [Errno 2] No such file or directory: 'c++': 'c++'

    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-xi2jayjc/grpcio/

Install the C++ compiler, then retry

yum install gcc-c++ -y
pip3 install --upgrade tensorflow
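
The traceback above comes down to the grpcio build step being unable to find a `c++` executable. As a minimal sketch, the same condition can be checked from Python before starting a long install. The helper name `find_cxx_compiler` and its list of candidate names are illustrative assumptions, not part of pip or TensorFlow.

```python
import shutil


def find_cxx_compiler(candidates=("c++", "g++", "clang++")):
    """Return the path of the first C++ compiler found on PATH, or None.

    Hypothetical helper for illustration; the candidate names are an assumption.
    """
    for name in candidates:
        path = shutil.which(name)  # looks the name up on PATH, as the shell would
        if path is not None:
            return path
    return None


compiler = find_cxx_compiler()
if compiler is None:
    print("No C++ compiler found; install one first (e.g. yum install gcc-c++ -y)")
else:
    print(f"C++ compiler found: {compiler}")
```

If the check reports no compiler, running the yum command before pip avoids the failed build shown in the traceback.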

Install ModelScope

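The installation takes a while. On a poor network connection the download is easily interrupted and the waits are long. The python-devel package must also be installed, otherwise the install fails with an error.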

pip3 install "modelscope[nlp]" -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html
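
Both problems mentioned here, the missing python-devel headers and the interrupted downloads, can be guarded against in a small script. The sketch below is only one possible approach: `has_python_headers` and `run_with_retries` are hypothetical helper names, and the retry count and delay are arbitrary assumptions.

```python
import os
import subprocess
import sys
import sysconfig
import time


def has_python_headers():
    """Return True if Python.h exists in this interpreter's include directory."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.isfile(os.path.join(include_dir, "Python.h"))


def run_with_retries(cmd, attempts=3, delay=5.0):
    """Run cmd until it exits with status 0; give up after `attempts` tries."""
    for attempt in range(1, attempts + 1):
        if subprocess.run(cmd).returncode == 0:
            return True
        print(f"attempt {attempt} of {attempts} failed", file=sys.stderr)
        if attempt < attempts:
            time.sleep(delay)  # pause before the next try
    return False


# Example call, commented out so that nothing is installed by accident:
# run_with_retries([
#     sys.executable, "-m", "pip", "install", "modelscope[nlp]",
#     "-f", "https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html",
# ])
print("Python.h present:", has_python_headers())
```

pip keeps a local cache of files it has already downloaded, so a retry after an interruption normally does not start from scratch.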

Test Chinese word segmentation

from modelscope.models import Model
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
from modelscope.preprocessors import TokenClassificationPreprocessor

# Option 1: name only the task and let the pipeline load that task's default model
pipeline_ins = pipeline(task=Tasks.word_segmentation)
result = pipeline_ins(input="今天天气不错，适合出去游玩")
print(result)
# Output: {'output': '今天 天气 不错 ， 适合 出去 游玩'}

# Option 2: load the model and its preprocessor explicitly and pass both to the pipeline
model_id = 'damo/nlp_structbert_word-segmentation_chinese-base'
model = Model.from_pretrained(model_id)
tokenizer = TokenClassificationPreprocessor(model.model_dir)
pipeline_ins = pipeline(task=Tasks.word_segmentation, model=model, preprocessor=tokenizer)
result = pipeline_ins(input="今天天气不错，适合出去游玩")
print(result)
# Output: {'output': '今天 天气 不错 ， 适合 出去 游玩'}
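
In the run shown here, the pipeline returns a dict whose `output` value is a single string with the words separated by spaces. Downstream code usually wants a list of tokens instead. A minimal sketch follows; `tokens_from_result` is a hypothetical helper, and the list branch is an assumption for ModelScope versions that might return a list rather than a string.

```python
def tokens_from_result(result):
    """Turn a word-segmentation result dict into a list of tokens."""
    output = result["output"]
    if isinstance(output, str):
        # Format seen in this article: words joined by whitespace
        return output.split()
    # Assumed alternative format: already a sequence of tokens
    return list(output)


# Sample copied from the output printed by the script
sample = {'output': '今天 天气 不错 ， 适合 出去 游玩'}
tokens = tokens_from_result(sample)
print(tokens)       # ['今天', '天气', '不错', '，', '适合', '出去', '游玩']
print(len(tokens))  # 7
```

The full-width comma is kept as its own token, because the segmenter emits punctuation as a separate word.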

Output

[root@centos_t ~]# python3.9 tensor_t.py 
2022-09-13 13:33:03,266 - modelscope - INFO - PyTorch version 1.12.1 Found.
2022-09-13 13:33:03,272 - modelscope - INFO - TensorFlow version 2.10.0 Found.
2022-09-13 13:33:03,272 - modelscope - INFO - Loading ast index from /root/.cache/modelscope/ast_indexer
2022-09-13 13:33:03,289 - modelscope - INFO - Loading done! Current index file version is 0.3.7, with md5 bd11637bf57887f415065ac194005c5b
2022-09-13 13:33:05.681480: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-13 13:33:06.028640: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-09-13 13:33:06.028713: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2022-09-13 13:33:06.094086: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2022-09-13 13:33:07.366202: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2022-09-13 13:33:07.366477: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2022-09-13 13:33:07.366516: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2022-09-13 13:33:10,799 - modelscope - INFO - File README.md already in cache, skip downloading!
2022-09-13 13:33:10,799 - modelscope - INFO - File config.json already in cache, skip downloading!
2022-09-13 13:33:10,799 - modelscope - INFO - File configuration.json already in cache, skip downloading!
2022-09-13 13:33:10,799 - modelscope - INFO - File pytorch_model.bin already in cache, skip downloading!
2022-09-13 13:33:10,799 - modelscope - INFO - File cws_model.png already in cache, skip downloading!
2022-09-13 13:33:10,799 - modelscope - INFO - File vocab.txt already in cache, skip downloading!
2022-09-13 13:33:10,806 - modelscope - INFO - initialize model from /root/.cache/modelscope/hub/damo/nlp_structbert_word-segmentation_chinese-base
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'BertTokenizer'. 
The class this function is called from is 'SbertTokenizer'.
/usr/local/lib/python3.9/site-packages/transformers/modeling_utils.py:713: FutureWarning: The `device` argument is deprecated and will be removed in v5 of Transformers.
  warnings.warn(
{'output': '今天 天气 不错 ， 适合 出去 游玩'}
2022-09-13 13:33:18,018 - modelscope - INFO - File README.md already in cache, skip downloading!
2022-09-13 13:33:18,018 - modelscope - INFO - File config.json already in cache, skip downloading!
2022-09-13 13:33:18,018 - modelscope - INFO - File configuration.json already in cache, skip downloading!
2022-09-13 13:33:18,019 - modelscope - INFO - File pytorch_model.bin already in cache, skip downloading!
2022-09-13 13:33:18,019 - modelscope - INFO - File cws_model.png already in cache, skip downloading!
2022-09-13 13:33:18,019 - modelscope - INFO - File vocab.txt already in cache, skip downloading!
2022-09-13 13:33:18,019 - modelscope - INFO - initialize model from /root/.cache/modelscope/hub/damo/nlp_structbert_word-segmentation_chinese-base
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'BertTokenizer'. 
The class this function is called from is 'SbertTokenizer'.
{'output': '今天 天气 不错 ， 适合 出去 游玩'}
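
Many of the lines above are TensorFlow's native startup messages about missing CUDA and TensorRT libraries, which are expected on a machine without a GPU. One way to quiet them, sketched below, is to set the `TF_CPP_MIN_LOG_LEVEL` environment variable before TensorFlow is first imported. The value 2 is an assumption chosen here to hide INFO and WARNING lines. It does not affect the modelscope or transformers messages, which are emitted from Python.

```python
import os

# 0 = show everything, 1 = hide INFO, 2 = also hide WARNING, 3 = also hide ERROR.
# This must run before anything imports tensorflow.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

# Imports that may pull in TensorFlow go after this point, for example:
# from modelscope.pipelines import pipeline
print(os.environ["TF_CPP_MIN_LOG_LEVEL"])
```

The same effect can be had without editing the script by exporting the variable in the shell before running it.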
