
modelscope-funasr: `bash finetune.sh` reports an error while downloading the model for training. What should I do?

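In modelscope-funasr, running `bash finetune.sh` fails while the model is still being downloaded for training. How can this be fixed? The model is speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch, and the output is: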

Downloading: 76%|███████▌ | 640M/840M [19:27<06:04, 575kB/s]

Downloading: 38%|███▊ | 320M/840M [19:27<31:36, 287kB/s]
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 4672) of binary: /home/dcb/anaconda3/envs/funasr/bin/python
Traceback (most recent call last):
  File "/home/dcb/anaconda3/envs/funasr/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/dcb/anaconda3/envs/funasr/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
    return f(*args, **kwargs)
  File "/home/dcb/anaconda3/envs/funasr/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
    run(args)
  File "/home/dcb/anaconda3/envs/funasr/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
    elastic_launch(
  File "/home/dcb/anaconda3/envs/funasr/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/dcb/anaconda3/envs/funasr/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
    raise ChildFailedError(

torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

../../../funasr/bin/train.py FAILED

Failures:
[1]:
time : 2024-05-11_11:29:56
host : dcb-Legion-Y9000P-IRX8
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 4673)
error_file:

traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure):
[0]:
time : 2024-05-11_11:29:56
host : dcb-Legion-Y9000P-IRX8
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 4672)
error_file:

traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Asked by 三分钟热度的鱼, 2024-05-16 08:44:42
1 answer
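  • If this is the first run and the model has not been downloaded yet, run on a single GPU first; once that works, switch to multiple GPUs. If you launch on multiple GPUs straight away, every GPU process tries to download the model from ModelScope, and the downloads conflict with each other. This answer was compiled from the DingTalk group "modelscope-funasr社区交流" (modelscope-funasr community exchange).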

    2024-05-16 11:02:36
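
The conflict described in the answer (several processes fetching the same files at once, which matches the two separate progress bars in the log) can also be avoided in code by letting only one process download while the others wait. The following is a minimal sketch of that pattern, not FunASR's or ModelScope's own mechanism. It assumes a Unix system (it uses `fcntl`), and the names `download_once`, `.download.lock` and `.download_complete` are hypothetical, introduced only for illustration.

```python
import fcntl
from pathlib import Path
from typing import Callable


def download_once(cache_dir: str, download_fn: Callable[[str], None]) -> bool:
    """Run download_fn(cache_dir) in at most one process.

    Other processes block on the lock and then reuse the finished download.
    Returns True if this call performed the download, False if it was
    already complete.
    """
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    done_marker = cache / ".download_complete"  # hypothetical marker file
    lock_path = cache / ".download.lock"        # hypothetical lock file

    with open(lock_path, "w") as lock_file:
        # Exclusive advisory lock: blocks until no other process holds it.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            if done_marker.exists():
                # Another process already finished the download.
                return False
            download_fn(str(cache))
            # Only mark completion after download_fn returned without error.
            done_marker.touch()
            return True
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

In a real training script, `download_fn` would wrap whatever call fetches the model (for example ModelScope's `snapshot_download`); that wiring is not shown in the thread and is left as an assumption here. Running once on a single GPU, as the answer suggests, achieves the same result without any code changes.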