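SambertHifigan personalized speech synthesis (Chinese, pretrained, 16k) throws an error even though I followed the model card instructions exactly.
Here is a script that runs successfully on a different Ubuntu server; I have tested it myself.
Environment: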
Ubuntu 20.04 + Python3.8
NVIDIA-SMI 530.30.02 Driver Version: 530.30.02 CUDA Version: 12.1
#!/bin/bash
# Set the GPU memory split size to prevent out-of-memory (OOM) errors
cat>/etc/profile.d/proxy.sh
# Next, run this Python script to check that the environment works; if it raises no errors, the setup should be fine
# The fine-tuning imports are unused below; importing them only checks that they are installed
from modelscope.tools import run_auto_label
from modelscope.metainfo import Trainers
from modelscope.trainers import build_trainer
from modelscope.utils.audio.audio_utils import TtsTrainType
import os
from modelscope.models.audio.tts import SambertHifigan
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
from modelscope.outputs import OutputKeys
import torch

# Confirm the PyTorch version and that CUDA is visible
print(torch.__version__)
print(torch.cuda.is_available())

text = '待合成文本'  # placeholder: replace with the Chinese text to synthesize
model_id = 'damo/speech_sambert-hifigan_tts_zh-cn_16k'
sambert_hifigan_tts = pipeline(task=Tasks.text_to_speech, model=model_id)
output = sambert_hifigan_tts(input=text, voice='zhitian_emo')
wav = output[OutputKeys.OUTPUT_WAV]

# Write the synthesized audio to disk
with open('output.wav', 'wb') as f:
    f.write(wav)
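If you want more than "no error" as confirmation, a quick check of the audio itself can help. Below is a minimal sketch using only the standard library; `describe_wav` is a name I made up, and it assumes the pipeline returns a complete WAV file (header included), which writing it straight to output.wav suggests but which I have not verified for every version.

```python
import io
import wave


def describe_wav(wav_bytes):
    """Return (sample_rate_hz, channels, duration_seconds) for in-memory WAV data.

    Hypothetical helper: assumes wav_bytes is a full WAV file including its header.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as reader:
        rate = reader.getframerate()
        channels = reader.getnchannels()
        duration = reader.getnframes() / float(rate)
    return rate, channels, duration
```

Calling `describe_wav(wav)` on the bytes from the test script should report a 16000 Hz sample rate for this 16k model, if the assumption above holds.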
Finally, one thing needs special attention: fine-tuning only works when the pretrained model is pinned to model_revision = 'v1.0.4'.
kwargs = dict(
    model=pretrained_model_id,   # the model to fine-tune
    model_revision='v1.0.4',     # this is the key: it only ran through after changing this to v1.0.4
    work_dir=pretrain_work_dir,  # temporary working directory
    train_dataset=dataset_id,    # dataset id
    train_type=train_info        # training types and their parameters
)
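For context, here is a rough sketch of how the pinned revision might be wired into a fine-tuning run. The helper names `make_finetune_kwargs` and `run_finetune` are my own, and the `build_trainer(Trainers.speech_kantts_trainer, default_args=...)` call is what I recall from the model card, so treat it as an assumption and check it against your installed modelscope version.

```python
def make_finetune_kwargs(model_id, work_dir, dataset_dir, train_info,
                         revision="v1.0.4"):
    """Assemble the trainer arguments with the model revision pinned.

    Hypothetical helper; the default revision is the one reported to work above.
    """
    return dict(
        model=model_id,             # the model to fine-tune
        model_revision=revision,    # pinned revision
        work_dir=work_dir,          # temporary working directory
        train_dataset=dataset_dir,  # dataset id or local data directory
        train_type=train_info,      # training types and their parameters
    )


def run_finetune(kwargs):
    """Start training. modelscope is imported lazily so the helper above
    can be used and inspected without it installed."""
    from modelscope.metainfo import Trainers
    from modelscope.trainers import build_trainer

    # Assumption: speech_kantts_trainer is the trainer used for this model
    trainer = build_trainer(Trainers.speech_kantts_trainer, default_args=kwargs)
    trainer.train()
```

Keeping the revision as a default argument makes it easy to try another revision later without editing the dict by hand.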