The error message is as follows:
2023-06-08:11:43:32 INFO [se_processor.py:50] [SpeakerEmbeddingProcessor] try load it as se.model
Traceback (most recent call last):
  File "/home/ducheng/anaconda3/envs/modelscope-sambert-py37/lib/python3.7/site-packages/kantts/preprocess/se_processor/se_processor.py", line 41, in process
    '[SpeakerEmbeddingProcessor] se model loading error!!!')
Exception: [SpeakerEmbeddingProcessor] se model loading error!!!
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "PTTS-basemodel.py", line 33, in <module>
    trainer.train()
  File "/home/ducheng/anaconda3/envs/modelscope-sambert-py37/lib/python3.7/site-packages/modelscope/trainers/audio/tts_trainer.py", line 229, in train
    self.prepare_data()
  File "/home/ducheng/anaconda3/envs/modelscope-sambert-py37/lib/python3.7/site-packages/modelscope/trainers/audio/tts_trainer.py", line 208, in prepare_data
    se_model)
  File "/home/ducheng/anaconda3/envs/modelscope-sambert-py37/lib/python3.7/site-packages/modelscope/preprocessors/tts.py", line 37, in __call__
    speaker_name, target_lang, skip_script, se_model)
  File "/home/ducheng/anaconda3/envs/modelscope-sambert-py37/lib/python3.7/site-packages/modelscope/preprocessors/tts.py", line 57, in do_data_process
    targetLang, skip_script, se_model)
  File "/home/ducheng/anaconda3/envs/modelscope-sambert-py37/lib/python3.7/site-packages/kantts/preprocess/data_process.py", line 205, in process_data
    se_model,
  File "/home/ducheng/anaconda3/envs/modelscope-sambert-py37/lib/python3.7/site-packages/kantts/preprocess/se_processor/se_processor.py", line 52, in process
    map_location=device))
  File "/home/ducheng/anaconda3/envs/modelscope-sambert-py37/lib/python3.7/site-packages/torch/serialization.py", line 795, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/ducheng/anaconda3/envs/modelscope-sambert-py37/lib/python3.7/site-packages/torch/serialization.py", line 1002, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '\x08'.
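The message "[SpeakerEmbeddingProcessor] se model loading error!!!" indicates that an error occurred while loading the speaker embedding model. The final "_pickle.UnpicklingError: invalid load key, '\x08'" means that torch.load could not read the file as a PyTorch checkpoint.
I suggest checking that the model file path is correct and that you have downloaded the correct, complete model file. You can try loading the speaker embedding model with the following code: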
from kantts.preprocess.se_processor.se_processor import SpeakerEmbeddingProcessor
se_processor = SpeakerEmbeddingProcessor(model_path="path/to/se/model")
In this example, the SpeakerEmbeddingProcessor class loads the speaker embedding model, and the path to the model file is passed through the model_path parameter.
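Before retrying the load, it may help to look at what the file on disk actually is. The sketch below is my own addition, not part of kantts or ModelScope: describe_model_file is a hypothetical helper that reads the first bytes of a file and compares them with two signatures I am confident about (a zip archive starts with "PK\x03\x04", and a pickle stream written with protocol 2 or later starts with "\x80"). Anything else, such as a file starting with "\x08", is reported as unknown, which would be consistent with the UnpicklingError above.

```python
import os


def describe_model_file(path):
    """Best-effort guess at what kind of file `path` is (hypothetical helper)."""
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    with open(path, "rb") as f:
        head = f.read(4)
    # Zip archives (the container used by newer torch.save) start with PK\x03\x04.
    if head.startswith(b"PK\x03\x04"):
        return "zip"
    # Pickle streams written with protocol >= 2 start with the PROTO opcode \x80.
    if head[:1] == b"\x80":
        return "pickle"
    # Anything else is not something a plain pickle loader can read directly.
    return "unknown (first bytes: %r)" % head


if __name__ == "__main__":
    # "path/to/se/model" is the placeholder from the post; replace it with your file.
    print(describe_model_file("path/to/se/model"))
```

If this prints "missing" or "empty", the path or the download is the problem. If it prints "unknown", the file is probably in a different format than the loader expects, or the download was corrupted.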
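Here is a script that runs successfully in an Ubuntu environment on another server. I have tested it myself and it works.
The environment is as follows: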
Ubuntu 20.04 + Python3.8
NVIDIA-SMI 530.30.02 Driver Version: 530.30.02 CUDA Version: 12.1
#!/bin/bash
# Limit the CUDA allocator's split size to reduce fragmentation and avoid GPU out-of-memory (OOM) errors
cat>/etc/profile.d/proxy.sh<<EOF
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:32
EOF
# Update the system and install the required packages
apt update
apt upgrade -y
apt list --upgradable -a
apt-get install libsndfile1 sox nano wget curl git zip -y
apt autoclean -y
apt autoremove -y
# Load the environment variable into the current shell (it is also applied at every login)
source /etc/profile.d/proxy.sh
# Clone the official repository (bring your own proxy if GitHub is not reachable)
git clone https://github.com/modelscope/modelscope.git
# Install the packages required for the Audio domain; tested and working
cd modelscope
python -m pip install --upgrade pip
pip install -r requirements/tests.txt
pip install -r requirements/framework.txt -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html
pip install -r requirements/audio.txt -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html
pip install -r requirements/nlp.txt -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html
pip install .
pip install tts-autolabel kantts==0.0.1 -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html
pip install typeguard==2.13.3 pydantic==1.10.10 numpy==1.21.6
pip uninstall funasr -y
# Download the nltk data into the home directory
cd ~
wget https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/TTS/download_files/nltk_data.zip
unzip nltk_data.zip
# After this, you can follow the steps in the description to try it out
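To confirm that the allocator setting written by the script is visible to Python, a minimal sketch like the following can be run in a new shell. It assumes only the documented format of PYTORCH_CUDA_ALLOC_CONF (comma-separated option:value pairs); parse_alloc_conf is a hypothetical helper name of mine, not a PyTorch function.

```python
import os


def parse_alloc_conf(value):
    """Parse 'option:value,option2:value2' into a dict of strings."""
    options = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        key, _, val = item.partition(":")
        options[key.strip()] = val.strip()
    return options


# Read the variable from the current process environment.
conf = os.environ.get("PYTORCH_CUDA_ALLOC_CONF", "")
print(parse_alloc_conf(conf) or "PYTORCH_CUDA_ALLOC_CONF is not set in this process")
```

If the variable is reported as not set, the shell probably has not sourced /etc/profile.d/proxy.sh yet. Logging in again or sourcing the file by hand should fix that.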
# Next, run the following Python script to check the environment; if it raises no errors, the setup should be fine
from modelscope.tools import run_auto_label
from modelscope.metainfo import Trainers
from modelscope.trainers import build_trainer
from modelscope.utils.audio.audio_utils import TtsTrainType
import os
from modelscope.models.audio.tts import SambertHifigan
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
import torch
print(torch.__version__)
print(torch.cuda.is_available())
from modelscope.outputs import OutputKeys
text = '待合成文本'  # the text to synthesize (Chinese, since this is a zh-cn model)
model_id = 'damo/speech_sambert-hifigan_tts_zh-cn_16k'
sambert_hifigan_tts = pipeline(task=Tasks.text_to_speech, model=model_id)
output = sambert_hifigan_tts(input=text, voice='zhitian_emo')
wav = output[OutputKeys.OUTPUT_WAV]
with open('output.wav', 'wb') as f:
    f.write(wav)
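To check that the script above really produced playable audio, the header of output.wav can be read with Python's standard wave module. This is a small sketch of my own: wav_summary is a hypothetical helper, and it assumes the pipeline wrote a standard PCM WAV file. The "16k" in the model id suggests a 16000 Hz sample rate, but treat that as an assumption to verify rather than a guarantee.

```python
import os
import wave


def wav_summary(path):
    """Read a WAV header and return (channels, sample_rate_hz, duration_seconds)."""
    with wave.open(path, "rb") as w:
        channels = w.getnchannels()
        rate = w.getframerate()
        frames = w.getnframes()
    return channels, rate, frames / float(rate)


if __name__ == "__main__":
    # Only inspect the file if the synthesis step has already written it.
    if os.path.exists("output.wav"):
        channels, rate, seconds = wav_summary("output.wav")
        print("channels=%d rate=%d Hz duration=%.2f s" % (channels, rate, seconds))
```

A duration of zero, or an exception from wave.open, would mean that synthesis did not write valid audio even though the script finished without an error.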
kwargs = dict(
    model=pretrained_model_id,   # the model to fine-tune
    model_revision="v1.0.4",     # the key change: training only ran through after switching to v1.0.4
    work_dir=pretrain_work_dir,  # temporary working directory
    train_dataset=dataset_id,    # dataset id
    train_type=train_info        # training type and its parameters
)