Firing Up the Furnace in the Cloud: Training and Inference with Bert-vits2-v2.2 on Google Colab



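If deep learning has one real barrier to entry, it is hardware cost. Training and running deep learning models takes substantial compute, and larger models and more complex datasets need even more of it to train effectively. In practice that means a high-performance device such as a graphics processing unit (GPU) or a dedicated accelerator such as a TPU, which puts off many people who have no NVIDIA card at home.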

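Google Colab is a free, cloud-based Jupyter notebook environment provided by Google. It lets beginners run machine learning and deep learning experiments with very little setup.

For all its convenience and free features, Google Colab has limits. The compute available to each session is capped, sessions are shut down automatically after a while, and usage is subject to Google's restrictions and policies.

For a broke fellow like me, Google Colab is a light in the dark. Even with the cap on training time it gets the job done, and beggars can't be choosers. In this article we train and run inference with the Bert-vits2-v2.2 project in the cloud on Google Colab, cloning the voice of Lilith, the character from Diablo.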

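Configuring the cloud machine

First, go to the Google Colab site:

https://colab.research.google.com/

Click New notebook and connect to a runtime:

Select T4 GPU as the hardware accelerator.

Then create a new code cell:

#@title Check the GPU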
!nvidia-smi

Run the cell. It returns:




Tue Dec 19 03:07:21 2023         
+---------------------------------------------------------------------------------------+  
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |  
|-----------------------------------------+----------------------+----------------------+  
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |  
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |  
|                                         |                      |               MIG M. |  
|=========================================+======================+======================|  
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |  
| N/A   54C    P8              10W /  70W |      0MiB / 15360MiB |      0%      Default |  
|                                         |                      |                  N/A |  
+-----------------------------------------+----------------------+----------------------+  

+---------------------------------------------------------------------------------------+  
| Processes:                                                                            |  
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |  
|        ID   ID                                                             Usage      |  
|=======================================================================================|  
|  No running processes found                                                           |  
+---------------------------------------------------------------------------------------+

Turing architecture and 16 GB of VRAM (nvidia-smi reports 15360 MiB) on a free GPU: Google really is being generous here.
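The same check can be scripted, for example to stop a notebook early when Colab hands out a different accelerator. The following is a minimal sketch and not part of the project: the helper names `parse_gpu_query` and `query_gpus` are made up for illustration, and the code assumes the CSV output of `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader`, which prints one line per GPU in the form `Tesla T4, 15360 MiB`.

```python
import subprocess


def parse_gpu_query(line: str) -> tuple[str, int]:
    """Split one CSV line such as 'Tesla T4, 15360 MiB' into (name, MiB)."""
    name, memory = (part.strip() for part in line.rsplit(",", 1))
    return name, int(memory.split()[0])


def query_gpus() -> list[tuple[str, int]]:
    """Ask nvidia-smi for every visible GPU (needs an NVIDIA driver)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return [parse_gpu_query(line) for line in out.splitlines() if line.strip()]
```

In a Colab cell, `query_gpus()` could be called right after connecting to the runtime, and the notebook stopped if the list is empty or the memory is lower than expected.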

Cloning the repository

Then create a new cell:

#@title Clone the repository
!git clone https://github.com/v3ucn/Bert-vits2-V2.2.git

The cell returns:

Cloning into 'Bert-vits2-V2.2'...  
remote: Enumerating objects: 310, done.  
remote: Counting objects: 100% (310/310), done.  
remote: Compressing objects: 100% (210/210), done.  
remote: Total 310 (delta 97), reused 294 (delta 81), pack-reused 0  
Receiving objects: 100% (310/310), 12.84 MiB | 18.95 MiB/s, done.  
Resolving deltas: 100% (97/97), done.

Installing the dependencies

Create a cell that installs the dependencies:

#@title Install the dependencies
%cd /content/Bert-vits2-V2.2  
!pip install -r requirements.txt

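Installing the dependencies takes a while, so be patient.

Downloading the required models

Next, download the required models, which include the BERT models and the emotion models:

#@title Download the required models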
!wget -P emotional/clap-htsat-fused/ https://huggingface.co/laion/clap-htsat-fused/resolve/main/pytorch_model.bin  
!wget -P emotional/wav2vec2-large-robust-12-ft-emotion-msp-dim/ https://huggingface.co/audeering/wav2vec2-large-robust-12-ft-emotion-msp-dim/resolve/main/pytorch_model.bin  
!wget -P bert/chinese-roberta-wwm-ext-large/ https://huggingface.co/hfl/chinese-roberta-wwm-ext-large/resolve/main/pytorch_model.bin  
!wget -P bert/bert-base-japanese-v3/ https://huggingface.co/cl-tohoku/bert-base-japanese-v3/resolve/main/pytorch_model.bin  
!wget -P bert/deberta-v3-large/ https://huggingface.co/microsoft/deberta-v3-large/resolve/main/pytorch_model.bin  
!wget -P bert/deberta-v3-large/ https://huggingface.co/microsoft/deberta-v3-large/resolve/main/pytorch_model.generator.bin  
!wget -P bert/deberta-v2-large-japanese/ https://huggingface.co/ku-nlp/deberta-v2-large-japanese/resolve/main/pytorch_model.bin

If you only need Chinese speech, downloading the first three models is enough.
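One way to avoid editing the download cell by hand is to generate the `wget` commands from a table. The sketch below reuses the target directories and URLs from the cell above. The language tags are an assumption inferred from the model names (only "the first three are enough for Chinese" comes from the article), and `wget_commands` is a hypothetical helper, not something the project provides.

```python
HF = "https://huggingface.co"

# (language tag, target directory, URL). A tag of None marks a model that
# this sketch treats as required for every language. Tags are assumptions.
MODELS = [
    (None, "emotional/clap-htsat-fused/",
     f"{HF}/laion/clap-htsat-fused/resolve/main/pytorch_model.bin"),
    (None, "emotional/wav2vec2-large-robust-12-ft-emotion-msp-dim/",
     f"{HF}/audeering/wav2vec2-large-robust-12-ft-emotion-msp-dim/resolve/main/pytorch_model.bin"),
    ("ZH", "bert/chinese-roberta-wwm-ext-large/",
     f"{HF}/hfl/chinese-roberta-wwm-ext-large/resolve/main/pytorch_model.bin"),
    ("JP", "bert/bert-base-japanese-v3/",
     f"{HF}/cl-tohoku/bert-base-japanese-v3/resolve/main/pytorch_model.bin"),
    ("EN", "bert/deberta-v3-large/",
     f"{HF}/microsoft/deberta-v3-large/resolve/main/pytorch_model.bin"),
    ("EN", "bert/deberta-v3-large/",
     f"{HF}/microsoft/deberta-v3-large/resolve/main/pytorch_model.generator.bin"),
    ("JP", "bert/deberta-v2-large-japanese/",
     f"{HF}/ku-nlp/deberta-v2-large-japanese/resolve/main/pytorch_model.bin"),
]


def wget_commands(language: str) -> list[str]:
    """Build the wget commands needed for one language tag."""
    return [
        f"wget -P {target} {url}"
        for tag, target, url in MODELS
        if tag in (None, language)
    ]


for command in wget_commands("ZH"):
    print(command)
```

For "ZH" this prints exactly the first three commands of the download cell.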

Downloading the base model files

Then download the Bert-vits2-v2.2 base models:

#@title Download the base model files

!wget -P Data/lilith/models/ https://huggingface.co/OedoSoldier/Bert-VITS2-2.2-CLAP/resolve/main/DUR_0.pth  
!wget -P Data/lilith/models/ https://huggingface.co/OedoSoldier/Bert-VITS2-2.2-CLAP/resolve/main/D_0.pth  
!wget -P Data/lilith/models/ https://huggingface.co/OedoSoldier/Bert-VITS2-2.2-CLAP/resolve/main/G_0.pth

Note that the base models must go into the character's models directory, and that they must be the 2.2 version.
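Since a failed download only shows up later as a training error, it can be worth confirming that all three files landed in the right place. This is a small sketch under the directory layout used above; `missing_base_models` is an illustrative helper name, and the check only looks for the files, not for their version.

```python
from pathlib import Path

# File names taken from the download cell above.
BASE_MODEL_FILES = ("DUR_0.pth", "D_0.pth", "G_0.pth")


def missing_base_models(model_dir: str) -> list[str]:
    """Return the base model files that are absent from model_dir."""
    directory = Path(model_dir)
    return [name for name in BASE_MODEL_FILES if not (directory / name).is_file()]
```

Calling `missing_base_models("Data/lilith/models")` from the project root should return an empty list before training starts.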

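Uploading the audio and resampling

Next, open the file browser, right-click the lilith directory and create a folder named raw, then right-click that folder, choose Upload, and upload the audio clips to the cloud:

Also right-click and upload the transcription file esd.list to the project's lilith directory: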

./Data/lilith/wavs/processed_0.wav|lilith|ZH|信仰,叫你们要否定心中的欲望。  
./Data/lilith/wavs/processed_1.wav|lilith|ZH|把你們囚禁在自己的身體裡  
./Data/lilith/wavs/processed_2.wav|lilith|ZH|圣修雅瑞之母  
./Data/lilith/wavs/processed_3.wav|lilith|ZH|我有你要的东西  
./Data/lilith/wavs/processed_4.wav|lilith|ZH|你渴望知识  
./Data/lilith/wavs/processed_5.wav|lilith|ZH|不惜带着孩子寻遍圣修雅瑞  
./Data/lilith/wavs/processed_6.wav|lilith|ZH|这话你真的相信吗  
./Data/lilith/wavs/processed_7.wav|lilith|ZH|不必再裝了  
./Data/lilith/wavs/processed_8.wav|lilith|ZH|你有問題,我有答案  
./Data/lilith/wavs/processed_9.wav|lilith|ZH|我洞悉整個宇宙的真理  
./Data/lilith/wavs/processed_10.wav|lilith|ZH|你看了那么多  
./Data/lilith/wavs/processed_11.wav|lilith|ZH|知道的卻那麼少  
./Data/lilith/wavs/processed_12.wav|lilith|ZH|打碎枷鎖  
./Data/lilith/wavs/processed_13.wav|lilith|ZH|你願意接受我的提議嗎?  
./Data/lilith/wavs/processed_14.wav|lilith|ZH|你很好奇想知道我  
./Data/lilith/wavs/processed_15.wav|lilith|ZH|為什麼饒了你的命  
./Data/lilith/wavs/processed_16.wav|lilith|ZH|你相信我吗  
./Data/lilith/wavs/processed_17.wav|lilith|ZH|很好,现在你只需要知道。  
./Data/lilith/wavs/processed_18.wav|lilith|ZH|我們要去見我兒子  
./Data/lilith/wavs/processed_19.wav|lilith|ZH|是的,但不止如此。  
./Data/lilith/wavs/processed_20.wav|lilith|ZH|他还是我计划的关键  
./Data/lilith/wavs/processed_21.wav|lilith|ZH|雖然我無法預料  
./Data/lilith/wavs/processed_22.wav|lilith|ZH|在新世界里你是否愿意站在我身边  
./Data/lilith/wavs/processed_23.wav|lilith|ZH|找出自己真正的本性  
./Data/lilith/wavs/processed_25.wav|lilith|ZH|可是我還是會為你  
./Data/lilith/wavs/processed_27.wav|lilith|ZH|但現在所有的可能性  
./Data/lilith/wavs/processed_28.wav|lilith|ZH|统统被夺走了  
./Data/lilith/wavs/processed_29.wav|lilith|ZH|奪走啊  
./Data/lilith/wavs/processed_30.wav|lilith|ZH|這把鑰匙能打開的不僅是地獄的大門  
./Data/lilith/wavs/processed_31.wav|lilith|ZH|也会开启我们的未来  
./Data/lilith/wavs/processed_32.wav|lilith|ZH|因為你的犧牲才得以實現到未來  
./Data/lilith/wavs/processed_33.wav|lilith|ZH|打碎枷鎖  
./Data/lilith/wavs/processed_34.wav|lilith|ZH|接受美麗的罪惡  
./Data/lilith/wavs/processed_35.wav|lilith|ZH|这就是第一批  
./Data/lilith/wavs/processed_36.wav|lilith|ZH|腦筋動得很快  
./Data/lilith/wavs/processed_37.wav|lilith|ZH|没错,我正是莉莉丝。

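For how to slice, transcribe, and label the audio, see the earlier article on local training with Bert-VITS2 V2.0.2, which clones Taylor Swift's voice speaking Chinese from 30 seconds of audio. For reasons of space, those steps are not repeated here.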
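Each line of the esd.list shown above has four fields separated by `|`: audio path, speaker, language, and text. A quick sanity check before uploading can catch malformed or duplicated lines. The sketch below is illustrative only; `parse_esd_line` and `summarize_esd` are hypothetical names, and the two sample lines are copied from the listing above.

```python
from collections import Counter


def parse_esd_line(line: str) -> tuple[str, ...]:
    """Split 'path|speaker|language|text' into its four fields."""
    fields = [field.strip() for field in line.split("|", 3)]
    if len(fields) != 4 or not all(fields):
        raise ValueError(f"malformed esd.list line: {line!r}")
    return tuple(fields)


def summarize_esd(lines: list[str]) -> tuple[int, list[str]]:
    """Return (number of entries, audio paths that occur more than once)."""
    entries = [parse_esd_line(line) for line in lines if line.strip()]
    counts = Counter(path for path, _, _, _ in entries)
    duplicates = sorted(path for path, n in counts.items() if n > 1)
    return len(entries), duplicates


sample = [
    "./Data/lilith/wavs/processed_0.wav|lilith|ZH|信仰,叫你们要否定心中的欲望。  ",
    "./Data/lilith/wavs/processed_1.wav|lilith|ZH|把你們囚禁在自己的身體裡  ",
]
print(summarize_esd(sample))  # (2, [])
```

Run against the full file, the entry count should equal the number of uploaded clips, 36 in this article.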

Once the audio clips and the transcription file have both uploaded successfully, create a new cell:

#@title Resample
!python3 resample.py --sr 44100 --in_dir ./Data/lilith/raw/ --out_dir ./Data/lilith/wavs/

This resamples the clips in raw to 44100 Hz and writes the results to wavs.
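To confirm the resampling took effect, the sample rate of each output file can be read back with Python's standard `wave` module. This is a sketch that assumes the files in wavs are plain PCM WAV files, which is the only format `wave` can open; `wrong_sample_rates` is an illustrative name.

```python
import wave
from pathlib import Path


def wrong_sample_rates(wav_dir: str, expected: int = 44100) -> dict[str, int]:
    """Map file name to sample rate for every .wav that is not at `expected` Hz."""
    mismatches = {}
    for path in sorted(Path(wav_dir).glob("*.wav")):
        with wave.open(str(path), "rb") as wav_file:
            rate = wav_file.getframerate()
        if rate != expected:
            mismatches[path.name] = rate
    return mismatches
```

An empty dictionary from `wrong_sample_rates("./Data/lilith/wavs/")` means every clip is at 44100 Hz.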

Preprocessing the label file

Next, create a new cell:

#@title Preprocess the label file
!python3 preprocess_text.py --transcription-path ./Data/lilith/esd.list --train-path ./Data/lilith/train.list --val-path ./Data/lilith/val.list --config-path ./Data/lilith/configs/config.json

The cell returns:

pytorch_model.bin: 100% 1.32G/1.32G [00:26<00:00, 49.4MB/s]  
spm.model: 100% 2.46M/2.46M [00:00<00:00, 131MB/s]  
The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.  
0it [00:00, ?it/s]  
[nltk_data] Downloading package averaged_perceptron_tagger to  
[nltk_data]     /root/nltk_data...  
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.  
[nltk_data] Downloading package cmudict to /root/nltk_data...  
[nltk_data]   Unzipping corpora/cmudict.zip.  
Ignored unknown kwarg option normalize  
Ignored unknown kwarg option normalize  
Ignored unknown kwarg option normalize  
Ignored unknown kwarg option normalize  
Some weights of EmotionModel were not initialized from the model checkpoint at ./emotional/wav2vec2-large-robust-12-ft-emotion-msp-dim and are newly initialized: ['wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original1', 'wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original0']  
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.  
  0% 0/36 [00:00<?, ?it/s]Building prefix dict from the default dictionary ...  
Dumping model to file cache /tmp/jieba.cache  
Loading model cost 0.686 seconds.  
Prefix dict has been built successfully.  
100% 36/36 [00:00<00:00, 40.28it/s]  
总重复音频数:0,总未找到的音频数:0  
训练集和验证集生成完成!

The lilith directory now contains the training set and the validation set, train.list and val.list.
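A quick way to inspect the split is to count the entries in both files and make sure no clip ended up in both. The sketch below assumes only that each line starts with the audio path followed by `|`, as in esd.list; the rest of the line is ignored. `audio_paths` and `split_report` are hypothetical helper names.

```python
from pathlib import Path


def audio_paths(list_file: str) -> list[str]:
    """Return the first '|'-separated field of every non-empty line."""
    text = Path(list_file).read_text(encoding="utf-8")
    return [line.split("|", 1)[0].strip() for line in text.splitlines() if line.strip()]


def split_report(train_list: str, val_list: str) -> tuple[int, int, list[str]]:
    """Return (train size, validation size, paths present in both files)."""
    train = audio_paths(train_list)
    val = audio_paths(val_list)
    return len(train), len(val), sorted(set(train) & set(val))
```

For the run in this article, the training log further down reports 32 training entries and 4 validation entries, so `split_report` should return those two numbers and an empty overlap list.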

Generating the BERT feature files

Next, create a new cell:

#@title Generate the BERT feature files
!python3 bert_gen.py --config-path ./Data/lilith/configs/config.json

The cell returns:

0% 0/36 [00:00<?, ?it/s]Some weights of the model checkpoint at ./bert/chinese-roberta-wwm-ext-large were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']  
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).  
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).  
Some weights of the model checkpoint at ./bert/chinese-roberta-wwm-ext-large were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']  
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).  
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).  
100% 36/36 [00:21<00:00,  1.67it/s]  
bert生成完毕!, 共有36个bert.pt生成!

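Count them: 36 in total, the same as the number of audio clips.

Generating the CLAP feature files

Finally, generate the CLAP emotion feature files:

#@title Generate the CLAP feature files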
#!wget -P emotional/clap-htsat-fused/ https://huggingface.co/laion/clap-htsat-fused/resolve/main/pytorch_model.bin  
!python3 clap_gen.py --config-path ./Data/lilith/configs/config.json

The cell returns:

/content/Bert-vits2-V2.2/clap_gen.py:34: FutureWarning: Pass sr=48000 as keyword args. From version 0.10 passing these as positional arguments will result in an error  
  audio = librosa.load(wav_path, 48000)[0]  
  0% 0/36 [00:00<?, ?it/s]/content/Bert-vits2-V2.2/clap_gen.py:34: FutureWarning: Pass sr=48000 as keyword args. From version 0.10 passing these as positional arguments will result in an error  
  audio = librosa.load(wav_path, 48000)[0]  
/content/Bert-vits2-V2.2/clap_gen.py:34: FutureWarning: Pass sr=48000 as keyword args. From version 0.10 passing these as positional arguments will result in an error  
  audio = librosa.load(wav_path, 48000)[0]  
/content/Bert-vits2-V2.2/clap_gen.py:34: FutureWarning: Pass sr=48000 as keyword args. From version 0.10 passing these as positional arguments will result in an error  
  audio = librosa.load(wav_path, 48000)[0]  
100% 36/36 [00:44<00:00,  1.23s/it]  
clap生成完毕!, 共有36个emo.pt生成!

Again 36, which means every clip needs one BERT feature file and one CLAP feature file.
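Rather than counting by eye, the one-to-one correspondence can be checked with a few lines of Python. The sketch below assumes the feature files sit next to each clip and are named by replacing `.wav` with `.bert.pt` and `.emo.pt`. That naming is inferred from the "bert.pt" and "emo.pt" wording in the logs above and should be verified against the actual directory. `missing_features` is an illustrative name.

```python
from pathlib import Path

# Assumed naming: processed_0.wav -> processed_0.bert.pt / processed_0.emo.pt
FEATURE_SUFFIXES = (".bert.pt", ".emo.pt")


def missing_features(wav_dir: str) -> dict[str, list[str]]:
    """Map each wav file name to the feature suffixes that are missing for it."""
    missing = {}
    for wav in sorted(Path(wav_dir).glob("*.wav")):
        absent = [
            suffix
            for suffix in FEATURE_SUFFIXES
            if not wav.with_name(wav.stem + suffix).is_file()
        ]
        if absent:
            missing[wav.name] = absent
    return missing
```

An empty dictionary means every clip has both feature files.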

Starting training

Everything is in place, so start training:

#@title Start training
!python3 train_ms.py

The cell returns:

2023-12-19 03:17:48.852966: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered  
2023-12-19 03:17:48.853057: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered  
2023-12-19 03:17:48.992178: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered  
2023-12-19 03:17:49.268092: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.  
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.  
2023-12-19 03:17:51.369993: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT  
加载config中的配置localhost  
加载config中的配置10086  
加载config中的配置1  
加载config中的配置0  
加载config中的配置0  
加载环境变量   
MASTER_ADDR: localhost,  
MASTER_PORT: 10086,  
WORLD_SIZE: 1,  
RANK: 0,  
LOCAL_RANK: 0  
12-19 03:17:55 INFO     | data_utils.py:66 | Init dataset...  
100% 32/32 [00:00<00:00, 51901.67it/s]  
12-19 03:17:55 INFO     | data_utils.py:81 | skipped: 0, total: 32  
12-19 03:17:55 INFO     | data_utils.py:66 | Init dataset...  
100% 4/4 [00:00<00:00, 34100.03it/s]  
12-19 03:17:55 INFO     | data_utils.py:81 | skipped: 0, total: 4  
Using noise scaled MAS for VITS2  
Using duration discriminator for VITS2  
INFO:models:Loaded checkpoint 'Data/lilith/models/DUR_0.pth' (iteration 0)  
ERROR:models:emb_g.weight is not in the checkpoint  
INFO:models:Loaded checkpoint 'Data/lilith/models/G_0.pth' (iteration 0)  
INFO:models:Loaded checkpoint 'Data/lilith/models/D_0.pth' (iteration 0)  
******************检测到模型存在,epoch为 1,gloabl step为 0*********************  
  0% 0/8 [00:00<?, ?it/s][W reducer.cpp:1346] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())  
INFO:models:Train Epoch: 1 [0%]  
INFO:models:[2.78941011428833, 2.49017596244812, 5.66870641708374, 25.731149673461914, 4.624840259552002, 3.6382224559783936, 0, 0.0002]  
Evaluating ...  
INFO:models:Saving model and optimizer state at iteration 1 to Data/lilith/models/G_0.pth  
INFO:models:Saving model and optimizer state at iteration 1 to Data/lilith/models/D_0.pth  
INFO:models:Saving model and optimizer state at iteration 1 to Data/lilith/models/DUR_0.pth  
100% 8/8 [00:40<00:00,  5.05s/it]  
INFO:models:====> Epoch: 1  
100% 8/8 [00:09<00:00,  1.20s/it]  
INFO:models:====> Epoch: 2  
100% 8/8 [00:09<00:00,  1.23s/it]  
INFO:models:====> Epoch: 3  
100% 8/8 [00:09<00:00,  1.24s/it]  
INFO:models:====> Epoch: 4  
100% 8/8 [00:09<00:00,  1.25s/it]  
INFO:models:====> Epoch: 5  
100% 8/8 [00:10<00:00,  1.26s/it]  
INFO:models:====> Epoch: 6  
 25% 2/8 [00:02<00:08,  1.41s/it]INFO:models:Train Epoch: 7 [25%]

Training now proceeds on top of the base models.
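The numbers in this log fit together, which helps when estimating how long it takes to reach a given checkpoint. The sketch below is plain arithmetic on values read from the output above: 8 iterations per epoch (the `0/8` progress bar) and a global step that starts at 0 in epoch 1. It assumes the global step advances by one per iteration; `locate_step` is a hypothetical helper name.

```python
def locate_step(global_step: int, steps_per_epoch: int) -> tuple[int, int]:
    """Return (1-based epoch, number of batches already done in that epoch)."""
    epoch = global_step // steps_per_epoch + 1
    batches_done = global_step % steps_per_epoch
    return epoch, batches_done


print(locate_step(50, 8))   # (7, 2)
print(locate_step(100, 8))  # (13, 4)
```

With 8 steps per epoch, step 50 falls in epoch 7 after 2 of 8 batches, which matches the `Train Epoch: 7 [25%]` line, and step 100 falls in epoch 13.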

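Online inference

After 100 training steps, we can take a first look at the results.

Make sure the model name in config.yml in the project root matches the checkpoint you want to load:

# webui: WebUI configuration
# Note: a space is required after ":"
webui:
  # Inference device
  device: "cuda"
  # Model path
  model: "models/G_100.pth"
  # Config file path
  config_path: "configs/config.json"
  # Port
  port: 7860
  # Whether to deploy publicly, open to the internet (set to true to get a public gradio.live URL)
  share: false
  # Whether to enable debug mode
  debug: false
  # Language identification library; options: langid, fastlid
  language_identification_library: "langid"

Here the model parameter is set to models/G_100.pth.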
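When training has run for a while, the newest generator checkpoint can be looked up instead of typed by hand. The following is a minimal sketch assuming checkpoints follow the `G_<step>.pth` naming seen in the logs and live in the character's models directory; `latest_generator` is an illustrative helper, and the returned string still has to be copied into config.yml manually.

```python
import re
from pathlib import Path
from typing import Optional


def latest_generator(model_dir: str) -> Optional[str]:
    """Return 'models/G_<step>.pth' for the highest step found, or None."""
    best_step = -1
    best_name = None
    for path in Path(model_dir).glob("G_*.pth"):
        match = re.fullmatch(r"G_(\d+)\.pth", path.name)
        if match and int(match.group(1)) > best_step:
            best_step = int(match.group(1))
            best_name = path.name
    return None if best_name is None else f"models/{best_name}"
```

Steps are compared as integers, so `G_100.pth` ranks above `G_50.pth` even though it sorts lower as a string.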

Then create a new cell:

#@title Start inference
!python3 webui.py

The cell returns:

Ignored unknown kwarg option normalize  
Ignored unknown kwarg option normalize  
Ignored unknown kwarg option normalize  
Ignored unknown kwarg option normalize  
Some weights of EmotionModel were not initialized from the model checkpoint at ./emotional/wav2vec2-large-robust-12-ft-emotion-msp-dim and are newly initialized: ['wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original0', 'wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original1']  
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.  
| numexpr.utils | INFO | NumExpr defaulting to 2 threads.  
/usr/local/lib/python3.10/dist-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.  
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")  
| utils | INFO | Loaded checkpoint 'Data/lilith/models/G_100.pth' (iteration 13)  
推理页面已开启!  
Running on local URL:  http://127.0.0.1:7860  
Running on public URL: https://40b8695e0a18b0e2eb.gradio.live

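The output lists a local address and a public address. Open the public address, https://40b8695e0a18b0e2eb.gradio.live, to run inference.

Finally, here is the link to the Google Colab notebook: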

https://colab.research.google.com/drive/1LgewU9jevSovP9NTuqTtoxDop3qeWWKK?usp=sharing

A toast to you all.
