Training a DreamBooth LoRA with PPDiffusers to Generate Chinese Shan Shui (Landscape Painting) Style [livingbody/Chinese_ShanShui_Style]
This tutorial walks through the whole workflow from the following two aspects.
- 1. Preparation
  - 1.1 Environment setup
  - 1.2 Hugging Face Space registration and login
- 2. How to train
  - 2.1 Upload the images
  - 2.2 Adjust the training parameters
  - 2.3 Pick the weights you are satisfied with and upload them to Huggingface
  - 2.4 Generate one more image
1. Preparation
1.1 Environment setup
Before getting started, we need to prepare the required environment. Run the command below to install the dependencies. To make sure the installation takes effect, restart the kernel after it finishes! (Note: this only needs to be run once!)
pip install "paddlenlp>=2.5.2" "ppdiffusers>=0.11.1" safetensors --user
# Run this cell to install the required dependencies!
!pip install "paddlenlp>=2.5.2" safetensors "ppdiffusers>=0.11.1" --user
from IPython.display import clear_output
clear_output()  # clear the lengthy install output
1.2 Hugging Face Space registration and login
The task requires uploading the model to Hugging Face, so we need to register and log in first.
- Register and log in: huggingface.co/join
- Obtain an access token
- Log in to the Huggingface Hub from AI Studio
Tips: to make uploading the weights easier later on, we need to log in to the Huggingface Hub. For more details, see the official documentation.
!git config --global credential.helper store
from huggingface_hub import login
login()
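If the interactive login widget is inconvenient (for example in a non-interactive run), huggingface_hub also lets you pass the access token directly. A minimal alternative; the token value below is a placeholder:

from huggingface_hub import login

# Log in with an access token copied from huggingface.co/settings/tokens
# (placeholder value; never commit a real token)
login(token="hf_xxxxxxxxxxxxxxxxxxxx")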
2. How to train the model and upload it to HF
The dataset consists of Chinese shan shui (landscape) paintings.
2.1 Upload the images
# Unzip the dataset
!unzip -qoa data/data107231/Chinese_art_dataset.zip -d Chinese_art_dataset

# Copy the shan shui style images into the training folder
!mkdir -p train_dataset
!cp Chinese_art_dataset/Chinese_art_dataset/style_images/shanshui* train_dataset/
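To sanity-check that the style images really ended up in train_dataset, here is a small sketch (folder and file pattern taken from the copy command above):

# Count and preview the copied training images
import glob
from PIL import Image

paths = sorted(glob.glob("train_dataset/shanshui*"))
print(f"{len(paths)} training images found")
Image.open(paths[0]).resize((256, 256))  # thumbnail of the first image, shown inline in the notebook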
2.2 Adjust the training parameters
During training we can try modifying the default parameters. Some of them are introduced below, grouped by purpose.
👉 Main parameters to modify:
- pretrained_model_name_or_path: the name or local path of the base model to fine-tune, e.g. "runwayml/stable-diffusion-v1-5"; see the PaddleNLP documentation for more models.
- instance_data_dir: the folder containing the training images; the images can be uploaded to the AI Studio project.
- instance_prompt: the prompt text used during training.
- resolution: the image resolution used during training; 512 is recommended.
- output_dir: the directory where the model is saved during training.
- checkpointing_steps: how often (in steps) the model is saved; defaults to 100.
- learning_rate: the learning rate used for training. When training with LoRA we need a larger learning rate, so we use 1e-4 here instead of 2e-6.
- max_train_steps: the maximum number of training steps; defaults to 500.
👉 Optional parameters:
- train_batch_size: the batch_size used during training; increase it if your GPU has plenty of memory; defaults to 4.
- gradient_accumulation_steps: the number of gradient-accumulation steps; if GPU memory is tight but you still want to simulate a larger batch, increase it; defaults to 1 (see the sketch after this list).
- seed: the random seed; set it to make the training run reproducible.
- lora_rank: the rank of the LoRA layers; defaults to 4, which yields a model of roughly 3.5 MB. You can raise it, e.g. to 32, 64, 128 or 256.
- lr_scheduler: the learning-rate schedule, e.g. "linear", "constant", "cosine".
- lr_warmup_steps: the number of warmup steps taken to reach the maximum learning rate before decay starts.
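The following is a rough illustration, not part of the training script: it shows how train_batch_size and gradient_accumulation_steps combine into the effective batch size, plus a back-of-the-envelope estimate of the LoRA weight-file size based on the "rank 4 ≈ 3.5 MB" figure above (the linear scaling with rank is an assumption).

# Rough helper sketch: effective batch size and approximate LoRA file size
train_batch_size = 2
gradient_accumulation_steps = 4
effective_batch_size = train_batch_size * gradient_accumulation_steps
print(f"effective batch size: {effective_batch_size}")  # 8

lora_rank = 128
approx_size_mb = 3.5 * lora_rank / 4  # assumes size grows roughly linearly with rank
print(f"approx. LoRA weight size at rank {lora_rank}: {approx_size_mb:.1f} MB")  # ~112 MB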
👉 Parameters used for evaluation during training:
- num_validation_images: how many images to generate at each validation; defaults to 4.
- validation_prompt: the prompt used to check how well training is going.
- validation_steps: how often (in steps) the model is evaluated; the progress bar shows which step you are currently at.
🔥 Tips: during training, every validation_steps the generated images are saved to {your output dir}/validation_images/{step}.jpg
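A small sketch for browsing those saved validation images; the folder name assumes output_dir="lora_outputs" as used in the command below:

# List and display the validation images written during training
import glob
from IPython.display import display
from PIL import Image

for path in sorted(glob.glob("lora_outputs/validation_images/*.jpg")):
    print(path)
    display(Image.open(path))  # show each validation image inline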
👉 Parameters for uploading the weights:
- push_to_hub: whether to upload the model to the huggingface hub; defaults to False.
- hub_token: the token needed to upload to the huggingface hub; if we are already logged in, it can be left empty.
- hub_model_id: the name of the model repo on the huggingface hub; if None, the name of output_dir is used as the repo name.
In the example below, since we have already logged in, we can switch on push_to_hub to upload the final trained model to huggingface.co as well.
Once push_to_hub is enabled, the weights are uploaded automatically after the run finishes to huggingface.co/{your username}/{your specified… , for example: huggingface.co/junnyu/lora…
!python train_dreambooth_lora.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
  --instance_data_dir="train_dataset" \
  --output_dir="lora_outputs" \
  --instance_prompt="Chinese_ShanShui_Style" \
  --resolution=512 \
  --train_batch_size=2 \
  --gradient_accumulation_steps=1 \
  --checkpointing_steps=100 \
  --learning_rate=1e-4 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=800 \
  --seed=0 \
  --lora_rank=4 \
  --push_to_hub=False \
  --validation_prompt="A little black cat is playing in the woods with Chinese_ShanShui_Style" \
  --validation_steps=100 \
  --num_validation_images=4
W0323 16:10:06.002939  5675 gpu_resources.cc:85] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W0323 16:10:06.007860  5675 gpu_resources.cc:115] device: 0, cuDNN Version: 8.2.
正在下载模型权重,请耐心等待。。。。。。。。。。
[2023-03-23 16:10:08,262] [ WARNING] - You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
Train Steps:  12%|█         | 100/800 [00:57<06:29, 1.80it/s, epoch=0016, lr=0.0001, step_loss=0.261]
Saved lora weights to lora_outputs/checkpoint-100
100%|███████████████████████████████████████████| 601/601 [00:00<00:00, 171kB/s]
100%|███████████████████████████████████████████| 342/342 [00:00<00:00, 113kB/s]
Train Steps:  25%|█▊        | 200/800 [02:16<05:41, 1.76it/s, epoch=0033, lr=0.0001, step_loss=0.0311]
Saved lora weights to lora_outputs/checkpoint-200
Train Steps:  38%|███       | 300/800 [03:35<04:38, 1.80it/s, epoch=0049, lr=0.0001, step_loss=0.113]
Saved lora weights to lora_outputs/checkpoint-300
Train Steps:  50%|████      | 400/800 [04:53<03:44, 1.78it/s, epoch=0066, lr=0.0001, step_loss=0.118]
Saved lora weights to lora_outputs/checkpoint-400
Train Steps:  62%|█████     | 500/800 [06:11<02:50, 1.76it/s, epoch=0083, lr=0.0001, step_loss=0.167]
Saved lora weights to lora_outputs/checkpoint-500
Train Steps:  75%|██████▊   | 600/800 [07:30<01:52, 1.78it/s, epoch=0099, lr=0.0001, step_loss=0.11]
Saved lora weights to lora_outputs/checkpoint-600
Train Steps:  88%|█████▎    | 700/800 [08:49<00:56, 1.78it/s, epoch=0116, lr=0.0001, step_loss=0.00746]
Saved lora weights to lora_outputs/checkpoint-700
Train Steps: 100%|███████   | 800/800 [10:08<00:00, 1.74it/s, epoch=0133, lr=0.0001, step_loss=0.0411]
Saved lora weights to lora_outputs/checkpoint-800
Model weights saved in lora_outputs/paddle_lora_weights.pdparams
Train Steps: 100%|███████   | 800/800 [11:05<00:00, 1.20it/s, epoch=0133, lr=0.0001, step_loss=0.0411]
2.3 Pick the weights you are satisfied with and upload them to Huggingface
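Before uploading, it can help to compare the checkpoints saved every checkpointing_steps and keep the one whose samples look best. Below is a minimal sketch along the lines of the inference code in 2.4; the checkpoint paths follow the training log above, and whether load_attn_procs accepts a checkpoint folder directly in this way is an assumption of this sketch:

from ppdiffusers import DiffusionPipeline, DPMSolverMultistepScheduler

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

prompt = "A little black cat is playing in the woods with Chinese_ShanShui_Style"
for step in (400, 800):
    # Load the LoRA weights saved at this checkpoint (path layout from the training log)
    pipe.unet.load_attn_procs(f"lora_outputs/checkpoint-{step}")
    image = pipe(prompt, num_inference_steps=25).images[0]
    image.save(f"compare_checkpoint_{step}.png")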
Parameter description:
- upload_dir: the folder we want to upload.
- repo_name: the name of the repo we upload to; it ends up at huggingface.co/{your username}/{your specified… for example: huggingface.co/junnyu/lora….
- pretrained_model_name_or_path: the base model used to train these weights.
- prompt: the prompt text that should be used together with these weights.
from utils import upload_lora_folder

upload_dir = "lora_outputs"                                        # the folder to upload
repo_name = "Chinese_ShanShui_Style"                               # the repo name to upload to
pretrained_model_name_or_path = "runwayml/stable-diffusion-v1-5"   # the base model used for training
prompt = "Chinese_ShanShui_Style"                                  # the prompt to use with these weights

upload_lora_folder(
    upload_dir=upload_dir,
    repo_name=repo_name,
    pretrained_model_name_or_path=pretrained_model_name_or_path,
    prompt=prompt,
)
Pushing to livingbody/Chinese_ShanShui_Style
Upload 1 LFS files:   0%|          | 0/1 [00:00<?, ?it/s]
paddle_lora_weights.pdparams:   0%|          | 0.00/3.23M [00:00<?, ?B/s]
2.4 Generate one more image
from ppdiffusers import DiffusionPipeline, DPMSolverMultistepScheduler
import paddle

# Load the base model and switch to the DPM-Solver multistep scheduler
pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
# Load the LoRA attention weights trained above
pipe.unet.load_attn_procs("lora_outputs/", from_hf_hub=True)

prompt = "2 man are walking in the woods with Chinese_ShanShui_Style"
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("demo.png")
[2023-03-23 17:07:14,171] [ INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/model_index.json [2023-03-23 17:07:14,176] [ INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/vae/model_state.pdparams [2023-03-23 17:07:14,179] [ INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/vae/config.json [2023-03-23 17:07:14,870] [ INFO] - Found /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/safety_checker/config.json [2023-03-23 17:07:14,875] [ INFO] - loading configuration file /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/safety_checker/config.json [2023-03-23 17:07:14,878] [ INFO] - Model config CLIPVisionConfig { "architectures": [ "StableDiffusionSafetyChecker" ], "attention_dropout": 0.0, "dropout": 0.0, "hidden_act": "quick_gelu", "hidden_size": 1024, "image_size": 224, "initializer_factor": 1.0, "initializer_range": 0.02, "intermediate_size": 4096, "layer_norm_eps": 1e-05, "model_type": "clip_vision_model", "num_attention_heads": 16, "num_channels": 3, "num_hidden_layers": 24, "paddlenlp_version": null, "patch_size": 14, "projection_dim": 768, "return_dict": true } [2023-03-23 17:07:14,987] [ INFO] - Found /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/safety_checker/model_state.pdparams [2023-03-23 17:07:17,520] [ INFO] - All model checkpoint weights were used when initializing StableDiffusionSafetyChecker. [2023-03-23 17:07:17,525] [ INFO] - All the weights of StableDiffusionSafetyChecker were initialized from the model checkpoint at runwayml/stable-diffusion-v1-5/safety_checker. If your task is similar to the task the model of the checkpoint was trained on, you can already use StableDiffusionSafetyChecker for predictions without further training. 
[2023-03-23 17:07:17,531] [ INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/tokenizer/vocab.json [2023-03-23 17:07:17,533] [ INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/tokenizer/merges.txt [2023-03-23 17:07:17,536] [ INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/tokenizer/added_tokens.json [2023-03-23 17:07:17,538] [ INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/tokenizer/special_tokens_map.json [2023-03-23 17:07:17,541] [ INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/tokenizer/tokenizer_config.json [2023-03-23 17:07:17,724] [ INFO] - Found /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/text_encoder/config.json [2023-03-23 17:07:17,728] [ INFO] - loading configuration file /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/text_encoder/config.json [2023-03-23 17:07:17,731] [ INFO] - Model config CLIPTextConfig { "_name_or_path": "openai/clip-vit-large-patch14", "architectures": [ "CLIPTextModel" ], "attention_dropout": 0.0, "bos_token_id": 0, "dropout": 0.0, "eos_token_id": 2, "hidden_act": "quick_gelu", "hidden_size": 768, "initializer_factor": 1.0, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-05, "max_position_embeddings": 77, "model_type": "clip_text_model", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 1, "paddlenlp_version": null, "projection_dim": 512, "return_dict": true, "torch_dtype": "float32", "transformers_version": "4.21.0.dev0", "vocab_size": 49408 } [2023-03-23 17:07:17,891] [ INFO] - Found /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/text_encoder/model_state.pdparams [2023-03-23 17:07:18,926] [ INFO] - All model checkpoint weights were used when initializing CLIPTextModel. [2023-03-23 17:07:18,930] [ INFO] - All the weights of CLIPTextModel were initialized from the model checkpoint at runwayml/stable-diffusion-v1-5/text_encoder. If your task is similar to the task the model of the checkpoint was trained on, you can already use CLIPTextModel for predictions without further training. [2023-03-23 17:07:18,936] [ INFO] - Found /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/feature_extractor/preprocessor_config.json [2023-03-23 17:07:18,940] [ INFO] - loading configuration file https://bj.bcebos.com/paddlenlp/models/community/runwayml/stable-diffusion-v1-5/feature_extractor/preprocessor_config.json from cache at /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/feature_extractor/preprocessor_config.json [2023-03-23 17:07:18,943] [ INFO] - size should be a dictionary on of the following set of keys: ({'width', 'height'}, {'shortest_edge'}, {'shortest_edge', 'longest_edge'}), got 224. Converted to {'shortest_edge': 224}. [2023-03-23 17:07:18,946] [ INFO] - crop_size should be a dictionary on of the following set of keys: ({'width', 'height'}, {'shortest_edge'}, {'shortest_edge', 'longest_edge'}), got 224. Converted to {'height': 224, 'width': 224}. 
[2023-03-23 17:07:18,949] [ INFO] - Image processor CLIPFeatureExtractor { "crop_size": { "height": 224, "width": 224 }, "do_center_crop": true, "do_convert_rgb": true, "do_normalize": true, "do_rescale": true, "do_resize": true, "feature_extractor_type": "CLIPFeatureExtractor", "image_mean": [ 0.48145466, 0.4578275, 0.40821073 ], "image_processor_type": "CLIPFeatureExtractor", "image_std": [ 0.26862954, 0.26130258, 0.27577711 ], "resample": 3, "rescale_factor": 0.00392156862745098, "size": { "shortest_edge": 224 } } [2023-03-23 17:07:18,951] [ INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/unet/model_state.pdparams [2023-03-23 17:07:18,954] [ INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/unet/config.json [2023-03-23 17:07:28,517] [ INFO] - Already cached /home/aistudio/.paddlenlp/models/runwayml/stable-diffusion-v1-5/scheduler/scheduler_config.json 0%| | 0/25 [00:00<?, ?it/s]
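Since the weights were also pushed to livingbody/Chinese_ShanShui_Style in step 2.3, they can be loaded straight from the Hugging Face Hub instead of the local folder. This variant of the code above simply swaps the local path for the Hub repo id shown in the upload log (the prompt is just an example):

from ppdiffusers import DiffusionPipeline, DPMSolverMultistepScheduler

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
# Pull the LoRA attention weights from the Hub repo instead of lora_outputs/
pipe.unet.load_attn_procs("livingbody/Chinese_ShanShui_Style", from_hf_hub=True)

# Example prompt; any text containing the trained token Chinese_ShanShui_Style works
image = pipe("a quiet mountain village with Chinese_ShanShui_Style", num_inference_steps=25).images[0]
image.save("demo_from_hub.png")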
The full code is available at: aistudio.baidu.com/aistudio/pr…