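I. Environment Setup
The setup consists of the following steps.
1. Download PaddleSeg
Clone the latest PaddleSeg with git (version 2.6 at the time of writing).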
!git clone https://gitee.com/paddlepaddle/PaddleSeg.git --depth=1
2. Upgrade paddlepaddle-gpu
!pip install -U paddlepaddle-gpu >log.log
3. Install PaddleSeg
!pip install -e ~/PaddleSeg >log.log
4. Install the Matting dependencies
!pip install -r ~/PaddleSeg/Matting/requirements.txt >log.log
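II. Splitting Video into Image Frames
This step provides the following:
- Capture video from a camera and split it into image frames
- Read video from a file and split it into image frames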
# Import the required packages
import cv2
import os
import numpy as np
from PIL import Image
1. Video to images
# Convert a video into image frames
def video2image(video_path, img_path):
    cap = cv2.VideoCapture(video_path)
    index = 0
    while True:
        ret, frame = cap.read()
        if ret:
            cv2.imwrite('%s/%d.jpg' % (img_path, index), frame)
            index += 1
        else:
            break
    cap.release()
    print('Video cut finish, all %d frame' % index)
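The function above reads from a file path, while the list earlier also mentions a camera. A minimal sketch of a source-agnostic frame reader follows; `read_frames` and `max_frames` are hypothetical names introduced here, and the commented usage assumes that camera index 0 is the default device. A camera never reports end-of-stream on its own, which is why a frame limit is useful.

```python
def read_frames(cap, max_frames=None):
    """Yield frames from any object with an OpenCV-style read() method.

    Stops when read() reports failure or when max_frames frames were yielded.
    """
    count = 0
    while max_frames is None or count < max_frames:
        ret, frame = cap.read()
        if not ret:
            break
        yield frame
        count += 1

# Usage with a camera (assumption: index 0 is the default device):
# import cv2
# cap = cv2.VideoCapture(0)
# for i, frame in enumerate(read_frames(cap, max_frames=100)):
#     cv2.imwrite('1_img/%d.jpg' % i, frame)
# cap.release()
```

The same generator works for a file-backed `cv2.VideoCapture`, in which case `max_frames` can be left as `None`.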
2. Source frames
!mkdir 1_img
# Split the Alimu video into frames
video2image('video/1video.mp4', '1_img')
3. Background frames
!mkdir zjj
# Split the scenery video into frames
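III. Portrait Matting with PP-HumanMatting
1. Introduction to portrait matting
Image matting (fine-grained segmentation, background removal) is the technique of extracting the foreground from an image by estimating its color and opacity. It is used for background replacement, image compositing, and visual effects, and is widely used in the film industry. Each pixel of the image has a value that represents the opacity of the foreground at that position, called the alpha value. The set of all alpha values in an image is called the alpha matte, and extracting the part of the image covered by the matte separates the foreground.
PaddleSeg provides portrait matting models for several scenarios, so you can choose one according to your needs. Inference models are also provided and can be downloaded and deployed directly.
Recommended models:
- For accuracy: PP-Matting. Use PP-Matting-512 for low resolution and PP-Matting-1024 for high resolution.
- For speed: ModNet-MobileNetV2.
- For high-resolution (>2048) portraits with a simple background: PP-HumanMatting.
- When a trimap is provided: DIM-VGG16.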
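Once an alpha matte is available, background replacement reduces to a per-pixel weighted sum: output = alpha * foreground + (1 - alpha) * background. The sketch below works this out with NumPy on a made-up 1x2 image; the `composite` name and the pixel values are illustrative assumptions, not part of PaddleSeg.

```python
import numpy as np

def composite(fore_rgb, alpha, back_rgb):
    """Blend foreground over background with a per-pixel alpha in [0, 1]."""
    a = alpha[..., np.newaxis]  # (H, W) -> (H, W, 1) so it broadcasts over RGB
    out = a * fore_rgb + (1.0 - a) * back_rgb
    return np.rint(out).astype(np.uint8)

# Hypothetical 1x2 image: left pixel fully foreground, right pixel half transparent
fore = np.array([[[200, 100, 0], [200, 100, 0]]], dtype=np.float64)
back = np.array([[[0, 0, 100], [0, 0, 100]]], dtype=np.float64)
alpha = np.array([[1.0, 0.5]])
result = composite(fore, alpha, back)
# Left pixel keeps the foreground color; right pixel is the 50/50 mix (100, 50, 50)
```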
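The recommendations above can be written down as a small lookup, which is handy when the choice has to be made in a script. This is only a sketch: `recommend_model`, its parameter names, the order in which the rules are checked, and the 1024-pixel cut-off between "low" and "high" resolution are assumptions made for illustration. The >2048 rule in the list applies to portraits with a simple background, which this helper does not verify.

```python
def recommend_model(priority, has_trimap=False, long_side=None):
    """Return a model name following the recommendations listed above.

    priority:   "accuracy" or "speed" (hypothetical labels for this sketch)
    has_trimap: True if a trimap is supplied with each image
    long_side:  longer image edge in pixels, if known
    """
    if has_trimap:
        return "DIM-VGG16"
    if long_side is not None and long_side > 2048:
        return "PP-HumanMatting"
    if priority == "speed":
        return "ModNet-MobileNetV2"
    if priority == "accuracy":
        # Assumed cut-off: 1024 px and above counts as high resolution
        if long_side is not None and long_side >= 1024:
            return "PP-Matting-1024"
        return "PP-Matting-512"
    raise ValueError("priority must be 'accuracy' or 'speed'")
```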
| Model | Params(M) | FLOPs(G) | FPS | Checkpoint | Inference Model |
| --- | --- | --- | --- | --- | --- |
| PP-Matting-512 | 24.5 | 91.28 | 28.9 | model | model inference |
| PP-Matting-1024 | 24.5 | 91.28 | 13.4(1024X1024) | model | model inference |
| PP-HumanMatting | 63.9 | 135.8 (2048X2048) | 32.8(2048X2048) | model | model inference |
| ModNet-MobileNetV2 | 6.5 | 15.7 | 68.4 | model | model inference |
| ModNet-ResNet50_vd | 92.2 | 151.6 | 29.0 | model | model inference |
| ModNet-HRNet_W18 | 10.2 | 28.5 | 62.6 | model | model inference |
| DIM-VGG16 | 28.4 | 175.5 | 30.4 | model | model inference |
2. Download the pretrained model
%cd ~/PaddleSeg/Matting
!wget https://paddleseg.bj.bcebos.com/matting/models/human_matting-resnet34_vd.pdparams
3. Matting example
%cd ~/PaddleSeg/Matting
!python tools/predict.py \
    --config ./configs/human_matting/human_matting-resnet34_vd.yml \
    --model_path ./human_matting-resnet34_vd.pdparams \
    --image_path ~/aa/ \
    --save_dir ~/output/results
/home/aistudio/PaddleSeg/Matting
2022-08-25 19:27:30 [INFO]
---------------Config Information---------------
batch_size: 4
iters: 50000
lr_scheduler:
  boundaries:
  - 30000
  - 40000
  type: PiecewiseDecay
  values:
  - 0.001
  - 0.0001
  - 1.0e-05
model:
  backbone:
    pretrained: https://paddleseg.bj.bcebos.com/matting/models/ResNet34_vd_pretrained/model.pdparams
    type: ResNet34_vd
  if_refine: true
  pretrained: null
  type: HumanMatting
optimizer:
  momentum: 0.9
  type: sgd
  weight_decay: 4.0e-05
train_dataset:
  dataset_root: data/PPM-100
  mode: train
  train_file: train.txt
  transforms:
  - type: LoadImages
  - scale:
    - 0.3
    - 1.5
    size:
    - 2048
    - 2048
    type: RandomResize
  - crop_size:
    - 2048
    - 2048
    type: RandomCrop
  - type: RandomDistort
  - prob: 0.1
    type: RandomBlur
  - type: RandomHorizontalFlip
  - target_size:
    - 2048
    - 2048
    type: Padding
  - type: Normalize
  type: MattingDataset
val_dataset:
  dataset_root: data/PPM-100
  get_trimap: false
  mode: val
  transforms:
  - type: LoadImages
  - short_size: 2048
    type: ResizeByShort
  - mult_int: 128
    type: ResizeToIntMult
  - type: Normalize
  type: MattingDataset
  val_file: val.txt
------------------------------------------------
/home/aistudio/PaddleSeg/paddleseg/cvlibs/config.py:341: UserWarning: `dataset_root` is not found. Is it correct?
  warnings.warn("`dataset_root` is not found. Is it correct?")
W0825 19:27:30.818305 542 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 10.1
W0825 19:27:30.818343 542 gpu_resources.cc:91] device: 0, cuDNN Version: 7.6.
2022-08-25 19:27:32 [INFO] Loading pretrained model from https://paddleseg.bj.bcebos.com/matting/models/ResNet34_vd_pretrained/model.pdparams
Connecting to https://paddleseg.bj.bcebos.com/matting/models/ResNet34_vd_pretrained/model.pdparams
Downloading model.pdparams
[==================================================] 100.00%
2022-08-25 19:27:34 [INFO] There are 195/195 variables loaded into ResNet_vd.
2022-08-25 19:27:34 [INFO] Number of predict images = 1
2022-08-25 19:27:34 [INFO] Loading pretrained model from ./human_matting-resnet34_vd.pdparams
2022-08-25 19:27:35 [INFO] There are 486/486 variables loaded into HumanMatting.
2022-08-25 19:27:35 [INFO] Start to predict...
1/1 [==============================] - 2s 2s/step - preprocess_cost: 0.3376 - infer_cost cost: 1.0974 - postprocess_cost: 0.3286
4. Batch matting
%cd ~/PaddleSeg/Matting
!python tools/predict.py \
    --config ./configs/human_matting/human_matting-resnet34_vd.yml \
    --model_path ./human_matting-resnet34_vd.pdparams \
    --image_path ~/1_img/ \
    --save_dir ~/output/results
2022-08-25 19:28:25 [INFO] Number of predict images = 1024
2022-08-25 19:28:25 [INFO] Loading pretrained model from ./human_matting-resnet34_vd.pdparams
2022-08-25 19:28:25 [INFO] There are 486/486 variables loaded into HumanMatting.
2022-08-25 19:28:25 [INFO] Start to predict...
1024/1024 [==============================] - 745s 727ms/step - preprocess_cost: 0.3499 - infer_cost cost: 0.0576 - postprocess_cost: 0.31
IV. Compositing the Portrait and Background
1. Blending images
def BlendImg(fore_image, base_image, output_path):
    """
    Replace the background of the matted person image.
    fore_image: path of the foreground image (the matted person, with alpha)
    base_image: path of the background image
    """
    # Load the images; the foreground is resized to the background size
    base_image = Image.open(base_image).convert('RGB')
    fore_image = Image.open(fore_image).resize(base_image.size)

    # Alpha-weighted blend: the last channel of the foreground is the alpha matte
    scope_map = np.array(fore_image)[:, :, -1] / 255
    scope_map = scope_map[:, :, np.newaxis]
    scope_map = np.repeat(scope_map, repeats=3, axis=2)
    res_image = np.multiply(scope_map, np.array(fore_image)[:, :, :3]) + np.multiply((1 - scope_map), np.array(base_image))

    # Save the result
    res_image = Image.fromarray(np.uint8(res_image))
    res_image.save(output_path)


def BlendHumanImg(in_path, screen_path, out_path):
    # The results directory is assumed to hold two files per frame, hence // 2
    img_len = len(os.listdir(in_path)) // 2
    print(img_len)
    for i in range(img_len):
        img_path = os.path.join(in_path, '%d_rgba.png' % i)
        screen_path_file = os.path.join(screen_path, '%d.jpg' % i)
        output_path_img = os.path.join(out_path, '%d.png' % i)
        BlendImg(img_path, screen_path_file, output_path_img)


def init_canvas(width, height, color=(255, 255, 255)):
    # Create a solid-color canvas
    canvas = np.ones((height, width, 3), dtype="uint8")
    canvas[:] = color
    return canvas


def GetGreenScreen(width, height, out_path):
    # Write a solid green image (BGR order, as used by cv2.imwrite)
    canvas = init_canvas(width, height, color=(0, 255, 0))
    cv2.imwrite(out_path, canvas)
2. Merge the person and background frames
%cd ~
import shutil

blend_path = 'blend'
FrameSeg_Path = 'output/results'
GreenScreen_Path = 'zjj'
# Start from an empty output directory: remove any previous one, then recreate it
if os.path.exists(blend_path):
    shutil.rmtree(blend_path)
os.mkdir(blend_path)
BlendHumanImg(FrameSeg_Path, GreenScreen_Path, blend_path)
/home/aistudio
1024
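BlendHumanImg assumes that every index from 0 up to the frame count has a matching `%d_rgba.png` in the results directory, and opening a missing file raises an error partway through the loop. A small pre-check can list the gaps first. This is a sketch using only the standard library; `missing_rgba_frames` is a hypothetical helper, and the demo runs on a temporary directory rather than on the real output.

```python
import os
import tempfile

def missing_rgba_frames(results_dir, n_frames):
    """Return the frame indices whose '<index>_rgba.png' file is absent."""
    return [i for i in range(n_frames)
            if not os.path.isfile(os.path.join(results_dir, '%d_rgba.png' % i))]

# Self-contained demo: a temporary directory with frames 0 and 2, frame 1 left out
with tempfile.TemporaryDirectory() as tmp:
    for i in (0, 2):
        open(os.path.join(tmp, '%d_rgba.png' % i), 'wb').close()
    demo_missing = missing_rgba_frames(tmp, 3)
```

On the real data the call would be `missing_rgba_frames('output/results', 1024)`; an empty list means the blend loop can run over all frames.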
3. Assemble the video
import cv2
import os

def CombVideo(in_path, out_path, size):
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    # VideoWriter arguments:
    #   first argument  path of the file to write
    #   fourcc          codec
    #   fps             frame rate of the output video
    #   frameSize       frame size of the output video
    #   isColor         whether the frames are color or grayscale
    out = cv2.VideoWriter(out_path, fourcc, 30.0, size, True)
    files = os.listdir(in_path)
    for i in range(len(files)):
        filename = os.path.join(in_path, '%d.png' % i)
        # print(filename)
        img = cv2.imread(filename)
        out.write(img)  # write the frame
    out.release()
ComOut_Path = './combine.mp4'
blend_path = 'blend'
# The size must match the frame images and is given as (width, height); otherwise the resulting video is tiny and cannot be played
CombVideo(blend_path, ComOut_Path, (720, 1280))
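Because a size mismatch produces an unplayable file, it can be safer to derive the size from a frame than to hard-code it. The point to watch is the axis order: an image array is shaped (height, width, channels), while `cv2.VideoWriter` expects (width, height). A minimal sketch, where `frame_size` is a hypothetical helper and the demo array stands in for a real frame:

```python
import numpy as np

def frame_size(img):
    """Return (width, height) for an image array shaped (height, width, channels)."""
    height, width = img.shape[:2]
    return (width, height)

# Hypothetical portrait frame: 1280 rows (height) by 720 columns (width)
demo = np.zeros((1280, 720, 3), dtype=np.uint8)
size = frame_size(demo)  # (720, 1280), the order VideoWriter expects
```

With real data, the array would come from `cv2.imread` on one of the blended frames, and the result would be passed as the `size` argument of CombVideo.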
4. Extract the audio from the original video
!pip install moviepy -q
import moviepy.editor as mp

# Sample at 16 kHz to stay consistent with PaddleSpeech
def extract_audio(videos_file_path):
    my_clip = mp.VideoFileClip(videos_file_path, audio_fps=16000)
    if videos_file_path.split(".")[-1] in ('MP4', 'mp4'):
        # Only an upper-case '.MP4' suffix is stripped here; a lower-case
        # '.mp4' path is kept whole, giving e.g. 'video/1video.mp4_video.wav'
        p = videos_file_path.split('.MP4')[0]
        new_path = p + '_video.wav'
        my_clip.audio.write_audiofile(new_path)
        return new_path
extract_audio('video/1video.mp4')
MoviePy - Writing audio in video/1video.mp4_video.wav
MoviePy - Done.
'video/1video.mp4_video.wav'
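The output name above keeps the `.mp4` extension because splitting on `'.MP4'` is case-sensitive. If a cleaner name is wanted, `os.path.splitext` removes the extension whatever its case. This is an alternative sketch, not what the notebook ran; `audio_path_for` is a hypothetical helper, and adopting it means the audio path used in the next step has to be changed to match.

```python
import os

def audio_path_for(video_path):
    """Build '<name>_video.wav' next to the video, whatever the extension case."""
    root, _ext = os.path.splitext(video_path)
    return root + '_video.wav'

lower = audio_path_for('video/1video.mp4')
upper = audio_path_for('video/1video.MP4')
# Both resolve to 'video/1video_video.wav'
```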
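5. Add the original audio
from moviepy.editor import *

"""
Add a background audio track to the video
(multi-track audio mixing)
"""
# Video that needs the audio track
video_clip = VideoFileClip(r'combine.mp4')
# Extract the video's own audio and adjust its volume
# video_audio_clip = video_clip.audio.volumex(0.8)
# Audio track: the sound extracted from the original video
audio_clip = AudioFileClip(r'video/1video.mp4_video.wav').volumex(1)
# Loop the background audio so that its duration matches the video
# audio = afx.audio_loop(audio_clip, duration=video_clip.duration)
# Mix the video's own audio with the background audio
# audio_clip_add = CompositeAudioClip([video_audio_clip, audio])
# Attach the audio to the video
final_video = video_clip.set_audio(audio_clip)
# Save the finished video
final_video.write_videofile("video_result.mp4")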
Moviepy - Building video video_result.mp4.
MoviePy - Writing audio in video_resultTEMP_MPY_wvf_snd.mp3
MoviePy - Done.
Moviepy - Writing video video_result.mp4
Moviepy - Done !
Moviepy - video ready video_result.mp4
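CombVideo writes the frames at a fixed 30 fps, while the audio comes from the original video, so the two only stay in sync if the frame count divided by the frame rate equals the audio length. The arithmetic is sketched below; both function names are hypothetical, and the 1024-frame figure is taken from the run above.

```python
def video_duration(n_frames, fps):
    """Duration in seconds of n_frames played at fps frames per second."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return n_frames / fps

def fps_to_match(n_frames, audio_seconds):
    """Frame rate at which n_frames span exactly audio_seconds."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return n_frames / audio_seconds

duration = video_duration(1024, 30.0)  # about 34.13 seconds for 1024 frames
```

If the source video was not recorded at 30 fps, its frame rate can be read with `cap.get(cv2.CAP_PROP_FPS)` on the opened `cv2.VideoCapture` and used in place of the hard-coded value.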
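V. Notes
This approach produces fairly realistic composite video, largely thanks to the strong portrait matting of PP-HumanMatting. Note that in a few frames the person is only partially visible and cannot be recognized. In that case, human detection can be applied, and any frame in which no person is detected is dropped.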
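The note above suggests a human detector. As a lighter stand-in, and only as an assumption of this sketch, the alpha matte itself can serve as a proxy: a frame whose matte has almost no foreground pixels is treated as containing no person. `has_person`, `min_coverage`, and `threshold` are hypothetical names with illustrative values, not tuned settings, and this heuristic is not a replacement for a real detector.

```python
import numpy as np

def has_person(alpha, min_coverage=0.01, threshold=128):
    """Return True if enough pixels of an 8-bit alpha matte are foreground.

    A pixel counts as foreground when its alpha is at least `threshold`;
    the frame passes when that fraction reaches `min_coverage`.
    """
    alpha = np.asarray(alpha)
    coverage = float(np.mean(alpha >= threshold))
    return coverage >= min_coverage

# Synthetic mattes: one empty, one with a 60x20 foreground block (12% coverage)
empty = np.zeros((100, 100), dtype=np.uint8)
person = empty.copy()
person[20:80, 40:60] = 255
```

Frames for which `has_person` returns False would be skipped before blending, bearing in mind that dropping frames shortens the video relative to the audio.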





