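I. Multi-label image classification on NUS-WIDE-SCENE with PaddleClas
1. Overview
This project uses PaddleClas to walk through training, evaluation, and prediction for multi-label classification.
2. Dataset
The dataset is a subset of NUS-WIDE-SCENE. The task is to classify each image against 33 labels, and an image can carry several of them at once.
- Subset download: paddle-imagenet-models-name.bj.bcebos.com/data/NUS-SC…
- NUS-WIDE-SCENE dataset download: lms.comp.nus.edu.sg/wp-content/…
The labels are: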
airport beach bridge buildings castle cityscape clouds frost garden glacier grass harbor house lake moon mountain nighttime ocean plants railroad rainbow reflection road sky snow street sunset temple town valley water waterfall window
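II. Installing PaddleClas
1. Download PaddleClas
Cloning from Gitee is faster, and --depth=1 makes a shallow clone that fetches only the latest revision of the default branch.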
!git clone https://gitee.com/paddlepaddle/PaddleClas.git --depth=1
Cloning into 'PaddleClas'...
remote: Enumerating objects: 2019, done.
remote: Counting objects: 100% (2019/2019), done.
remote: Compressing objects: 100% (1256/1256), done.
remote: Total 2019 (delta 1001), reused 1333 (delta 725), pack-reused 0
Receiving objects: 100% (2019/2019), 86.17 MiB | 7.50 MiB/s, done.
Resolving deltas: 100% (1001/1001), done.
Checking connectivity... done.
2. Install PaddleClas
This step installs the required dependencies and then PaddleClas itself in editable mode.
!pip install -r ~/PaddleClas/requirements.txt >log.log
!pip install -e ~/PaddleClas >log.log
III. Preparing the dataset
1. Download and extract
This step downloads the dataset archive and extracts it.
%cd ~/PaddleClas
!mkdir dataset/NUS-WIDE-SCENE
%cd dataset/NUS-WIDE-SCENE
!wget https://paddle-imagenet-models-name.bj.bcebos.com/data/NUS-SCENE-dataset.tar
!tar -xf NUS-SCENE-dataset.tar
/home/aistudio/PaddleClas
/home/aistudio/PaddleClas/dataset/NUS-WIDE-SCENE
--2022-06-20 11:10:05--  https://paddle-imagenet-models-name.bj.bcebos.com/data/NUS-SCENE-dataset.tar
Resolving paddle-imagenet-models-name.bj.bcebos.com (paddle-imagenet-models-name.bj.bcebos.com)... 182.61.200.195, 182.61.200.229, 2409:8c04:1001:1002:0:ff:b001:368a
Connecting to paddle-imagenet-models-name.bj.bcebos.com (paddle-imagenet-models-name.bj.bcebos.com)|182.61.200.195|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 810639872 (773M) [application/x-tar]
Saving to: ‘NUS-SCENE-dataset.tar’

NUS-SCENE-dataset.t 100%[===================>] 773.09M  36.1MB/s    in 19s

2022-06-20 11:10:24 (39.7 MB/s) - ‘NUS-SCENE-dataset.tar’ saved [810639872/810639872]
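2. Inspect the data
In the annotation file, the first column is the image file name. The 33 comma-separated values that follow correspond, in order, to the labels airport beach bridge buildings castle cityscape clouds frost garden glacier grass harbor house lake moon mountain nighttime ocean plants railroad rainbow reflection road sky snow street sunset temple town valley water waterfall window. A value of 1 means the label applies to the image and 0 means it does not.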
!head NUS-SCENE-dataset/multilabel_train_list.txt
0045_845243484.jpg 0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
0229_433478352.jpg 0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0
0322_2093820806.jpg 0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0
0463_2483322510.jpg 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0517_2283920455.jpg 0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,1,0,0,0,1,0,1,0,0,0,0,0,0,1,0,0
0006_2074187535.jpg 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0034_509197470.jpg 0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0064_2591840477.jpg 0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0
0208_465647043.jpg 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1
0211_2490834700.jpg 0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0
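The row format described above can be read with a few lines of Python. This is a minimal sketch, not code from PaddleClas: `parse_annotation_line` and `active_indices` are illustrative names, and the sample row is assumed to use a tab between the file name and the labels (the code splits on any whitespace, so a space works too).

```python
def parse_annotation_line(line):
    """Split one annotation row into (file_name, multi_hot_labels)."""
    # The file name and the label string are separated by whitespace.
    file_name, label_str = line.strip().split(maxsplit=1)
    labels = [int(v) for v in label_str.split(",")]
    return file_name, labels


def active_indices(labels):
    """Return the positions of the labels that are set to 1."""
    return [i for i, v in enumerate(labels) if v == 1]


# First row of the listing above (separator assumed to be a tab).
sample = "0045_845243484.jpg\t0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0"
name, labels = parse_annotation_line(sample)
print(name, len(labels), active_indices(labels))
```

The returned indices are zero-based positions in the label list, so they can be compared directly with the class_ids that the prediction tools report.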
from PIL import Image
%cd ~
img = Image.open("PaddleClas/dataset/NUS-WIDE-SCENE/NUS-SCENE-dataset/images/0006_2074187535.jpg")
img.show()
/home/aistudio
IV. Training the model
1. Training configuration
The configuration file is PaddleClas/ppcls/configs/quick_start/professional/MobileNetV1_multilabel.yaml
# global configs
Global:
  checkpoints: null
  pretrained_model: null
  output_dir: ./output/
  device: gpu
  save_interval: 1
  eval_during_train: True
  eval_interval: 1
  epochs: 10
  print_batch_step: 10
  use_visualdl: True
  # used for static mode and model export
  image_shape: [3, 224, 224]
  save_inference_dir: ./inference
  use_multilabel: True

# model architecture
Arch:
  name: MobileNetV1
  class_num: 33
  pretrained: True

# loss function config for training/eval process
Loss:
  Train:
    - MultiLabelLoss:
        weight: 1.0
  Eval:
    - MultiLabelLoss:
        weight: 1.0

Optimizer:
  name: Momentum
  momentum: 0.9
  lr:
    name: Cosine
    learning_rate: 0.1
  regularizer:
    name: 'L2'
    coeff: 0.00004

# data loader for train and eval
DataLoader:
  Train:
    dataset:
      name: MultiLabelDataset
      image_root: ./dataset/NUS-WIDE-SCENE/NUS-SCENE-dataset/images/
      cls_label_path: ./dataset/NUS-WIDE-SCENE/NUS-SCENE-dataset/multilabel_train_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - RandCropImage:
            size: 224
        - RandFlipImage:
            flip_code: 1
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
    sampler:
      name: DistributedBatchSampler
      batch_size: 256
      drop_last: False
      shuffle: True
    loader:
      num_workers: 0
      use_shared_memory: True

  Eval:
    dataset:
      name: MultiLabelDataset
      image_root: ./dataset/NUS-WIDE-SCENE/NUS-SCENE-dataset/images/
      cls_label_path: ./dataset/NUS-WIDE-SCENE/NUS-SCENE-dataset/multilabel_test_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - ResizeImage:
            resize_short: 256
        - CropImage:
            size: 224
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
    sampler:
      name: DistributedBatchSampler
      batch_size: 256
      drop_last: False
      shuffle: False
    loader:
      num_workers: 0
      use_shared_memory: True

Infer:
  infer_imgs: ./deploy/images/0517_2715693311.jpg
  batch_size: 10
  transforms:
    - DecodeImage:
        to_rgb: True
        channel_first: False
    - ResizeImage:
        resize_short: 256
    - CropImage:
        size: 224
    - NormalizeImage:
        scale: 1.0/255.0
        mean: [0.485, 0.456, 0.406]
        std: [0.229, 0.224, 0.225]
        order: ''
    - ToCHWImage:
  PostProcess:
    name: MultiLabelTopk
    topk: 5
    class_id_map_file: None

Metric:
  Train:
    - HammingDistance:
    - AccuracyScore:
  Eval:
    - HammingDistance:
    - AccuracyScore:
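To make the NormalizeImage settings in this configuration concrete, the sketch below applies the same scale, mean, and std to an image array with NumPy. It is an illustration of the arithmetic only, assuming an HWC uint8 RGB input; `normalize_image` and `to_chw` are hypothetical helper names, not PaddleClas functions.

```python
import numpy as np

# Values copied from the NormalizeImage entries in the configuration.
SCALE = 1.0 / 255.0
MEAN = np.array([0.485, 0.456, 0.406], dtype="float32")
STD = np.array([0.229, 0.224, 0.225], dtype="float32")


def normalize_image(img_hwc_uint8):
    """Scale an HWC uint8 image to [0, 1], then standardize each channel."""
    img = img_hwc_uint8.astype("float32") * SCALE
    return (img - MEAN) / STD


def to_chw(img_hwc):
    """Reorder HWC to CHW, matching image_shape: [3, 224, 224]."""
    return img_hwc.transpose((2, 0, 1))


dummy = np.full((224, 224, 3), 255, dtype="uint8")  # an all-white test image
out = to_chw(normalize_image(dummy))
print(out.shape, out[:, 0, 0])
```

Broadcasting applies MEAN and STD along the last (channel) axis, which is why the normalization is done before the transpose to CHW.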
2. Bug fix
The handling of label_ratio in PaddleClas/ppcls/data/dataloader/multilabel_dataset.py has a bug. The fixed file is shown below.
from __future__ import print_function

import numpy as np
import os
import cv2

from ppcls.data.preprocess import transform
from ppcls.utils import logger

from .common_dataset import CommonDataset


class MultiLabelDataset(CommonDataset):
    def _load_anno(self, label_ratio=False):
        assert os.path.exists(self._cls_path)
        assert os.path.exists(self._img_root)
        self.images = []
        self.labels = []
        with open(self._cls_path) as fd:
            lines = fd.readlines()
            for l in lines:
                # each row is "<file name>\t<comma-separated 0/1 labels>"
                l = l.strip().split("\t")
                self.images.append(os.path.join(self._img_root, l[0]))

                labels = l[1].split(',')
                labels = [np.int64(i) for i in labels]

                self.labels.append(labels)
                assert os.path.exists(self.images[-1])
        # Fix: store the label_ratio flag so that __getitem__ can test it.
        self.label_ratio = label_ratio
        if label_ratio:
            return np.array(self.labels).mean(0).astype("float32")

    def __getitem__(self, idx):
        try:
            with open(self.images[idx], 'rb') as f:
                img = f.read()
            if self._transform_ops:
                img = transform(img, self._transform_ops)
            img = img.transpose((2, 0, 1))
            label = np.array(self.labels[idx]).astype("float32")
            # Fix: keep a truthiness test here. The flag defaults to False,
            # so comparing it against None would give the wrong branch.
            # if self.label_ratio is not None:
            if self.label_ratio:
                return (img, np.array([label, self.label_ratio]))
            else:
                return (img, label)

        except Exception as ex:
            logger.error("Exception occured when parse line: {} with msg: {}".
                         format(self.images[idx], ex))
            rnd_idx = np.random.randint(self.__len__())
            return self.__getitem__(rnd_idx)
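The expression `np.array(self.labels).mean(0)` in `_load_anno` computes, for each label, the fraction of images in which that label is set. A small standalone sketch with made-up data shows the result; `compute_label_ratio` and the toy rows are illustrative only and do not come from the dataset.

```python
import numpy as np


def compute_label_ratio(labels):
    """Per-label fraction of positive samples (column-wise mean)."""
    return np.array(labels).mean(0).astype("float32")


# Four hypothetical images annotated with three labels each.
toy_labels = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [1, 0, 0],
]
ratio = compute_label_ratio(toy_labels)
print(ratio)  # one value per label, each between 0 and 1
```

With the default `label_ratio=False` used in this article, that branch is never taken and `__getitem__` returns plain `(img, label)` pairs.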
!cp multilabel_dataset.py PaddleClas/ppcls/data/dataloader/multilabel_dataset.py -rf
3. Start training
%cd ~/PaddleClas/
!python3 tools/train.py \
    -c ./ppcls/configs/quick_start/professional/MobileNetV1_multilabel.yaml
Training log
[2022/06/20 12:03:24] ppcls INFO: [Train][Epoch 10/10][Avg]HammingDistance: 0.05218, AccuracyScore: 0.94782, MultiLabelLoss: 0.13593, loss: 0.13593
[2022/06/20 12:03:24] ppcls INFO: [Eval][Epoch 10][Iter: 0/69]MultiLabelLoss: 0.10388, loss: 0.10388, HammingDistance: 0.03741, AccuracyScore: 0.96259, batch_cost: 0.89574s, reader_cost: 0.81882, ips: 285.79827 images/sec
[2022/06/20 12:03:32] ppcls INFO: [Eval][Epoch 10][Iter: 10/69]MultiLabelLoss: 0.13586, loss: 0.13586, HammingDistance: 0.05322, AccuracyScore: 0.94678, batch_cost: 0.79895s, reader_cost: 0.72159, ips: 320.42090 images/sec
[2022/06/20 12:03:41] ppcls INFO: [Eval][Epoch 10][Iter: 20/69]MultiLabelLoss: 0.12612, loss: 0.12612, HammingDistance: 0.05114, AccuracyScore: 0.94886, batch_cost: 0.82050s, reader_cost: 0.74314, ips: 312.00325 images/sec
[2022/06/20 12:03:48] ppcls INFO: [Eval][Epoch 10][Iter: 30/69]MultiLabelLoss: 0.13136, loss: 0.13136, HammingDistance: 0.05045, AccuracyScore: 0.94955, batch_cost: 0.80221s, reader_cost: 0.72489, ips: 319.11937 images/sec
[2022/06/20 12:03:56] ppcls INFO: [Eval][Epoch 10][Iter: 40/69]MultiLabelLoss: 0.12647, loss: 0.12647, HammingDistance: 0.05075, AccuracyScore: 0.94925, batch_cost: 0.79200s, reader_cost: 0.71474, ips: 323.23343 images/sec
[2022/06/20 12:04:04] ppcls INFO: [Eval][Epoch 10][Iter: 50/69]MultiLabelLoss: 0.11195, loss: 0.11195, HammingDistance: 0.05035, AccuracyScore: 0.94965, batch_cost: 0.78639s, reader_cost: 0.70924, ips: 325.53638 images/sec
[2022/06/20 12:04:11] ppcls INFO: [Eval][Epoch 10][Iter: 60/69]MultiLabelLoss: 0.11939, loss: 0.11939, HammingDistance: 0.05029, AccuracyScore: 0.94971, batch_cost: 0.78139s, reader_cost: 0.70429, ips: 327.62312 images/sec
[2022/06/20 12:04:17] ppcls INFO: [Eval][Epoch 10][Avg]MultiLabelLoss: 0.12980, loss: 0.12980, HammingDistance: 0.05005, AccuracyScore: 0.94995
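In this log, HammingDistance and AccuracyScore add up to 1 on every line (for example 0.05005 + 0.94995), which is consistent with both being computed element-wise over all image-label pairs. The sketch below shows that reading with NumPy. It is an assumption-based illustration, not the PaddleClas implementation: the 0.5 threshold, the function names, and the toy arrays are all hypothetical.

```python
import numpy as np


def hamming_distance(probs, targets, threshold=0.5):
    """Fraction of (sample, label) entries where prediction and target differ."""
    preds = (np.asarray(probs) >= threshold).astype("int64")
    return float((preds != np.asarray(targets)).mean())


def elementwise_accuracy(probs, targets, threshold=0.5):
    """Fraction of (sample, label) entries predicted correctly."""
    return 1.0 - hamming_distance(probs, targets, threshold)


# Two hypothetical samples with three labels each.
probs = [[0.9, 0.2, 0.7],
         [0.1, 0.6, 0.4]]
targets = [[1, 0, 0],
           [0, 1, 1]]
hd = hamming_distance(probs, targets)
acc = elementwise_accuracy(probs, targets)
print(hd, acc)
```

Because most of the 33 labels are 0 for any given image, an element-wise accuracy around 0.95 can still hide missed positive labels, so it is worth checking individual predictions as well.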
VisualDL charts
V. Evaluating the model
Final evaluation result: [Eval][Epoch 0][Avg]MultiLabelLoss: 0.16364, loss: 0.16364, HammingDistance: 0.05834, AccuracyScore: 0.94166
!python3 tools/eval.py \
    -c ./ppcls/configs/quick_start/professional/MobileNetV1_multilabel.yaml \
    -o Arch.pretrained="./output/MobileNetV1/best_model"
VI. Prediction
Running prediction on the sample image gives the following final result:
[{'class_ids': [6, 13, 23, 30], 'scores': [0.97452, 0.59816, 0.98675, 0.81546], 'file_name': './deploy/images/0517_2715693311.jpg', 'label_names': []}]
That is: clouds, lake, sky, water.
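The label_names field in this result is empty, and the configuration sets class_id_map_file to None, so the ids have to be mapped to names by hand. A minimal sketch, assuming the ids are zero-based positions in the label list from the dataset section; `LABELS` and `ids_to_names` are illustrative names, not part of PaddleClas.

```python
# The 33 labels in the order given in the dataset section.
LABELS = (
    "airport beach bridge buildings castle cityscape clouds frost "
    "garden glacier grass harbor house lake moon mountain nighttime "
    "ocean plants railroad rainbow reflection road sky snow street "
    "sunset temple town valley water waterfall window"
).split()


def ids_to_names(class_ids, labels=LABELS):
    """Translate zero-based class ids into label names."""
    return [labels[i] for i in class_ids]


# Prediction result copied from the article.
result = {
    "class_ids": [6, 13, 23, 30],
    "scores": [0.97452, 0.59816, 0.98675, 0.81546],
    "file_name": "./deploy/images/0517_2715693311.jpg",
    "label_names": [],
}
for name, score in zip(ids_to_names(result["class_ids"]), result["scores"]):
    print(f"{name}: {score:.3f}")
```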
!python3 tools/infer.py \
    -c ./ppcls/configs/quick_start/professional/MobileNetV1_multilabel.yaml \
    -o Arch.pretrained="./output/MobileNetV1/best_model"
VII. Prediction with the inference engine
1. Export the inference model
!python3 tools/export_model.py \
    -c ./ppcls/configs/quick_start/professional/MobileNetV1_multilabel.yaml \
    -o Arch.pretrained="./output/MobileNetV1/best_model"
By default the inference model is written to ./inference under the current directory.
%cd ~/PaddleClas
!ls ./inference -l
/home/aistudio/PaddleClas
total 13312
-rw-r--r-- 1 aistudio aistudio 13054335 Jun 20 14:22 inference.pdiparams
-rw-r--r-- 1 aistudio aistudio    12364 Jun 20 14:22 inference.pdiparams.info
-rw-r--r-- 1 aistudio aistudio   554665 Jun 20 14:22 inference.pdmodel
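2. Predict with the inference engine
- First, change into the deploy directory.
- Then run prediction through the inference engine.
The prediction configuration file is PaddleClas/deploy/configs/inference_cls_multilabel.yaml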
Global:
  infer_imgs: "./images/0517_2715693311.jpg"
  inference_model_dir: "../inference/"
  batch_size: 1
  use_gpu: True
  enable_mkldnn: False
  cpu_num_threads: 10
  enable_benchmark: True
  use_fp16: False
  ir_optim: True
  use_tensorrt: False
  gpu_mem: 8000
  enable_profile: False

PreProcess:
  transform_ops:
    - ResizeImage:
        resize_short: 256
    - CropImage:
        size: 224
    - NormalizeImage:
        scale: 0.00392157
        mean: [0.485, 0.456, 0.406]
        std: [0.229, 0.224, 0.225]
        order: ''
        channel_num: 3
    - ToCHWImage:

PostProcess:
  main_indicator: MultiLabelTopk
  MultiLabelTopk:
    topk: 5
    class_id_map_file: None
  SavePreLabel:
    save_dir: ./pre_label/
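The ResizeImage and CropImage steps in this configuration first scale the image so that its shorter side is 256 pixels and then cut out a 224 x 224 region. The sketch below works through that geometry in plain Python. It assumes the crop is taken from the center and that sizes are rounded to the nearest pixel; `resize_short_size` and `center_crop_box` are hypothetical helpers, and the 500 x 375 input size is made up for the example.

```python
def resize_short_size(width, height, resize_short=256):
    """Return (width, height) after scaling the shorter side to resize_short."""
    scale = resize_short / min(width, height)
    return round(width * scale), round(height * scale)


def center_crop_box(width, height, size=224):
    """Return (left, top, right, bottom) of a centered size x size crop."""
    left = (width - size) // 2
    top = (height - size) // 2
    return left, top, left + size, top + size


# A hypothetical 500 x 375 input image.
new_w, new_h = resize_short_size(500, 375)
box = center_crop_box(new_w, new_h)
print((new_w, new_h), box)
```

Resizing before cropping keeps most of the scene in view while still producing the fixed 224 x 224 input that the exported model expects.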
%cd ~/PaddleClas/deploy
!python3 python/predict_cls.py \
    -c ./configs/inference_cls_multilabel.yaml
/home/aistudio/PaddleClas/deploy
2022-06-20 15:07:11 INFO:
===========================================================
==        PaddleClas is powered by PaddlePaddle !        ==
===========================================================
==                                                       ==
==   For more info please go to the following website.   ==
==                                                       ==
==       https://github.com/PaddlePaddle/PaddleClas      ==
===========================================================

2022-06-20 15:07:11 INFO: Global :
2022-06-20 15:07:11 INFO:     batch_size : 1
2022-06-20 15:07:11 INFO:     cpu_num_threads : 10
2022-06-20 15:07:11 INFO:     enable_benchmark : True
2022-06-20 15:07:11 INFO:     enable_mkldnn : False
2022-06-20 15:07:11 INFO:     enable_profile : False
2022-06-20 15:07:11 INFO:     gpu_mem : 8000
2022-06-20 15:07:11 INFO:     infer_imgs : ./images/0517_2715693311.jpg
2022-06-20 15:07:11 INFO:     inference_model_dir : ../inference/
2022-06-20 15:07:11 INFO:     ir_optim : True
2022-06-20 15:07:11 INFO:     use_fp16 : False
2022-06-20 15:07:11 INFO:     use_gpu : True
2022-06-20 15:07:11 INFO:     use_tensorrt : False
2022-06-20 15:07:11 INFO: PostProcess :
2022-06-20 15:07:11 INFO:     MultiLabelTopk :
2022-06-20 15:07:11 INFO:         class_id_map_file : None
2022-06-20 15:07:11 INFO:         topk : 5
2022-06-20 15:07:11 INFO:     SavePreLabel :
2022-06-20 15:07:11 INFO:         save_dir : ./pre_label/
2022-06-20 15:07:11 INFO:     main_indicator : MultiLabelTopk
2022-06-20 15:07:11 INFO: PreProcess :
2022-06-20 15:07:11 INFO:     transform_ops :
2022-06-20 15:07:11 INFO:         ResizeImage :
2022-06-20 15:07:11 INFO:             resize_short : 256
2022-06-20 15:07:11 INFO:         CropImage :
2022-06-20 15:07:11 INFO:             size : 224
2022-06-20 15:07:11 INFO:         NormalizeImage :
2022-06-20 15:07:11 INFO:             channel_num : 3
2022-06-20 15:07:11 INFO:             mean : [0.485, 0.456, 0.406]
2022-06-20 15:07:11 INFO:             order :
2022-06-20 15:07:11 INFO:             scale : 0.00392157
2022-06-20 15:07:11 INFO:             std : [0.229, 0.224, 0.225]
2022-06-20 15:07:11 INFO:         ToCHWImage : None
0517_2715693311.jpg: class id(s): [23], score(s): [0.62], label_name(s): []
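VIII. Summary
Multi-label image classification comes up often in everyday applications, for example the weather and time-of-day classification competition held earlier this year (www.datafountain.cn/competition…). PaddlePaddle provides an end-to-end toolchain that covers the whole workflow from training to prediction, which greatly reduces the cost of training a model.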