Multi-Label Image Classification on NUS-WIDE-SCENE with PaddleClas

I. Multi-Label Image Classification on NUS-WIDE-SCENE with PaddleClas


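1. Overview


This project uses PaddleClas to walk through training, evaluation, and prediction for multi-label image classification.


2. Dataset


The dataset is a subset of NUS-WIDE-SCENE. Each image can carry several labels at once, drawn from 33 scene labels.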



The labels are:

airport beach bridge buildings castle cityscape clouds frost garden glacier grass harbor house lake moon mountain nighttime ocean plants railroad rainbow reflection road sky snow street sunset temple town valley water waterfall window


II. Installing PaddleClas


1. Downloading PaddleClas


Cloning from Gitee is faster, and --depth=1 fetches only the latest commit of the default branch.

!git clone https://gitee.com/paddlepaddle/PaddleClas.git --depth=1
Cloning into 'PaddleClas'...
remote: Enumerating objects: 2019, done.
remote: Counting objects: 100% (2019/2019), done.
remote: Compressing objects: 100% (1256/1256), done.
remote: Total 2019 (delta 1001), reused 1333 (delta 725), pack-reused 0
Receiving objects: 100% (2019/2019), 86.17 MiB | 7.50 MiB/s, done.
Resolving deltas: 100% (1001/1001), done.
Checking connectivity... done.


2. Installing PaddleClas


This step installs the required dependencies.

!pip install -r ~/PaddleClas/requirements.txt >log.log
!pip install -e ~/PaddleClas >log.log
WARNING: You are using pip version 22.0.4; however, version 22.1.2 is available.
You should consider upgrading via the '/opt/conda/envs/python35-paddle120-env/bin/python -m pip install --upgrade pip' command.
WARNING: You are using pip version 22.0.4; however, version 22.1.2 is available.
You should consider upgrading via the '/opt/conda/envs/python35-paddle120-env/bin/python -m pip install --upgrade pip' command.



III. Preparing the dataset


1. Downloading and extracting the data


Download the dataset archive and extract it.

%cd ~/PaddleClas
!mkdir dataset/NUS-WIDE-SCENE
%cd dataset/NUS-WIDE-SCENE
!wget https://paddle-imagenet-models-name.bj.bcebos.com/data/NUS-SCENE-dataset.tar
!tar -xf NUS-SCENE-dataset.tar
/home/aistudio/PaddleClas
/home/aistudio/PaddleClas/dataset/NUS-WIDE-SCENE
--2022-06-20 11:10:05--  https://paddle-imagenet-models-name.bj.bcebos.com/data/NUS-SCENE-dataset.tar
Resolving paddle-imagenet-models-name.bj.bcebos.com (paddle-imagenet-models-name.bj.bcebos.com)... 182.61.200.195, 182.61.200.229, 2409:8c04:1001:1002:0:ff:b001:368a
Connecting to paddle-imagenet-models-name.bj.bcebos.com (paddle-imagenet-models-name.bj.bcebos.com)|182.61.200.195|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 810639872 (773M) [application/x-tar]
Saving to: ‘NUS-SCENE-dataset.tar’
NUS-SCENE-dataset.t 100%[===================>] 773.09M  36.1MB/s    in 19s     
2022-06-20 11:10:24 (39.7 MB/s) - ‘NUS-SCENE-dataset.tar’ saved [810639872/810639872]


2. Inspecting the data


Each line holds the image file name followed by 33 comma-separated flags, one per label in the order listed in the Dataset section (airport through window). A 1 means the label applies; a 0 means it does not.

!head NUS-SCENE-dataset/multilabel_train_list.txt
0045_845243484.jpg  0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
0229_433478352.jpg  0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0
0322_2093820806.jpg 0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0
0463_2483322510.jpg 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0517_2283920455.jpg 0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,1,0,0,0,1,0,1,0,0,0,0,0,0,1,0,0
0006_2074187535.jpg 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0034_509197470.jpg  0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0064_2591840477.jpg 0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0
0208_465647043.jpg  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1
0211_2490834700.jpg 0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0
from PIL import Image
%cd ~
img=Image.open("PaddleClas/dataset/NUS-WIDE-SCENE/NUS-SCENE-dataset/images/0006_2074187535.jpg")
img.show()
/home/aistudio
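The annotation format described above can be read with a few lines of plain Python. This is a minimal sketch, not PaddleClas code: `parse_annotation` is a hypothetical helper, and the sample line is rebuilt to mirror the first row of the `head` output (flags set at positions 6 and 23).

```python
NUM_LABELS = 33  # number of flag columns in the annotation files


def parse_annotation(line):
    """Split one annotation line into (file name, indices of active labels)."""
    # Split on the first run of whitespace (tab or spaces); file names contain none.
    file_name, flags = line.strip().split(None, 1)
    values = [int(v) for v in flags.split(",")]
    if len(values) != NUM_LABELS:
        raise ValueError(f"expected {NUM_LABELS} flags, got {len(values)}")
    return file_name, [i for i, v in enumerate(values) if v == 1]


# Rebuild the first sample row: 33 flags with positions 6 and 23 set to 1.
flags = ["0"] * NUM_LABELS
flags[6] = flags[23] = "1"
sample = "0045_845243484.jpg\t" + ",".join(flags)

print(parse_annotation(sample))  # ('0045_845243484.jpg', [6, 23])
```

Indices are 0-based, so 6 and 23 correspond to clouds and sky in the label list.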


IV. Training the model


1. Training configuration


The configuration file is PaddleClas/ppcls/configs/quick_start/professional/MobileNetV1_multilabel.yaml

# global configs
Global:
  checkpoints: null
  pretrained_model: null
  output_dir: ./output/
  device: gpu
  save_interval: 1
  eval_during_train: True
  eval_interval: 1
  epochs: 10
  print_batch_step: 10
  use_visualdl: True
  # used for static mode and model export
  image_shape: [3, 224, 224]
  save_inference_dir: ./inference
  use_multilabel: True
# model architecture
Arch:
  name: MobileNetV1
  class_num: 33
  pretrained: True
# loss function config for training/eval process
Loss:
  Train:
    - MultiLabelLoss:
        weight: 1.0
  Eval:
    - MultiLabelLoss:
        weight: 1.0
Optimizer:
  name: Momentum
  momentum: 0.9
  lr:
    name: Cosine
    learning_rate: 0.1
  regularizer:
    name: 'L2'
    coeff: 0.00004
# data loader for train and eval
DataLoader:
  Train:
    dataset:
      name: MultiLabelDataset
      image_root: ./dataset/NUS-WIDE-SCENE/NUS-SCENE-dataset/images/
      cls_label_path: ./dataset/NUS-WIDE-SCENE/NUS-SCENE-dataset/multilabel_train_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - RandCropImage:
            size: 224
        - RandFlipImage:
            flip_code: 1
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
    sampler:
      name: DistributedBatchSampler
      batch_size: 256
      drop_last: False
      shuffle: True
    loader:
      num_workers: 0
      use_shared_memory: True
  Eval:
    dataset: 
      name: MultiLabelDataset
      image_root: ./dataset/NUS-WIDE-SCENE/NUS-SCENE-dataset/images/
      cls_label_path: ./dataset/NUS-WIDE-SCENE/NUS-SCENE-dataset/multilabel_test_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - ResizeImage:
            resize_short: 256
        - CropImage:
            size: 224
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
    sampler:
      name: DistributedBatchSampler
      batch_size: 256
      drop_last: False
      shuffle: False
    loader:
      num_workers: 0
      use_shared_memory: True
Infer:
  infer_imgs: ./deploy/images/0517_2715693311.jpg
  batch_size: 10
  transforms:
    - DecodeImage:
        to_rgb: True
        channel_first: False
    - ResizeImage:
        resize_short: 256
    - CropImage:
        size: 224
    - NormalizeImage:
        scale: 1.0/255.0
        mean: [0.485, 0.456, 0.406]
        std: [0.229, 0.224, 0.225]
        order: ''
    - ToCHWImage:
  PostProcess:
    name: MultiLabelTopk
    topk: 5
    class_id_map_file: None
Metric:
  Train:
    - HammingDistance:
    - AccuracyScore:
  Eval:
    - HammingDistance:
    - AccuracyScore:
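To make the NormalizeImage settings in the config concrete, here is a small NumPy sketch of the arithmetic they imply: scale pixels by 1/255, then subtract the per-channel mean and divide by the per-channel std. The function names are illustrative and this is not the PaddleClas implementation; it assumes an HWC uint8 RGB input. The transpose to CHW is done by ToCHWImage in the Infer pipeline and by `__getitem__` in the dataset class during training.

```python
import numpy as np


def normalize_image(img_hwc, scale="1.0/255.0",
                    mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
    """Compute (pixel * scale - mean) / std per channel on an HWC image."""
    # The config writes scale as the string "1.0/255.0"; parse "a/b" or a plain number.
    if isinstance(scale, str):
        parts = scale.split("/")
        scale = float(parts[0]) / float(parts[1]) if len(parts) == 2 else float(parts[0])
    mean = np.asarray(mean, dtype="float32").reshape(1, 1, 3)
    std = np.asarray(std, dtype="float32").reshape(1, 1, 3)
    return (img_hwc.astype("float32") * scale - mean) / std


def to_chw(img_hwc):
    """Reorder axes from height-width-channel to channel-height-width."""
    return img_hwc.transpose((2, 0, 1))


img = np.full((224, 224, 3), 255, dtype=np.uint8)  # synthetic all-white image
out = to_chw(normalize_image(img))
print(out.shape)  # (3, 224, 224)
```

The deploy configuration later in this article writes the same scale as 0.00392157, which is 1/255 rounded.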


2. Bug fix


The label_ratio handling in PaddleClas/ppcls/data/dataloader/multilabel_dataset.py has a bug; the patched file is shown below.

from __future__ import print_function
import numpy as np
import os
import cv2
from ppcls.data.preprocess import transform
from ppcls.utils import logger
from .common_dataset import CommonDataset
class MultiLabelDataset(CommonDataset):
    def _load_anno(self, label_ratio=False):
        assert os.path.exists(self._cls_path)
        assert os.path.exists(self._img_root)
        self.images = []
        self.labels = []
        with open(self._cls_path) as fd:
            lines = fd.readlines()
            for l in lines:
                l = l.strip().split("\t")
                self.images.append(os.path.join(self._img_root, l[0]))
                labels = l[1].split(',')
                labels = [np.int64(i) for i in labels]
                self.labels.append(labels)
                assert os.path.exists(self.images[-1])
        # label_ratio: store the flag on the instance, then branch on it
        self.label_ratio=label_ratio
        if label_ratio:
            return np.array(self.labels).mean(0).astype("float32")
    def __getitem__(self, idx):
        try:
            with open(self.images[idx], 'rb') as f:
                img = f.read()
            if self._transform_ops:
                img = transform(img, self._transform_ops)
            img = img.transpose((2, 0, 1))
            label = np.array(self.labels[idx]).astype("float32")
            # Keep the truthiness check: label_ratio defaults to False,
            # so testing `is not None` would always pass and break this branch.
            # if self.label_ratio is not None:
            if self.label_ratio:
                return (img, np.array([label, self.label_ratio]))
            else:
                return (img, label)
        except Exception as ex:
            logger.error("Exception occured when parse line: {} with msg: {}".
                         format(self.images[idx], ex))
            rnd_idx = np.random.randint(self.__len__())
            return self.__getitem__(rnd_idx)
!cp multilabel_dataset.py PaddleClas/ppcls/data/dataloader/multilabel_dataset.py -rf
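The reason the patch checks truthiness rather than `is not None` can be shown in isolation. The two helper functions below are hypothetical and exist only to contrast the two checks for the default value `False`.

```python
def branch_with_none_check(label_ratio):
    """Mimics `if self.label_ratio is not None:`."""
    return label_ratio is not None


def branch_with_truthiness(label_ratio):
    """Mimics `if self.label_ratio:` as used in the patched file."""
    return bool(label_ratio)


default = False  # default of _load_anno(label_ratio=False)
print(branch_with_none_check(default))   # True: the ratio branch is taken by mistake
print(branch_with_truthiness(default))   # False: the plain (img, label) branch is used
```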


3. Starting training


%cd ~/PaddleClas/
!python3  tools/train.py \
        -c ./ppcls/configs/quick_start/professional/MobileNetV1_multilabel.yaml

Training log

[2022/06/20 12:03:24] ppcls INFO: [Train][Epoch 10/10][Avg]HammingDistance: 0.05218, AccuracyScore: 0.94782, MultiLabelLoss: 0.13593, loss: 0.13593
[2022/06/20 12:03:24] ppcls INFO: [Eval][Epoch 10][Iter: 0/69]MultiLabelLoss: 0.10388, loss: 0.10388, HammingDistance: 0.03741, AccuracyScore: 0.96259, batch_cost: 0.89574s, reader_cost: 0.81882, ips: 285.79827 images/sec
[2022/06/20 12:03:32] ppcls INFO: [Eval][Epoch 10][Iter: 10/69]MultiLabelLoss: 0.13586, loss: 0.13586, HammingDistance: 0.05322, AccuracyScore: 0.94678, batch_cost: 0.79895s, reader_cost: 0.72159, ips: 320.42090 images/sec
[2022/06/20 12:03:41] ppcls INFO: [Eval][Epoch 10][Iter: 20/69]MultiLabelLoss: 0.12612, loss: 0.12612, HammingDistance: 0.05114, AccuracyScore: 0.94886, batch_cost: 0.82050s, reader_cost: 0.74314, ips: 312.00325 images/sec
[2022/06/20 12:03:48] ppcls INFO: [Eval][Epoch 10][Iter: 30/69]MultiLabelLoss: 0.13136, loss: 0.13136, HammingDistance: 0.05045, AccuracyScore: 0.94955, batch_cost: 0.80221s, reader_cost: 0.72489, ips: 319.11937 images/sec
[2022/06/20 12:03:56] ppcls INFO: [Eval][Epoch 10][Iter: 40/69]MultiLabelLoss: 0.12647, loss: 0.12647, HammingDistance: 0.05075, AccuracyScore: 0.94925, batch_cost: 0.79200s, reader_cost: 0.71474, ips: 323.23343 images/sec
[2022/06/20 12:04:04] ppcls INFO: [Eval][Epoch 10][Iter: 50/69]MultiLabelLoss: 0.11195, loss: 0.11195, HammingDistance: 0.05035, AccuracyScore: 0.94965, batch_cost: 0.78639s, reader_cost: 0.70924, ips: 325.53638 images/sec
[2022/06/20 12:04:11] ppcls INFO: [Eval][Epoch 10][Iter: 60/69]MultiLabelLoss: 0.11939, loss: 0.11939, HammingDistance: 0.05029, AccuracyScore: 0.94971, batch_cost: 0.78139s, reader_cost: 0.70429, ips: 327.62312 images/sec
[2022/06/20 12:04:17] ppcls INFO: [Eval][Epoch 10][Avg]MultiLabelLoss: 0.12980, loss: 0.12980, HammingDistance: 0.05005, AccuracyScore: 0.94995
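In every log line above, HammingDistance and AccuracyScore add up to 1, which is what one would expect if both are computed element-wise over all (image, label) pairs after thresholding the sigmoid outputs at 0.5. The NumPy sketch below illustrates that reading together with a sigmoid binary cross-entropy loss. It is an assumption about how the numbers relate, not the PaddleClas source, and the logits and targets are made up for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def bce_with_logits(logits, targets):
    """Mean sigmoid binary cross-entropy over all (sample, label) pairs."""
    # Numerically stable form: max(x, 0) - x * t + log(1 + exp(-|x|))
    loss = np.maximum(logits, 0) - logits * targets + np.log1p(np.exp(-np.abs(logits)))
    return float(loss.mean())


def hamming_distance(logits, targets, threshold=0.5):
    """Fraction of (sample, label) entries where the thresholded prediction is wrong."""
    preds = (sigmoid(logits) > threshold).astype(targets.dtype)
    return float((preds != targets).mean())


# Two made-up samples with four labels each.
logits = np.array([[2.0, -1.0, 0.5, -3.0],
                   [-2.0, 1.5, -0.5, 3.0]])
targets = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 1.0, 1.0, 1.0]])

loss_value = bce_with_logits(logits, targets)
hd = hamming_distance(logits, targets)
print(hd, 1.0 - hd)  # 0.25 0.75
```

With 33 labels and most of them absent in any given image, an element-wise accuracy near 0.95 is easier to reach than the number suggests, so it is worth reading alongside the loss.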

[Figure: VisualDL training charts]


V. Evaluating the model


Final evaluation result: [Eval][Epoch 0][Avg]MultiLabelLoss: 0.16364, loss: 0.16364, HammingDistance: 0.05834, AccuracyScore: 0.94166

!python3 tools/eval.py \
    -c ./ppcls/configs/quick_start/professional/MobileNetV1_multilabel.yaml \
    -o Arch.pretrained="./output/MobileNetV1/best_model"


VI. Model prediction


Running prediction on the image ./deploy/images/0517_2715693311.jpg gives:

[{'class_ids': [6, 13, 23, 30], 'scores': [0.97452, 0.59816, 0.98675, 0.81546], 'file_name': './deploy/images/0517_2715693311.jpg', 'label_names': []}]

That is: clouds, lake, sky, water
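The `label_names` field in the result is empty, presumably because `class_id_map_file` is set to None in the config, so the ids have to be mapped to names by hand. A short sketch, assuming the class ids are 0-based positions in the label list from the Dataset section (`names_for` is a hypothetical helper):

```python
# The 33 scene labels in the order given in the Dataset section.
LABELS = (
    "airport beach bridge buildings castle cityscape clouds frost "
    "garden glacier grass harbor house lake moon mountain nighttime ocean "
    "plants railroad rainbow reflection road sky snow street sunset temple "
    "town valley water waterfall window"
).split()


def names_for(class_ids):
    """Map 0-based class ids to label names."""
    return [LABELS[i] for i in class_ids]


result = {"class_ids": [6, 13, 23, 30],
          "scores": [0.97452, 0.59816, 0.98675, 0.81546]}
print(names_for(result["class_ids"]))  # ['clouds', 'lake', 'sky', 'water']
```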

!python3 tools/infer.py \
    -c ./ppcls/configs/quick_start/professional/MobileNetV1_multilabel.yaml \
    -o Arch.pretrained="./output/MobileNetV1/best_model"


VII. Prediction with the inference engine


1. Exporting the inference model


!python3 tools/export_model.py \
    -c ./ppcls/configs/quick_start/professional/MobileNetV1_multilabel.yaml \
    -o Arch.pretrained="./output/MobileNetV1/best_model"

By default the inference model is saved under ./inference in the current directory.

%cd ~/PaddleClas
!ls  ./inference -l
/home/aistudio/PaddleClas
total 13312
-rw-r--r-- 1 aistudio aistudio 13054335 Jun 20 14:22 inference.pdiparams
-rw-r--r-- 1 aistudio aistudio    12364 Jun 20 14:22 inference.pdiparams.info
-rw-r--r-- 1 aistudio aistudio   554665 Jun 20 14:22 inference.pdmodel
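Before moving on to the deploy step, it can be handy to confirm that the export produced the files shown in the listing above. The helper below is a hypothetical convenience check, not part of PaddleClas; it treats `inference.pdmodel` and `inference.pdiparams` as the required pair, which is an assumption based on that listing.

```python
from pathlib import Path

# File names taken from the directory listing above (assumed to be the required pair).
EXPECTED_FILES = ("inference.pdmodel", "inference.pdiparams")


def missing_inference_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    directory = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]


missing = missing_inference_files("./inference")
print("export looks complete" if not missing else f"missing: {missing}")
```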


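2. Prediction with the inference engine


  • First, change into the deploy directory
  • Then run prediction through the inference engine

Prediction configuration file: PaddleClas/deploy/configs/inference_cls_multilabel.yaml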

Global:
  infer_imgs: "./images/0517_2715693311.jpg"
  inference_model_dir: "../inference/"
  batch_size: 1
  use_gpu: True
  enable_mkldnn: False
  cpu_num_threads: 10
  enable_benchmark: True
  use_fp16: False
  ir_optim: True
  use_tensorrt: False
  gpu_mem: 8000
  enable_profile: False
PreProcess:
  transform_ops:
    - ResizeImage:
        resize_short: 256
    - CropImage:
        size: 224
    - NormalizeImage:
        scale: 0.00392157
        mean: [0.485, 0.456, 0.406]
        std: [0.229, 0.224, 0.225]
        order: ''
        channel_num: 3
    - ToCHWImage:
PostProcess:
  main_indicator: MultiLabelTopk
  MultiLabelTopk:
    topk: 5
    class_id_map_file: None
  SavePreLabel:
    save_dir: ./pre_label/
%cd ~/PaddleClas/deploy
!python3 python/predict_cls.py \
     -c ./configs/inference_cls_multilabel.yaml
/home/aistudio/PaddleClas/deploy
2022-06-20 15:07:11 INFO: 
===========================================================
==        PaddleClas is powered by PaddlePaddle !        ==
===========================================================
==                                                       ==
==   For more info please go to the following website.   ==
==                                                       ==
==       https://github.com/PaddlePaddle/PaddleClas      ==
===========================================================
2022-06-20 15:07:11 INFO: Global : 
2022-06-20 15:07:11 INFO:     batch_size : 1
2022-06-20 15:07:11 INFO:     cpu_num_threads : 10
2022-06-20 15:07:11 INFO:     enable_benchmark : True
2022-06-20 15:07:11 INFO:     enable_mkldnn : False
2022-06-20 15:07:11 INFO:     enable_profile : False
2022-06-20 15:07:11 INFO:     gpu_mem : 8000
2022-06-20 15:07:11 INFO:     infer_imgs : ./images/0517_2715693311.jpg
2022-06-20 15:07:11 INFO:     inference_model_dir : ../inference/
2022-06-20 15:07:11 INFO:     ir_optim : True
2022-06-20 15:07:11 INFO:     use_fp16 : False
2022-06-20 15:07:11 INFO:     use_gpu : True
2022-06-20 15:07:11 INFO:     use_tensorrt : False
2022-06-20 15:07:11 INFO: PostProcess : 
2022-06-20 15:07:11 INFO:     MultiLabelTopk : 
2022-06-20 15:07:11 INFO:         class_id_map_file : None
2022-06-20 15:07:11 INFO:         topk : 5
2022-06-20 15:07:11 INFO:     SavePreLabel : 
2022-06-20 15:07:11 INFO:         save_dir : ./pre_label/
2022-06-20 15:07:11 INFO:     main_indicator : MultiLabelTopk
2022-06-20 15:07:11 INFO: PreProcess : 
2022-06-20 15:07:11 INFO:     transform_ops : 
2022-06-20 15:07:11 INFO:         ResizeImage : 
2022-06-20 15:07:11 INFO:             resize_short : 256
2022-06-20 15:07:11 INFO:         CropImage : 
2022-06-20 15:07:11 INFO:             size : 224
2022-06-20 15:07:11 INFO:         NormalizeImage : 
2022-06-20 15:07:11 INFO:             channel_num : 3
2022-06-20 15:07:11 INFO:             mean : [0.485, 0.456, 0.406]
2022-06-20 15:07:11 INFO:             order : 
2022-06-20 15:07:11 INFO:             scale : 0.00392157
2022-06-20 15:07:11 INFO:             std : [0.229, 0.224, 0.225]
2022-06-20 15:07:11 INFO:         ToCHWImage : None
0517_2715693311.jpg:  class id(s): [23], score(s): [0.62], label_name(s): []
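If the result line needs to be consumed by another script, it can be parsed back into Python values. This is a sketch that assumes the output keeps exactly the shape of the last line printed above; `parse_prediction` is a hypothetical helper, not a PaddleClas function.

```python
import ast
import re

# Matches lines shaped like the final line of the predict_cls.py output above.
LINE_RE = re.compile(
    r"^(?P<file>\S+):\s+class id\(s\): (?P<ids>\[.*?\]), "
    r"score\(s\): (?P<scores>\[.*?\]), label_name\(s\): (?P<names>\[.*?\])$"
)


def parse_prediction(line):
    """Turn one result line into a dict of file name, class ids, scores, and label names."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    return {
        "file": match["file"],
        "class_ids": ast.literal_eval(match["ids"]),
        "scores": ast.literal_eval(match["scores"]),
        "label_names": ast.literal_eval(match["names"]),
    }


line = "0517_2715693311.jpg:  class id(s): [23], score(s): [0.62], label_name(s): []"
parsed = parse_prediction(line)
print(parsed["class_ids"], parsed["scores"])  # [23] [0.62]
```

Class id 23 is sky in the 0-based label list.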


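VIII. Summary


Multi-label image classification is common in everyday applications, for example the weather and time-of-day classification competition held earlier this year (www.datafountain.cn/competition…). PaddlePaddle provides end-to-end tooling that covers the whole workflow, which greatly reduces the cost of training.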

