[Vision Transformer] DETR: Principles and Code Walkthrough (Part 3)

Summary: [Vision Transformer] DETR: Principles and Code Walkthrough

Hands-on project practice and code walkthrough for DETR, the transformer-based end-to-end object detector:

Paddle DETR on GitHub: PaddleViT/object_detection/DETR at develop · BR-IDL/PaddleViT

Dataset and backbone model weights (Baidu Netdisk):

COCO dataset:

After extracting the archive, place it under the DETR/dataset/coco folder.
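
Before training, it can help to confirm that the extracted dataset sits where the scripts expect it. The sketch below is only an illustration: the expected entries assume the standard COCO 2017 layout (annotation JSON files plus train2017 and val2017 image folders), which this post does not spell out, and `missing_coco_entries` is a hypothetical helper, not part of PaddleViT.

```python
import os

# Assumption: standard COCO 2017 layout. Adjust if your copy differs.
EXPECTED_ENTRIES = (
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
    "train2017",
    "val2017",
)


def missing_coco_entries(root, expected=EXPECTED_ENTRIES):
    """Return the expected files/folders that do not exist under `root`."""
    return [rel for rel in expected
            if not os.path.exists(os.path.join(root, rel))]


# Prints an empty list when everything is in place.
print(missing_coco_entries("./dataset/coco"))
```

An empty list means every assumed entry was found; anything listed is a path to double-check before launching training.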


Resnet50_model_weights:

https://pan.baidu.com/share/init?surl=GX_znZUgKe4f6Fjhga3TnQ&pwd=uamk

Resnet101_model_weights:

https://pan.baidu.com/s/1kfimY0-ebmgs42phlhwUWA?pwd=2veg

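Place the weights under the DETR/configs folder (the evaluation commands below load them from ./configs/detr_resnet50).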


Environment setup:

  • Python>=3.6
  • yaml>=0.2.5
  • PaddlePaddle>=2.1.0
  • yacs>=0.1.8
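
The minimum versions above can be checked with a small comparison helper. This is a minimal sketch: `version_tuple` and `meets_minimum` are hypothetical names, and the parsing only looks at the leading integers of each dotted component, so it is not a full version-specifier parser.

```python
import re
import sys


def version_tuple(text):
    """Turn '2.1.0' or '2.2.0.post1' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component such as 'post1'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True when `installed` is at least `minimum` (missing parts count as 0)."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


print("Python >= 3.6:", sys.version_info >= (3, 6))
print("2.0.2 satisfies >=2.1.0:", meets_minimum("2.0.2", "2.1.0"))
print("2.2.0 satisfies >=2.1.0:", meets_minimum("2.2.0", "2.1.0"))
```

With PaddlePaddle installed, the same helper could be called as `meets_minimum(paddle.__version__, "2.1.0")`.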

Training and evaluation on the COCO dataset:

Single-GPU training:

sh run_train.sh

or:

CUDA_VISIBLE_DEVICES=2 python main_single_gpu.py -cfg='./configs/detr_resnet50.yaml' -dataset='coco' -batch_size=2 -data_path='./dataset/coco'

If you hit the following error:

wsx@hello:~/0A_DATA/DETR$ python main_single_gpu.py -cfg='./configs/detr_resnet50.yaml' -dataset='coco' -batch_size=2 -data_path='./dataset/coco'
Traceback (most recent call last):
  File "main_single_gpu.py", line 412, in <module>
    main()
  File "main_single_gpu.py", line 234, in main
    config = update_config(config, arguments)
  File "/home/wsx/0A_DATA/DETR/config.py", line 123, in update_config
    _update_config_from_file(config, args.cfg)
  File "/home/wsx/0A_DATA/DETR/config.py", line 104, in _update_config_from_file
    yaml_cfg = yaml.load(infile, Loader=yaml.FullLoader)
AttributeError: module 'yaml' has no attribute 'FullLoader'

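Solution: yaml.FullLoader only exists in PyYAML 5.1 and later, so reinstall PyYAML to pick up a newer version: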

python -m pip install --ignore-installed PyYAML
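
If upgrading PyYAML is not an option, another route is to fall back to a loader that older versions do provide. The sketch below is an assumption-laden illustration, not what the repository does: `pick_yaml_loader` is a hypothetical helper, and the two `SimpleNamespace` objects merely stand in for an old and a new `yaml` module so the example runs without PyYAML installed.

```python
import types


def pick_yaml_loader(yaml_module):
    """Prefer FullLoader (PyYAML >= 5.1); fall back to SafeLoader otherwise."""
    loader = getattr(yaml_module, "FullLoader", None)
    return loader if loader is not None else yaml_module.SafeLoader


# Stand-ins for an old and a new PyYAML module (illustration only).
old_yaml = types.SimpleNamespace(SafeLoader="SafeLoader")
new_yaml = types.SimpleNamespace(SafeLoader="SafeLoader", FullLoader="FullLoader")

print(pick_yaml_loader(old_yaml))
print(pick_yaml_loader(new_yaml))
```

In config.py the call would then read `yaml.load(infile, Loader=pick_yaml_loader(yaml))`. This assumes the config file uses only plain YAML types, which SafeLoader can parse.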

Multi-GPU training:

sh run_train_multi.sh

or:

CUDA_VISIBLE_DEVICES=0,1,2,3 \
python main_multi_gpu.py \
    -cfg=./configs/detr_resnet50.yaml \
    -dataset=coco \
    -batch_size=2 \
    -data_path='./dataset/coco/'

Single-GPU evaluation:

sh run_eval.sh

or:

CUDA_VISIBLE_DEVICES=0 \
python main_single_gpu.py \
-cfg='./configs/detr_resnet50.yaml' \
-dataset='coco' \
-batch_size=2 \
-data_path='./dataset/coco' \
-eval \
-pretrained=./configs/detr_resnet50

Multi-GPU evaluation:

sh run_eval_multi.sh

or:

CUDA_VISIBLE_DEVICES=0,1,2,3 \
python main_multi_gpu.py \
    -cfg=./configs/detr_resnet50.yaml \
    -dataset=coco \
    -batch_size=4 \
    -data_path=/path/to/dataset/coco/val \
    -eval \
    -pretrained=/path/to/pretrained/model/detr_resnet50  # .pdparams is NOT needed
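
The four invocations above differ only in the script name, the GPU list, and the evaluation flags, so they can be assembled programmatically. The following is a rough sketch under stated assumptions: `build_detr_command` and `gpu_env` are hypothetical helpers, only the flags shown in this post are used, and stripping a trailing `.pdparams` is a convenience based on the note that the suffix is not needed.

```python
import os


def build_detr_command(script, cfg, data_path, batch_size, dataset="coco",
                       eval_mode=False, pretrained=None):
    """Assemble the argument list for main_single_gpu.py or main_multi_gpu.py."""
    cmd = ["python", script,
           "-cfg=" + cfg,
           "-dataset=" + dataset,
           "-batch_size=" + str(batch_size),
           "-data_path=" + data_path]
    if eval_mode:
        cmd.append("-eval")
    if pretrained is not None:
        suffix = ".pdparams"
        if pretrained.endswith(suffix):
            pretrained = pretrained[:-len(suffix)]  # the suffix is not needed
        cmd.append("-pretrained=" + pretrained)
    return cmd


def gpu_env(gpu_ids):
    """Copy the current environment and restrict the visible GPUs."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


cmd = build_detr_command("main_single_gpu.py", "./configs/detr_resnet50.yaml",
                         "./dataset/coco", 2, eval_mode=True,
                         pretrained="./configs/detr_resnet50.pdparams")
print(" ".join(cmd))
```

The resulting list can be launched with `subprocess.run(cmd, env=gpu_env([0]), check=True)` from the DETR directory.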

Training log:

(wireframe) wsx@hello:~/0A_DATA/DETR$ python main_single_gpu.py -cfg='./configs/detr_resnet50.yaml' -dataset='coco' -batch_size=2 -data_path='./dataset/coco'
merging config from ./configs/detr_resnet50.yaml
0413 07:14:33 PM 
BASE: ['']
DATA:
  BATCH_SIZE: 2
  BATCH_SIZE_EVAL: 8
  DATASET: coco
  DATA_PATH: ./dataset/coco
  IMAGENET_MEAN: [0.485, 0.456, 0.406]
  IMAGENET_STD: [0.229, 0.224, 0.225]
  NUM_WORKERS: 2
EVAL: False
LOCAL_RANK: 0
MODEL:
  ATTENTION_DROPOUT: 0.0
  BACKBONE: resnet50
  BACKBONE_LR: 0.1
  DROPOUT: 0.1
  NAME: DETR
  NUM_CLASSES: 91
  NUM_QUERIES: 100
  PRETRAINED: None
  RESUME: None
  TRANS:
    EMBED_DIM: 256
    MLP_RATIO: 8.0
    NUM_DECODERS: 6
    NUM_ENCODERS: 6
    NUM_HEADS: 8
    PRE_NORM: False
    RETURN_INTERMEDIATE_DEC: True
  TYPE: DETR
NGPUS: -1
REPORT_FREQ: 10
SAVE: ./output/train-20220413-19-14-33
SAVE_FREQ: 1
SEED: 0
TAG: default
TRAIN:
  ACCUM_ITER: 1
  BASE_LR: 0.0001
  END_LR: 1e-05
  GRAD_CLIP: 0.1
  LAST_EPOCH: 0
  LR_SCHEDULER:
    DECAY_EPOCHS: 200
    DECAY_RATE: 0.1
    MILESTONES: 30, 60, 90
    NAME: step
  NUM_EPOCHS: 300
  OPTIMIZER:
    BETAS: (0.9, 0.999)
    EPS: 1e-08
    MOMENTUM: 0.9
    NAME: AdamW
  WARMUP_EPOCHS: 3
  WARMUP_START_LR: 1e-06
  WEIGHT_DECAY: 0.0001
VALIDATE_FREQ: 20
W0413 19:14:33.715436 27246 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 10.1, Runtime API Version: 10.1
W0413 19:14:33.717669 27246 device_context.cc:465] device: 0, cuDNN Version: 7.6.
0413 07:14:36 PM unique_endpoints {''}
0413 07:14:36 PM Downloading resnet50.pdparams from https://paddle-hapi.bj.bcebos.com/models/resnet50.pdparams
  0%|▌         | 595/151272 [00:01<03:45, 669.42it/s]
  0%|▋         | 674/151272 [00:01<05:37, 446.26it/s]

Evaluation process:

Error 1:

TypeError: 'numpy.float64' object cannot be interpreted as an integer
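Solution: newer NumPy releases reject the float count that pycocotools' cocoeval.py passes to np.linspace, so pin an older NumPy: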
python -m pip install numpy==1.15.0
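
An alternative to downgrading NumPy is to make the sample count an integer in the two affected cocoeval.py lines, for example `int(np.round((0.95 - .5) / .05)) + 1`. The pure-Python sketch below only illustrates that arithmetic: `threshold_count` and `linspace` are hypothetical stand-ins written here so the example runs without NumPy.

```python
def threshold_count(start, stop, step):
    """Number of evenly spaced thresholds from start to stop inclusive, as an int."""
    return int(round((stop - start) / step)) + 1


def linspace(start, stop, num):
    """Minimal stand-in for np.linspace(start, stop, num, endpoint=True)."""
    if num == 1:
        return [start]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


iou_count = threshold_count(0.5, 0.95, 0.05)   # IoU thresholds 0.50 ... 0.95
rec_count = threshold_count(0.0, 1.00, 0.01)   # recall thresholds 0.00 ... 1.00
print(iou_count, rec_count)
print(linspace(0.5, 0.95, iou_count))
```

The point is that the count handed to `linspace` is an `int`, which is what the TypeError complains about when a `numpy.float64` is passed instead.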

Error 2:

AssertionError: We only support 'to_tensor()' in dynamic graph mode, please call 'paddle.disable_static()' to enter dynamic graph mode.

Attempted fix 1 (this does not work):

Add the following line after import paddle:
paddle.enable_static()

This triggers a different error:

TypeError: The registered buffer should be a core.VarBase, but received Variable.

Go back one step: remove paddle.enable_static(), run sh run_eval.sh again, and read the full error output:

loading coco data, 48 imgs without annos are removed
0413 10:50:27 PM ----- Pretrained: Load model state from ./configs/detr_resnet50
0413 10:50:27 PM ----- Start Validating
/home/wsx/anaconda3/lib/python3.6/site-packages/pycocotools/cocoeval.py:507: DeprecationWarning: object of type <class 'numpy.float64'> cannot be safely interpreted as an integer.
  self.iouThrs = np.linspace(.5, 0.95, np.round((0.95 - .5) / .05) + 1, endpoint=True)
/home/wsx/anaconda3/lib/python3.6/site-packages/pycocotools/cocoeval.py:508: DeprecationWarning: object of type <class 'numpy.float64'> cannot be safely interpreted as an integer.
  self.recThrs = np.linspace(.0, 1.00, np.round((1.00 - .0) / .01) + 1, endpoint=True)
/home/wsx/anaconda3/lib/python3.6/site-packages/paddle/fluid/dygraph/math_op_patch.py:253: UserWarning: The dtype of left and right variables are not the same, left dtype is paddle.float32, but right dtype is paddle.bool, the right dtype will convert to paddle.float32
  format(lhs_dtype, rhs_dtype, lhs_dtype))
Traceback (most recent call last):
  File "main_single_gpu.py", line 413, in <module>
    main()
  File "main_single_gpu.py", line 355, in main
    debug_steps=config.REPORT_FREQ)
  File "main_single_gpu.py", line 219, in validate
    coco_evaluator.update(res)
  File "/home/wsx/0A_DATA/DETR/coco_eval.py", line 34, in update
    coco_dt = COCO.loadRes(self.coco_gt, results) if results else COCO()
  File "/home/wsx/anaconda3/lib/python3.6/site-packages/pycocotools/coco.py", line 308, in loadRes
    if type(resFile) == str or type(resFile) == unicode:
NameError: name 'unicode' is not defined
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/home/wsx/anaconda3/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/wsx/anaconda3/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/wsx/anaconda3/lib/python3.6/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 213, in _thread_loop
    self._thread_done_event)
  File "/home/wsx/anaconda3/lib/python3.6/site-packages/paddle/fluid/dataloader/fetcher.py", line 121, in fetch
    data.append(self.dataset[idx])
  File "/home/wsx/0A_DATA/DETR/coco.py", line 92, in __getitem__
    image, target = self._transforms(image, target)
  File "/home/wsx/0A_DATA/DETR/transforms.py", line 349, in __call__
    image, target = t(image, target)
  File "/home/wsx/0A_DATA/DETR/transforms.py", line 349, in __call__
    image, target = t(image, target)
  File "/home/wsx/0A_DATA/DETR/transforms.py", line 304, in __call__
    return T.to_tensor(image), target
  File "/home/wsx/anaconda3/lib/python3.6/site-packages/paddle/vision/transforms/functional.py", line 82, in to_tensor
    return F_pil.to_tensor(pic, data_format)
  File "/home/wsx/anaconda3/lib/python3.6/site-packages/paddle/vision/transforms/functional_pil.py", line 77, in to_tensor
    img = paddle.to_tensor(np.array(pic, copy=False))
  File "<decorator-gen-134>", line 2, in to_tensor
  File "/home/wsx/anaconda3/lib/python3.6/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
    return wrapped_func(*args, **kwargs)
  File "/home/wsx/anaconda3/lib/python3.6/site-packages/paddle/fluid/framework.py", line 228, in __impl__
    ), "We only support '%s()' in dynamic graph mode, please call 'paddle.disable_static()' to enter dynamic graph mode." % func.__name__
AssertionError: We only support 'to_tensor()' in dynamic graph mode, please call 'paddle.disable_static()' to enter dynamic graph mode.

To fix

NameError: name 'unicode' is not defined

change this line in the file below

File "/home/wsx/anaconda3/lib/python3.6/site-packages/pycocotools/coco.py", line 308, in loadRes
    if type(resFile) == str or type(resFile) == unicode:

to:

if type(resFile) == str:

This is needed because Python 2's unicode type does not exist in Python 3, so the reference raises a NameError.
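
Rather than editing the comparison by hand, the same check can be written so that it works on both Python 2 and Python 3. This is a small sketch, assuming the only purpose of the test is to tell a filename apart from an in-memory list of results; `string_types` and `is_result_path` are hypothetical names, not pycocotools API.

```python
try:
    string_types = (str, unicode)  # Python 2: both byte and unicode strings
except NameError:
    string_types = (str,)          # Python 3: `unicode` is gone, str covers it


def is_result_path(res_file):
    """True when `res_file` is a filename rather than a list of result dicts."""
    return isinstance(res_file, string_types)


print(is_result_path("results.json"))
print(is_result_path([{"image_id": 1, "score": 0.9}]))
```

Using `isinstance` instead of `type(...) ==` also accepts subclasses of `str`, which is the more idiomatic check.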

Running sh run_eval.sh again gives the following output:

(wireframe) wsx@hello:~/0A_DATA/DETR$ sh run_eval.sh
merging config from ./configs/detr_resnet50.yaml
0413 10:58:49 PM 
BASE: ['']
DATA:
  BATCH_SIZE: 2
  BATCH_SIZE_EVAL: 2
  DATASET: coco
  DATA_PATH: ./dataset/coco
  IMAGENET_MEAN: [0.485, 0.456, 0.406]
  IMAGENET_STD: [0.229, 0.224, 0.225]
  NUM_WORKERS: 2
EVAL: True
LOCAL_RANK: 0
MODEL:
  ATTENTION_DROPOUT: 0.0
  BACKBONE: resnet50
  BACKBONE_LR: 0.1
  DROPOUT: 0.1
  NAME: DETR
  NUM_CLASSES: 91
  NUM_QUERIES: 100
  PRETRAINED: ./configs/detr_resnet50
  RESUME: None
  TRANS:
    EMBED_DIM: 256
    MLP_RATIO: 8.0
    NUM_DECODERS: 6
    NUM_ENCODERS: 6
    NUM_HEADS: 8
    PRE_NORM: False
    RETURN_INTERMEDIATE_DEC: True
  TYPE: DETR
NGPUS: -1
REPORT_FREQ: 10
SAVE: ./output/eval-20220413-22-58-49
SAVE_FREQ: 1
SEED: 0
TAG: default
TRAIN:
  ACCUM_ITER: 1
  BASE_LR: 0.0001
  END_LR: 1e-05
  GRAD_CLIP: 0.1
  LAST_EPOCH: 0
  LR_SCHEDULER:
    DECAY_EPOCHS: 200
    DECAY_RATE: 0.1
    MILESTONES: 30, 60, 90
    NAME: step
  NUM_EPOCHS: 300
  OPTIMIZER:
    BETAS: (0.9, 0.999)
    EPS: 1e-08
    MOMENTUM: 0.9
    NAME: AdamW
  WARMUP_EPOCHS: 3
  WARMUP_START_LR: 1e-06
  WEIGHT_DECAY: 0.0001
VALIDATE_FREQ: 20
W0413 22:58:49.250936 26141 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 10.1, Runtime API Version: 10.1
W0413 22:58:49.253305 26141 device_context.cc:465] device: 0, cuDNN Version: 7.6.
0413 10:58:51 PM unique_endpoints {''}
0413 10:58:51 PM File /home/wsx/.cache/paddle/hapi/weights/resnet50.pdparams md5 checking...
0413 10:58:52 PM Found /home/wsx/.cache/paddle/hapi/weights/resnet50.pdparams
loading annotations into memory...
Done (t=0.34s)
creating index...
index created!
loading coco data, 48 imgs without annos are removed
0413 10:58:53 PM ----- Pretrained: Load model state from ./configs/detr_resnet50
0413 10:58:53 PM ----- Start Validating
/home/wsx/anaconda3/lib/python3.6/site-packages/pycocotools/cocoeval.py:507: DeprecationWarning: object of type <class 'numpy.float64'> cannot be safely interpreted as an integer.
  self.iouThrs = np.linspace(.5, 0.95, np.round((0.95 - .5) / .05) + 1, endpoint=True)
/home/wsx/anaconda3/lib/python3.6/site-packages/pycocotools/cocoeval.py:508: DeprecationWarning: object of type <class 'numpy.float64'> cannot be safely interpreted as an integer.
  self.recThrs = np.linspace(.0, 1.00, np.round((1.00 - .0) / .01) + 1, endpoint=True)
/home/wsx/anaconda3/lib/python3.6/site-packages/paddle/fluid/dygraph/math_op_patch.py:253: UserWarning: The dtype of left and right variables are not the same, left dtype is paddle.float32, but right dtype is paddle.bool, the right dtype will convert to paddle.float32
  format(lhs_dtype, rhs_dtype, lhs_dtype))

Validating on a single GPU (RTX 2060) takes a very long time, so I stopped waiting and will rerun the evaluation on a server later.

