[DSW Gallery] EasyCV: Keypoint-Based Video Classification Example

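Summary: EasyCV is an all-in-one, PyTorch-based toolkit for visual algorithm modeling, built around self-supervised learning and Transformer techniques. It includes SOTA algorithms for vision tasks such as image classification, metric learning, object detection, and pose recognition. Using keypoint-based video classification as the example, this article shows how to use EasyCV in PAI-DSW.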

Use directly

Open the "EasyCV: Keypoint-Based Video Classification Example" and click "Open in DSW" in the upper-right corner.

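EasyCV keypoint-based video classification: STGCN

  Human skeleton keypoints are essential for describing human pose and predicting human behavior. Human skeleton keypoint detection therefore underlies many computer vision tasks, such as action classification, abnormal behavior detection, and autonomous driving. In recent years, as deep learning has advanced, the accuracy of human skeleton keypoint detection has improved steadily, and the technique is now widely used across computer vision. Typical applications include intelligent video surveillance, patient monitoring systems, human-computer interaction, virtual reality, human animation, smart homes, intelligent security, and training assistance for athletes.

  This article presents an action classification solution based on skeleton keypoints and walks through rapid end-to-end development with EasyCV on PAI-DSW.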

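Runtime environment requirements

PAI-PyTorch 1.7/1.8 image; GPU instance types: P100, V100, A100, etc.

Install dependencies

Note: The PAI-DSW docker image already includes the required dependencies, so you can skip this step. In a local notebook environment, run steps 1 and 2 to set up the environment.

1. Get the torch and CUDA versions, adjust the mmcv install command to match them, and install the corresponding mmcv build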

import torch
import os
os.environ['CUDA']='cu' + torch.version.cuda.replace('.', '')
os.environ['Torch']='torch'+torch.version.__version__.replace('+PAI', '').split('+')[0]
!echo $CUDA
!echo $Torch
# install some python deps
! pip install --upgrade tqdm
! pip install mmcv-full==1.6.0 -f https://download.openmmlab.com/mmcv/dist/${CUDA}/${Torch}/index.html
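
The version strings built in the cell above decide which prebuilt mmcv-full wheel index pip searches. As a minimal sketch, the same string handling can be written as a pure function and checked without importing torch. The function name `mmcv_index_url` is hypothetical, and the URL pattern is simply the one used in the install command above.

```python
def mmcv_index_url(cuda_version: str, torch_version: str) -> str:
    """Build the mmcv-full wheel index URL the same way the cell above does.

    cuda_version  -- the value of torch.version.cuda, such as '10.1'
    torch_version -- the torch version string, such as '1.8.2+PAI'
    """
    cuda_tag = 'cu' + cuda_version.replace('.', '')
    # Drop local build suffixes such as '+PAI' or '+cu101'
    torch_tag = 'torch' + torch_version.replace('+PAI', '').split('+')[0]
    return ('https://download.openmmlab.com/mmcv/dist/'
            f'{cuda_tag}/{torch_tag}/index.html')


print(mmcv_index_url('10.1', '1.8.2+PAI'))
```

Printing the URL before running pip makes it easy to check in a browser that an index page exists for that CUDA and torch combination.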

2. Install the EasyCV package

!pip install "pai-easycv>=0.10.0" -i https://pypi.tuna.tsinghua.edu.cn/simple
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

Quick experience

Run the code below to try the model end to end.

First, copy the code from the skeleton_based_demo.py script and save it locally as skeleton_based_demo.py

!sudo apt-get update
!sudo apt-get install libgeos-dev
!pip install moviepy
!python skeleton_based_demo.py
Hit:1 http://mirrors.aliyun.com/ubuntu bionic InRelease
Hit:2 http://mirrors.aliyun.com/ubuntu bionic-security InRelease              
Hit:3 http://mirrors.aliyun.com/ubuntu bionic-updates InRelease               
Hit:4 http://mirrors.aliyun.com/ubuntu bionic-proposed InRelease
Hit:5 http://mirrors.aliyun.com/ubuntu bionic-backports InRelease
Reading package lists... Done
Reading package lists... Done
Building dependency tree       
Reading state information... Done
libgeos-dev is already the newest version (3.6.2-1build2).
0 upgraded, 0 newly installed, 0 to remove and 144 not upgraded.
Looking in indexes: https://mirrors.aliyun.com/pypi/simple/
Requirement already satisfied: moviepy in /home/pai/lib/python3.6/site-packages (1.0.3)
Requirement already satisfied: decorator<5.0,>=4.0.2 in /home/pai/lib/python3.6/site-packages (from moviepy) (4.4.2)
Requirement already satisfied: tqdm<5.0,>=4.11.2 in /home/pai/lib/python3.6/site-packages (from moviepy) (4.64.1)
Requirement already satisfied: proglog<=1.0.0 in /home/pai/lib/python3.6/site-packages (from moviepy) (0.1.10)
Requirement already satisfied: numpy>=1.17.3 in /home/pai/lib/python3.6/site-packages (from moviepy) (1.19.5)
Requirement already satisfied: requests<3.0,>=2.8.1 in /home/pai/lib/python3.6/site-packages (from moviepy) (2.27.1)
Requirement already satisfied: imageio<3.0,>=2.5 in /home/pai/lib/python3.6/site-packages (from moviepy) (2.9.0)
Requirement already satisfied: imageio-ffmpeg>=0.2.0 in /home/pai/lib/python3.6/site-packages (from moviepy) (0.4.8)
Requirement already satisfied: pillow in /home/pai/lib/python3.6/site-packages (from imageio<3.0,>=2.5->moviepy) (8.3.2)
Requirement already satisfied: idna<4,>=2.5 in /home/pai/lib/python3.6/site-packages (from requests<3.0,>=2.8.1->moviepy) (3.3)
Requirement already satisfied: charset-normalizer~=2.0.0 in /home/pai/lib/python3.6/site-packages (from requests<3.0,>=2.8.1->moviepy) (2.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /home/pai/lib/python3.6/site-packages (from requests<3.0,>=2.8.1->moviepy) (2021.5.30)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/pai/lib/python3.6/site-packages (from requests<3.0,>=2.8.1->moviepy) (1.26.8)
Requirement already satisfied: importlib-resources in /home/pai/lib/python3.6/site-packages (from tqdm<5.0,>=4.11.2->moviepy) (5.4.0)
Requirement already satisfied: zipp>=3.1.0 in /home/pai/lib/python3.6/site-packages (from importlib-resources->tqdm<5.0,>=4.11.2->moviepy) (3.6.0)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
[2023-03-06 14:06:35,521.521 dsw-229591-7c4869bc45-tvdsm:4953 INFO utils.py:30] NOTICE: PAIDEBUGGER is turned off.
Download video file from remote to local path "{cache_video_path}"...
100%|██████████████████████████████████████| 1.07M/1.07M [00:00<00:00, 5.08MB/s]
100%|████████████████████████████████████████| 243M/243M [00:22<00:00, 11.5MB/s]
load checkpoint from local path: /root/.cache/easycv/pose_hrnet_epoch_210_export.pt
100%|██████████████████████████████████████| 11.9M/11.9M [00:00<00:00, 16.4MB/s]
load checkpoint from local path: /root/.cache/easycv/stgcn_80e_ntu60_xsub.pth
reparam: 0
100%|██████████████████████████████████████| 34.5M/34.5M [00:02<00:00, 13.6MB/s]
load checkpoint from local path: /root/.cache/easycv/epoch_300.pt
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 1/1, 3.4 task/s, elapsed: 0s, ETA:     0s
action label: hugging other person
Moviepy - Building video ./tmp/demo_show.mp4.
Moviepy - Writing video ./tmp/demo_show.mp4
Moviepy - Done !                                                                
Moviepy - video ready ./tmp/demo_show.mp4
Write video to ./tmp/demo_show.mp4 successfully!

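Visualize the video result:

import cv2
from IPython.display import clear_output, Image, display
video_path = 'tmp/demo_show.mp4'
video = cv2.VideoCapture(video_path)
try:
    while True:
        # Read the next frame; stop when the video ends
        ret, frame = video.read()
        if not ret:
            break
        # Encode the frame as JPEG and show it in place of the previous one
        _, encoded = cv2.imencode('.jpg', frame)
        clear_output(wait=True)
        display(Image(data=encoded.tobytes()))
except KeyboardInterrupt:
    pass
finally:
    # Release the capture whether playback finished or was interrupted
    video.release()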

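Development workflow

Detection model development

We use a prepared model for the demonstration. To retrain the detection model, see this example: https://pai.console.aliyun.com/?regionId=cn-hangzhou#/dsw-gallery/preview/deepLearning/cv/easycv_detection_YOLOX

Download the model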

!mkdir pretrained_models
!wget http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/detection/yolox/yolox_s_bs16_lr002/epoch_300.pt -O pretrained_models/yolox_s.pt
Will not apply HSTS. The HSTS database must be a regular and non-world-writable file.
ERROR: could not open HSTS store at '/root/.wget-hsts'. HSTS will be disabled.
--2023-03-06 14:09:56--  http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/detection/yolox/yolox_s_bs16_lr002/epoch_300.pt
Resolving pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com... 39.98.20.13
Connecting to pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com|39.98.20.13|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 36133977 (34M) [application/octet-stream]
Saving to: ‘pretrained_models/yolox_s.pt’
pretrained_models/y 100%[===================>]  34.46M  13.2MB/s    in 2.6s    
2023-03-06 14:09:58 (13.2 MB/s) - ‘pretrained_models/yolox_s.pt’ saved [36133977/36133977]

Model inference & visualization (optional)

Get the inference result:

from easycv.predictors import YoloXPredictor
det_predictor = YoloXPredictor(model_path='pretrained_models/yolox_s.pt', score_thresh=0.9)
img = 'http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/demos/images/two_person.jpg'
results = det_predictor(img)[0]
print(f'Detection results: {results}')
[2023-03-06 14:11:09,474.474 dsw-229591-7c4869bc45-tvdsm:4613 INFO utils.py:30] NOTICE: PAIDEBUGGER is turned off.
reparam: 0
load checkpoint from local path: pretrained_models/yolox_s.pt
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 1/1, 0.7 task/s, elapsed: 1s, ETA:     0s
Detection results: {'detection_boxes': array([[393.29456 , 106.736565, 514.5323  , 381.98608 ],
       [300.8314  , 125.51828 , 398.19537 , 411.52783 ]], dtype=float32), 'detection_scores': array([0.9170087, 0.9002314], dtype=float32), 'detection_classes': array([0, 0], dtype=int32), 'img_metas': {'filename': 'http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/demos/images/two_person.jpg', 'ori_img_shape': (480, 853, 3), 'img_shape': (360, 640, 3), 'scale_factor': array([0.7502931, 0.75     , 0.7502931, 0.75     ], dtype=float32), 'pad': 0.0, 'img_norm_cfg': {'mean': array([123.675, 116.28 , 103.53 ], dtype=float32), 'std': array([58.395, 57.12 , 57.375], dtype=float32), 'to_rgb': True}}, 'detection_class_names': ['person', 'person'], 'ori_img_shape': [480, 853]}
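
The printed result is a dictionary of parallel arrays: row i of detection_boxes goes with entry i of detection_scores and detection_class_names. Judging by the values above, each box is (x1, y1, x2, y2) in original-image pixels. A minimal sketch of filtering such a result follows. The helper name `select_boxes` is hypothetical, and plain lists stand in for the NumPy arrays (values rounded from the output above) so the snippet runs without EasyCV.

```python
def select_boxes(results, class_name='person', min_score=0.5):
    """Return (box, score) pairs for one class at or above a score threshold."""
    selected = []
    for box, score, name in zip(results['detection_boxes'],
                                results['detection_scores'],
                                results['detection_class_names']):
        if name == class_name and float(score) >= min_score:
            # Convert to plain Python numbers so the result is easy to serialize
            selected.append(([float(v) for v in box], float(score)))
    return selected


# Stand-in for the predictor output shown above (values rounded)
results = {
    'detection_boxes': [[393.29, 106.74, 514.53, 381.99],
                        [300.83, 125.52, 398.20, 411.53]],
    'detection_scores': [0.917, 0.900],
    'detection_class_names': ['person', 'person'],
}

for box, score in select_boxes(results, min_score=0.91):
    x1, y1, x2, y2 = box
    print(f'score={score:.3f} width={x2 - x1:.1f} height={y2 - y1:.1f}')
```

Because the function only iterates and indexes, it should also accept the NumPy arrays returned by the predictor.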

Visualize the inference result:

import cv2
from IPython.display import clear_output, Image, display
from easycv.file.image import load_image
img_np = load_image(img)
detection_boxes = results['detection_boxes']
for box in detection_boxes:
    left_top = (int(box[0]), int(box[1]))
    right_bottom = (int(box[2]), int(box[3]))
    cv2.rectangle(img_np, left_top, right_bottom, (0, 255, 0), thickness=1)
_, ret = cv2.imencode('.jpg', img_np)
display(Image(data=ret))

Keypoint model development

We use a prepared model for the demonstration. To retrain the keypoint model, see this example: https://pai.console.aliyun.com/?regionId=cn-hangzhou#/dsw-gallery/preview/deepLearning/cv/easycv_pose_topdown_hrnet

Download the model

!wget http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/pose/top_down_hrnet/pose_hrnet_epoch_210_export.pt -O pretrained_models/pose_hrnet.pt
Will not apply HSTS. The HSTS database must be a regular and non-world-writable file.
ERROR: could not open HSTS store at '/root/.wget-hsts'. HSTS will be disabled.
--2023-03-06 14:12:14--  http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/pose/top_down_hrnet/pose_hrnet_epoch_210_export.pt
Resolving pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com... 39.98.20.13
Connecting to pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com|39.98.20.13|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 255086694 (243M) [application/octet-stream]
Saving to: ‘pretrained_models/pose_hrnet.pt’
pretrained_models/p 100%[===================>] 243.27M  11.8MB/s    in 20s     
2023-03-06 14:12:34 (12.1 MB/s) - ‘pretrained_models/pose_hrnet.pt’ saved [255086694/255086694]

Model inference & visualization (optional)

Get the inference result:

from easycv.predictors import PoseTopDownPredictor
pose_predictor = PoseTopDownPredictor(
    model_path='pretrained_models/pose_hrnet.pt',
    detection_predictor_config=dict(
        type='YoloXPredictor',
        model_path='pretrained_models/yolox_s.pt',
    ),
    bbox_thr=0.9,
    cat_id=0,  # person category id
)
img = 'http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/demos/images/two_person.jpg'
results = pose_predictor(img)[0]
print(f'Pose(Keypoints) results: {results}')
load checkpoint from local path: pretrained_models/pose_hrnet.pt
reparam: 0
load checkpoint from local path: pretrained_models/yolox_s.pt
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 1/1, 3.1 task/s, elapsed: 0s, ETA:     0s
Pose(Keypoints) results: {'keypoints': array([[[446.52234   , 139.24068   ,   0.9669522 ],
        [452.1864    , 134.07887   ,   0.9656511 ],
        [445.32617   , 133.5937    ,   0.9783832 ],
        [469.97046   , 136.4598    ,   0.96018004],
        [448.53107   , 135.00523   ,   0.6578802 ],
        [484.84204   , 174.96892   ,   0.9387574 ],
        [437.69775   , 167.76198   ,   0.91668504],
        [497.69324   , 218.9604    ,   0.90464604],
        [419.41394   , 205.68068   ,   0.8883008 ],
        [500.0896    , 258.17426   ,   0.94137996],
        [403.52203   , 234.57295   ,   0.8803861 ],
        [464.09875   , 259.55505   ,   0.8396832 ],
        [431.8305    , 256.75336   ,   0.8154782 ],
        [461.54712   , 318.659     ,   0.92906725],
        [430.77502   , 316.80615   ,   0.9279456 ],
        [460.7788    , 363.59674   ,   0.89747685],
        [428.1001    , 363.88953   ,   0.8918172 ]],
       [[358.0172    , 155.05956   ,   0.97038084],
        [355.98962   , 149.00755   ,   0.92078936],
        [352.4453    , 149.90215   ,   0.97667825],
        [340.4132    , 152.0797    ,   0.606572  ],
        [334.53314   , 154.22981   ,   0.9548439 ],
        [348.04376   , 183.88353   ,   0.90829027],
        [322.1966    , 191.77852   ,   0.9267349 ],
        [351.15045   , 232.45781   ,   0.89447075],
        [330.02106   , 242.5197    ,   0.90907884],
        [369.1089    , 261.89404   ,   0.8053842 ],
        [350.1405    , 280.6457    ,   0.82891107],
        [353.85364   , 267.27197   ,   0.7821094 ],
        [330.16003   , 274.84985   ,   0.8436185 ],
        [360.4154    , 323.65887   ,   0.88485605],
        [320.79358   , 332.54346   ,   0.9094477 ],
        [370.22504   , 369.16022   ,   0.941416  ],
        [314.28442   , 381.05402   ,   0.9005176 ]]], dtype=float32), 'bbox': array([[350.60413 , 106.105804, 557.2877  , 381.6839  ,   1.      ],
       [242.01334 , 124.487015, 457.23285 , 411.44635 ,   1.      ]],
      dtype=float32)}
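
Each person in keypoints is a 17 x 3 array of (x, y, score) rows, and bbox holds one box per person. Assuming the model follows the standard COCO 17-keypoint order (an assumption; the source only shows layout='coco' in the STGCN config), the rows can be labeled and low-confidence joints flagged as sketched below. The list `COCO_KEYPOINT_NAMES` and the helper `low_confidence_joints` are illustrative names, and the rows are rounded copies of the first person in the output above.

```python
# Standard COCO keypoint order (assumed to match the model output)
COCO_KEYPOINT_NAMES = [
    'nose', 'left_eye', 'right_eye', 'left_ear', 'right_ear',
    'left_shoulder', 'right_shoulder', 'left_elbow', 'right_elbow',
    'left_wrist', 'right_wrist', 'left_hip', 'right_hip',
    'left_knee', 'right_knee', 'left_ankle', 'right_ankle',
]


def low_confidence_joints(person_keypoints, threshold=0.7):
    """Return names of joints whose score (third column) is below threshold."""
    return [name
            for name, (_x, _y, score) in zip(COCO_KEYPOINT_NAMES, person_keypoints)
            if float(score) < threshold]


# First person from the output above, rounded
first_person = [
    (446.52, 139.24, 0.967), (452.19, 134.08, 0.966), (445.33, 133.59, 0.978),
    (469.97, 136.46, 0.960), (448.53, 135.01, 0.658), (484.84, 174.97, 0.939),
    (437.70, 167.76, 0.917), (497.69, 218.96, 0.905), (419.41, 205.68, 0.888),
    (500.09, 258.17, 0.941), (403.52, 234.57, 0.880), (464.10, 259.56, 0.840),
    (431.83, 256.75, 0.815), (461.55, 318.66, 0.929), (430.78, 316.81, 0.928),
    (460.78, 363.60, 0.897), (428.10, 363.89, 0.892),
]

print(low_confidence_joints(first_person))  # ['right_ear']
```

A check like this can help decide whether a frame's skeleton is reliable enough to feed into the downstream action classifier.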

Visualize the inference result:

import cv2
from IPython.display import clear_output, Image, display
from easycv.file.image import load_image
img_np = load_image(img)
show_img = pose_predictor.show_result(
    img_np,
    results,
)
_, ret = cv2.imencode('.jpg', show_img)
display(Image(data=ret))

Video classification model development

Data preparation

We provide a small video keypoint dataset for testing.

!mkdir data
# Download the training set
!wget http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/demos/datasets/skeleton_dataset/ntu60_xsub_train_3000_samples.pkl -O data/train.pkl
# Download the test set
!wget http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/demos/datasets/skeleton_dataset/ntu60_xsub_val_600_samples.pkl -O data/val.pkl
Will not apply HSTS. The HSTS database must be a regular and non-world-writable file.
ERROR: could not open HSTS store at '/root/.wget-hsts'. HSTS will be disabled.
--2023-03-06 14:13:44--  http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/demos/datasets/skeleton_dataset/ntu60_xsub_train_3000_samples.pkl
Resolving pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com... 39.98.20.13
Connecting to pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com|39.98.20.13|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 36143216 (34M) [application/octet-stream]
Saving to: ‘data/train.pkl’
data/train.pkl      100%[===================>]  34.47M  10.8MB/s    in 3.2s    
2023-03-06 14:13:47 (10.8 MB/s) - ‘data/train.pkl’ saved [36143216/36143216]
Will not apply HSTS. The HSTS database must be a regular and non-world-writable file.
ERROR: could not open HSTS store at '/root/.wget-hsts'. HSTS will be disabled.
--2023-03-06 14:13:48--  http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/demos/datasets/skeleton_dataset/ntu60_xsub_val_600_samples.pkl
Resolving pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com... 39.98.20.13
Connecting to pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com|39.98.20.13|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 7748162 (7.4M) [application/octet-stream]
Saving to: ‘data/val.pkl’
data/val.pkl        100%[===================>]   7.39M  14.5MB/s    in 0.5s    
2023-03-06 14:13:48 (14.5 MB/s) - ‘data/val.pkl’ saved [7748162/7748162]
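
Before training, it can help to look inside the downloaded annotation files. The sketch below makes no assumption about their exact schema: it unpickles a file and reports the container type, the number of entries, and, if the entries are dictionaries, the keys of the first one. The helper name `summarize_pickle` is hypothetical, and the demo writes a tiny synthetic file with made-up field names so the snippet runs on its own. Only unpickle files from sources you trust.

```python
import os
import pickle
import tempfile


def summarize_pickle(path):
    """Load a pickle file and describe its top-level structure."""
    with open(path, 'rb') as f:
        obj = pickle.load(f)
    summary = {'type': type(obj).__name__}
    if isinstance(obj, (list, tuple, dict)):
        summary['length'] = len(obj)
    if isinstance(obj, (list, tuple)) and obj and isinstance(obj[0], dict):
        # Report the keys of the first entry only
        summary['first_entry_keys'] = sorted(obj[0].keys())
    return summary


# Synthetic stand-in; the field names here are made up for illustration
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.pkl')
with open(demo_path, 'wb') as f:
    pickle.dump([{'label': 3, 'total_frames': 42}], f)

print(summarize_pickle(demo_path))
```

To inspect the real data, call summarize_pickle('data/train.pkl') in the same environment used for training, so that any array types stored in the file can be loaded.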

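Train the model

To verify the functionality quickly, we reduce both the number of epochs and the learning rate and load a pretrained model, so that results are produced quickly. For custom data, the relevant parameters need further tuning.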
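
The training command that follows passes --user_config_params as key=value pairs, where a dotted key such as optimizer.lr addresses a nested config entry. The sketch below illustrates that idea on a plain dictionary. It is not EasyCV's implementation: `apply_overrides` is a hypothetical helper, and parsing values with ast.literal_eval is an assumption made for the example.

```python
import ast


def apply_overrides(config, overrides):
    """Apply 'a.b=value' strings to a nested dict and return it."""
    for item in overrides:
        dotted_key, raw_value = item.split('=', 1)
        try:
            value = ast.literal_eval(raw_value)  # numbers, lists, booleans
        except (ValueError, SyntaxError):
            value = raw_value  # keep things like file paths as strings
        *parents, leaf = dotted_key.split('.')
        node = config
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return config


# Defaults taken from the config file printed in the training log
config = {
    'optimizer': {'type': 'SGD', 'lr': 0.1},
    'log_config': {'interval': 100},
    'total_epochs': 80,
    'ann_file_train': 'data/posec3d/ntu60_xsub_train.pkl',
}
apply_overrides(config, [
    'ann_file_train=data/train.pkl',
    'optimizer.lr=0.00001',
    'log_config.interval=20',
    'total_epochs=1',
])
print(config)
```

The overridden values correspond to what the "Config Dict" entry in the training log reports (lr 1e-05, interval 20, total_epochs 1). Unlike this sketch, the real tool also propagates ann_file_train into the dataset entries that reference it.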

!python -m easycv.tools.train \
configs/video_recognition/stgcn/stgcn_80e_ntu60_xsub_keypoint.py \
--work_dir work_dir/ \
--load_from http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/video/skeleton_based/stgcn/stgcn_80e_ntu60_xsub.pth \
--user_config_params ann_file_train=data/train.pkl ann_file_val=data/val.pkl optimizer.lr=0.00001 log_config.interval=20 total_epochs=1
[2023-03-06 14:24:06,509.509 dsw-229591-7c4869bc45-tvdsm:5999 INFO utils.py:30] NOTICE: PAIDEBUGGER is turned off.
/home/pai/lib/python3.6/site-packages/easycv/utils/setup_env.py:37: UserWarning: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  f'Setting OMP_NUM_THREADS environment variable for each process '
/home/pai/lib/python3.6/site-packages/easycv/utils/setup_env.py:47: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  f'Setting MKL_NUM_THREADS environment variable for each process '
2023-03-06 14:24:09,964 - easycv - INFO - Environment info:
------------------------------------------------------------
sys.platform: linux
Python: 3.6.12 |Anaconda, Inc.| (default, Sep  8 2020, 23:10:56) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.1, V10.1.243
GPU 0: Tesla V100-SXM2-32GB
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.8.2+PAI
PyTorch compiling details: PyTorch built with:
  - GCC 7.5
  - C++ Version: 201402
  - Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v1.7.0 (Git Hash N/A)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.1
  - NVCC architecture flags: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_75,code=compute_75
  - CuDNN 7.6.5
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.1, CUDNN_VERSION=7.6.5, CXX_COMPILER=/usr/lib/ccache/c++, CXX_FLAGS=-D_GLIBCXX_USE_CXX11_ABI=0 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.2, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 
TorchVision: 0.9.2+cu101
OpenCV: 4.4.0
MMCV: 1.6.0
EasyCV: 0.10.0
------------------------------------------------------------
2023-03-06 14:24:09,965 - easycv - INFO - Distributed training: False
2023-03-06 14:24:09,965 - easycv - INFO - Config:
/home/pai/lib/python3.6/site-packages/easycv/configs/base.py
train_cfg = {}
test_cfg = {}
optimizer_config = dict()  # grad_clip, coalesce, bucket_size_mb
# yapf:disable
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])
# yapf:enable
# runtime settings
dist_params = dict(backend='nccl')
cudnn_benchmark = False
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
/home/pai/lib/python3.6/site-packages/easycv/configs/video_recognition/stgcn/stgcn_80e_ntu60_xsub_keypoint.py
_base_ = 'configs/base.py'
CLASSES = [
    'drink water', 'eat meal/snack', 'brushing teeth', 'brushing hair', 'drop',
    'pickup', 'throw', 'sitting down', 'standing up (from sitting position)',
    'clapping', 'reading', 'writing', 'tear up paper', 'wear jacket',
    'take off jacket', 'wear a shoe', 'take off a shoe', 'wear on glasses',
    'take off glasses', 'put on a hat/cap', 'take off a hat/cap', 'cheer up',
    'hand waving', 'kicking something', 'reach into pocket',
    'hopping (one foot jumping)', 'jump up', 'make a phone call/answer phone',
    'playing with phone/tablet', 'typing on a keyboard',
    'pointing to something with finger', 'taking a selfie',
    'check time (from watch)', 'rub two hands together', 'nod head/bow',
    'shake head', 'wipe face', 'salute', 'put the palms together',
    'cross hands in front (say stop)', 'sneeze/cough', 'staggering', 'falling',
    'touch head (headache)', 'touch chest (stomachache/heart pain)',
    'touch back (backache)', 'touch neck (neckache)',
    'nausea or vomiting condition',
    'use a fan (with hand or paper)/feeling warm',
    'punching/slapping other person', 'kicking other person',
    'pushing other person', 'pat on back of other person',
    'point finger at the other person', 'hugging other person',
    'giving something to other person', "touch other person's pocket",
    'handshaking', 'walking towards each other',
    'walking apart from each other'
]
model = dict(
    type='SkeletonGCN',
    backbone=dict(
        type='STGCN',
        in_channels=3,
        edge_importance_weighting=True,
        graph_cfg=dict(layout='coco', strategy='spatial')),
    cls_head=dict(
        type='STGCNHead',
        num_classes=60,
        in_channels=256,
        loss_cls=dict(type='CrossEntropyLoss')),
    train_cfg=None,
    test_cfg=None)
dataset_type = 'VideoDataset'
ann_file_train = 'data/posec3d/ntu60_xsub_train.pkl'
ann_file_val = 'data/posec3d/ntu60_xsub_val.pkl'
train_pipeline = [
    dict(type='PaddingWithLoop', clip_len=300),
    dict(type='PoseDecode'),
    dict(type='FormatGCNInput', input_format='NCTVM'),
    dict(type='PoseNormalize'),
    dict(type='Collect', keys=['keypoint', 'label'], meta_keys=[]),
    dict(type='VideoToTensor', keys=['keypoint'])
]
val_pipeline = [
    dict(type='PaddingWithLoop', clip_len=300),
    dict(type='PoseDecode'),
    dict(type='FormatGCNInput', input_format='NCTVM'),
    dict(type='PoseNormalize'),
    dict(type='Collect', keys=['keypoint', 'label'], meta_keys=[]),
    dict(type='VideoToTensor', keys=['keypoint'])
]
test_pipeline = [
    dict(type='PaddingWithLoop', clip_len=300),
    dict(type='PoseDecode'),
    dict(type='FormatGCNInput', input_format='NCTVM'),
    dict(type='PoseNormalize'),
    dict(type='Collect', keys=['keypoint', 'label'], meta_keys=[]),
    dict(type='VideoToTensor', keys=['keypoint'])
]
data = dict(
    imgs_per_gpu=16,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_source=dict(
            type='PoseDataSourceForVideoRec',
            ann_file=ann_file_train,
            data_prefix='',
        ),
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        imgs_per_gpu=1,
        data_source=dict(
            type='PoseDataSourceForVideoRec',
            ann_file=ann_file_val,
            data_prefix='',
        ),
        pipeline=val_pipeline),
    test=dict(
        type=dataset_type,
        data_source=dict(
            type='PoseDataSourceForVideoRec',
            ann_file=ann_file_val,
            data_prefix='',
        ),
        pipeline=test_pipeline))
# optimizer
optimizer = dict(
    type='SGD', lr=0.1, momentum=0.9, weight_decay=0.0001, nesterov=True)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(policy='step', step=[10, 50])
total_epochs = 80
# eval
eval_config = dict(initial=False, interval=1, gpu_collect=True)
eval_pipelines = [
    dict(
        mode='test',
        data=data['val'],
        dist_eval=True,
        evaluators=[dict(type='ClsEvaluator', topk=(1, 5))],
    )
]
log_config = dict(interval=100, hooks=[dict(type='TextLoggerHook')])
checkpoint_config = dict(interval=1)
export = dict(type='raw')
# export = dict(type='jit')
# export = dict(
#     type='blade',
#     blade_config=dict(
#         enable_fp16=True,
#         fp16_fallback_op_ratio=0.0,
#         customize_op_black_list=[
#             'aten::select', 'aten::index', 'aten::slice', 'aten::view',
#             'aten::upsample', 'aten::clamp', 'aten::clone'
#         ]))
2023-03-06 14:24:09,965 - easycv - INFO - Config Dict:
{"train_cfg": {}, "test_cfg": {}, "optimizer_config": {"grad_clip": null}, "log_config": {"interval": 20, "hooks": [{"type": "TextLoggerHook"}]}, "dist_params": {"backend": "nccl"}, "cudnn_benchmark": false, "log_level": "INFO", "load_from": "http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/video/skeleton_based/stgcn/stgcn_80e_ntu60_xsub.pth", "resume_from": null, "workflow": [["train", 1]], "CLASSES": ["drink water", "eat meal/snack", "brushing teeth", "brushing hair", "drop", "pickup", "throw", "sitting down", "standing up (from sitting position)", "clapping", "reading", "writing", "tear up paper", "wear jacket", "take off jacket", "wear a shoe", "take off a shoe", "wear on glasses", "take off glasses", "put on a hat/cap", "take off a hat/cap", "cheer up", "hand waving", "kicking something", "reach into pocket", "hopping (one foot jumping)", "jump up", "make a phone call/answer phone", "playing with phone/tablet", "typing on a keyboard", "pointing to something with finger", "taking a selfie", "check time (from watch)", "rub two hands together", "nod head/bow", "shake head", "wipe face", "salute", "put the palms together", "cross hands in front (say stop)", "sneeze/cough", "staggering", "falling", "touch head (headache)", "touch chest (stomachache/heart pain)", "touch back (backache)", "touch neck (neckache)", "nausea or vomiting condition", "use a fan (with hand or paper)/feeling warm", "punching/slapping other person", "kicking other person", "pushing other person", "pat on back of other person", "point finger at the other person", "hugging other person", "giving something to other person", "touch other person's pocket", "handshaking", "walking towards each other", "walking apart from each other"], "model": {"type": "SkeletonGCN", "backbone": {"type": "STGCN", "in_channels": 3, "edge_importance_weighting": true, "graph_cfg": {"layout": "coco", "strategy": "spatial"}}, "cls_head": {"type": "STGCNHead", "num_classes": 60, "in_channels": 
256, "loss_cls": {"type": "CrossEntropyLoss"}}, "train_cfg": null, "test_cfg": null}, "dataset_type": "VideoDataset", "ann_file_train": "data/train.pkl", "ann_file_val": "data/val.pkl", "train_pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}], "val_pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}], "test_pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}], "data": {"imgs_per_gpu": 16, "workers_per_gpu": 2, "train": {"type": "VideoDataset", "data_source": {"type": "PoseDataSourceForVideoRec", "ann_file": "data/train.pkl", "data_prefix": ""}, "pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}]}, "val": {"type": "VideoDataset", "imgs_per_gpu": 1, "data_source": {"type": "PoseDataSourceForVideoRec", "ann_file": "data/val.pkl", "data_prefix": ""}, "pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}]}, "test": {"type": "VideoDataset", 
"data_source": {"type": "PoseDataSourceForVideoRec", "ann_file": "data/val.pkl", "data_prefix": ""}, "pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}]}}, "optimizer": {"type": "SGD", "lr": 1e-05, "momentum": 0.9, "weight_decay": 0.0001, "nesterov": true}, "lr_config": {"policy": "step", "step": [10, 50]}, "total_epochs": 1, "eval_config": {"initial": false, "interval": 1, "gpu_collect": true}, "eval_pipelines": [{"mode": "test", "data": {"type": "VideoDataset", "imgs_per_gpu": 1, "data_source": {"type": "PoseDataSourceForVideoRec", "ann_file": "data/val.pkl", "data_prefix": ""}, "pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}]}, "dist_eval": true, "evaluators": [{"type": "ClsEvaluator", "topk": [1, 5]}]}], "checkpoint_config": {"interval": 1}, "export": {"type": "raw"}, "work_dir": "work_dir/", "oss_work_dir": null, "gpus": 1}
2023-03-06 14:24:09,966 - easycv - INFO - GPU INFO : Tesla V100-SXM2-32GB
2023-03-06 14:24:09,967 - easycv - INFO - Set random seed to 1942453656, deterministic: False
/home/pai/lib/python3.6/site-packages/easycv/models/loss/cross_entropy_loss.py:273: UserWarning: Default ``avg_non_ignore`` is False, if you would like to ignore the certain label and average loss over non-ignore labels, which is the same with PyTorch official cross_entropy, set ``avg_non_ignore=True``.
  'Default ``avg_non_ignore`` is False, if you would like to '
SkeletonGCN(
  (backbone): STGCN(
    (data_bn): BatchNorm1d(51, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (st_gcn_networks): ModuleList(
      (0): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(3, 192, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(64, 64, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0))
          (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (relu): ReLU(inplace=True)
      )
      (1): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(64, 192, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(64, 64, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0))
          (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (relu): ReLU(inplace=True)
      )
      (2): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(64, 192, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(64, 64, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0))
          (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (relu): ReLU(inplace=True)
      )
      (3): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(64, 192, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(64, 64, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0))
          (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (relu): ReLU(inplace=True)
      )
      (4): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(128, 128, kernel_size=(9, 1), stride=(2, 1), padding=(4, 0))
          (3): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (residual): Sequential(
          (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 1))
          (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (relu): ReLU(inplace=True)
      )
      (5): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(128, 384, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(128, 128, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0))
          (3): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (relu): ReLU(inplace=True)
      )
      (6): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(128, 384, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(128, 128, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0))
          (3): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (relu): ReLU(inplace=True)
      )
      (7): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(128, 768, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(256, 256, kernel_size=(9, 1), stride=(2, 1), padding=(4, 0))
          (3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (residual): Sequential(
          (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 1))
          (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (relu): ReLU(inplace=True)
      )
      (8): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(256, 768, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(256, 256, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0))
          (3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (relu): ReLU(inplace=True)
      )
      (9): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(256, 768, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(256, 256, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0))
          (3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (relu): ReLU(inplace=True)
      )
    )
    (edge_importance): ParameterList(
        (0): Parameter containing: [torch.FloatTensor of size 3x17x17]
        (1): Parameter containing: [torch.FloatTensor of size 3x17x17]
        (2): Parameter containing: [torch.FloatTensor of size 3x17x17]
        (3): Parameter containing: [torch.FloatTensor of size 3x17x17]
        (4): Parameter containing: [torch.FloatTensor of size 3x17x17]
        (5): Parameter containing: [torch.FloatTensor of size 3x17x17]
        (6): Parameter containing: [torch.FloatTensor of size 3x17x17]
        (7): Parameter containing: [torch.FloatTensor of size 3x17x17]
        (8): Parameter containing: [torch.FloatTensor of size 3x17x17]
        (9): Parameter containing: [torch.FloatTensor of size 3x17x17]
    )
  )
  (cls_head): STGCNHead(
    (loss_cls): CrossEntropyLoss(avg_non_ignore=False)
    (pool): AdaptiveAvgPool2d(output_size=(1, 1))
    (fc): Conv2d(256, 60, kernel_size=(1, 1), stride=(1, 1))
  )
)
data shuffle: True
2023-03-06 14:24:10,259 - easycv - INFO - 3000 videos remain after valid thresholding
GPU INFO :  Tesla V100-SXM2-32GB
2023-03-06 14:24:12,039 - easycv - INFO - open validate hook
2023-03-06 14:24:12,076 - easycv - INFO - 600 videos remain after valid thresholding
2023-03-06 14:24:12,077 - easycv - INFO - register EvaluationHook {'initial': False, 'evaluators': [<easycv.core.evaluation.classification_eval.ClsEvaluator object at 0x7fba8351f438>]}
2023-03-06 14:24:12,077 - easycv - INFO - load checkpoint from http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/video/skeleton_based/stgcn/stgcn_80e_ntu60_xsub.pth
load checkpoint from local path: /root/.cache/easycv/stgcn_80e_ntu60_xsub.pth
2023-03-06 14:24:12,106 - easycv - INFO - Start running, host: root@dsw-229591-7c4869bc45-tvdsm, work_dir: /mnt/workspace/work_dir
2023-03-06 14:24:12,106 - easycv - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH   ) StepLrUpdaterHook                  
(NORMAL      ) CheckpointHook                     
(NORMAL      ) EvalHook                           
(NORMAL      ) BestCkptSaverHook                  
(VERY_LOW    ) PreLoggerHook                      
(VERY_LOW    ) TextLoggerHook                     
 -------------------- 
before_train_epoch:
(VERY_HIGH   ) StepLrUpdaterHook                  
(LOW         ) IterTimerHook                      
(VERY_LOW    ) PreLoggerHook                      
(VERY_LOW    ) TextLoggerHook                     
 -------------------- 
before_train_iter:
(VERY_HIGH   ) StepLrUpdaterHook                  
(LOW         ) IterTimerHook                      
 -------------------- 
after_train_iter:
(ABOVE_NORMAL) OptimizerHook                      
(NORMAL      ) CheckpointHook                     
(LOW         ) IterTimerHook                      
(VERY_LOW    ) PreLoggerHook                      
(VERY_LOW    ) TextLoggerHook                     
 -------------------- 
after_train_epoch:
(NORMAL      ) CheckpointHook                     
(NORMAL      ) EvalHook                           
(NORMAL      ) BestCkptSaverHook                  
(VERY_LOW    ) PreLoggerHook                      
(VERY_LOW    ) TextLoggerHook                     
 -------------------- 
before_val_epoch:
(LOW         ) IterTimerHook                      
(VERY_LOW    ) PreLoggerHook                      
(VERY_LOW    ) TextLoggerHook                     
 -------------------- 
before_val_iter:
(LOW         ) IterTimerHook                      
 -------------------- 
after_val_iter:
(LOW         ) IterTimerHook                      
 -------------------- 
after_val_epoch:
(VERY_LOW    ) PreLoggerHook                      
(VERY_LOW    ) TextLoggerHook                     
 -------------------- 
after_run:
(VERY_LOW    ) TextLoggerHook                     
 -------------------- 
2023-03-06 14:24:12,108 - easycv - INFO - workflow: [('train', 1)], max: 1 epochs
2023-03-06 14:24:12,108 - easycv - INFO - Checkpoints will be saved to /mnt/workspace/work_dir by HardDiskBackend.
Cannot get the env variable of GPU_STATUS_FILE, no data report to scheduler. This is not an error. It is because the scheduler of the cluster did not enable this feature.
2023-03-06 14:24:16,489 - easycv - INFO - Epoch [1][20/187] lr: 1.000e-05, eta: 0:00:35, time: 0.215, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0031, loss: 0.0031
2023-03-06 14:24:19,366 - easycv - INFO - Epoch [1][40/187] lr: 1.000e-05, eta: 0:00:26, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0025, loss: 0.0025
2023-03-06 14:24:22,245 - easycv - INFO - Epoch [1][60/187] lr: 1.000e-05, eta: 0:00:21, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0042, loss: 0.0042
2023-03-06 14:24:25,124 - easycv - INFO - Epoch [1][80/187] lr: 1.000e-05, eta: 0:00:17, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0034, loss: 0.0034
2023-03-06 14:24:28,002 - easycv - INFO - Epoch [1][100/187]  lr: 1.000e-05, eta: 0:00:13, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0035, loss: 0.0035
2023-03-06 14:24:30,880 - easycv - INFO - Epoch [1][120/187]  lr: 1.000e-05, eta: 0:00:10, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0037, loss: 0.0037
2023-03-06 14:24:33,760 - easycv - INFO - Epoch [1][140/187]  lr: 1.000e-05, eta: 0:00:07, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0042, loss: 0.0042
2023-03-06 14:24:36,643 - easycv - INFO - Epoch [1][160/187]  lr: 1.000e-05, eta: 0:00:04, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0027, loss: 0.0027
2023-03-06 14:24:39,527 - easycv - INFO - Epoch [1][180/187]  lr: 1.000e-05, eta: 0:00:01, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0027, loss: 0.0027
2023-03-06 14:24:40,490 - easycv - INFO - Saving checkpoint at 1 epochs
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 600/600, 134.7 task/s, elapsed: 4s, ETA:     0s
2023-03-06 14:24:45,105 - easycv - INFO - SaveBest metric_name: ['ClsEvaluator_neck_top1']
2023-03-06 14:24:45,106 - easycv - INFO - End SaveBest metric
2023-03-06 14:24:45,106 - easycv - INFO - Epoch(val) [1][187] prob_top1: 87.0000, prob_top5: 98.0000
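
The numbers in the printed SkeletonGCN summary follow a simple pattern, and a little bookkeeping makes them easier to read. The sketch below is only a sanity check under the usual ST-GCN layout, which is an assumption here: the 1x1 gcn convolution emits one group of output channels per adjacency partition (the first dimension of the 3x17x17 edge_importance tensors), and data_bn normalizes in_channels x joints features. The function and constant names are illustrative and are not EasyCV identifiers.

```python
# Bookkeeping for the printed STGCN summary; an illustrative sketch, not EasyCV code.

def conv_out_len(length, kernel, stride, padding):
    """Standard output-length formula for a convolution along one axis."""
    return (length + 2 * padding - kernel) // stride + 1

IN_CHANNELS = 3      # in_channels=3 in the model config
NUM_JOINTS = 17      # COCO layout; matches the 17x17 edge_importance matrices
NUM_PARTITIONS = 3   # first dimension of the 3x17x17 edge_importance tensors
CLIP_LEN = 300       # clip_len of the PaddingWithLoop pipeline step

# (out_channels, temporal_stride) of the ten STGCNBlocks, read off the summary
BLOCKS = [(64, 1)] * 4 + [(128, 2)] + [(128, 1)] * 2 + [(256, 2)] + [(256, 1)] * 2

# data_bn sees every channel of every joint as one feature
data_bn_features = IN_CHANNELS * NUM_JOINTS

# each gcn conv produces out_channels for every adjacency partition
gcn_widths = [out * NUM_PARTITIONS for out, _ in BLOCKS]

# the temporal conv uses kernel 9 and padding 4; only blocks 4 and 7 use stride 2
frames = CLIP_LEN
for _, stride in BLOCKS:
    frames = conv_out_len(frames, kernel=9, stride=stride, padding=4)

print(data_bn_features)          # 51, as in BatchNorm1d(51, ...)
print(sorted(set(gcn_widths)))   # [192, 384, 768], the gcn Conv2d output widths
print(frames)                    # 75 frames reach the pooling layer in the head
```

Under these assumptions, the two stride-2 blocks reduce the 300 padded frames to 75 before AdaptiveAvgPool2d collapses the remaining time and joint axes for the 60-class fc layer.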

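Train the model

To verify the workflow quickly, we reduce both the number of epochs and the learning rate and load a pretrained model, so that results are produced fast. If you train on your own data, adjust these parameters accordingly.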
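
The training command in this section passes its changes through --user_config_params as dotted key=value pairs such as optimizer.lr=0.00001. As a rough mental model only, the sketch below applies such pairs to a nested dictionary. The helper apply_overrides is hypothetical and is not EasyCV's implementation; the starting values are copied from the stgcn_80e_ntu60_xsub_keypoint.py config.

```python
import ast


def apply_overrides(config, overrides):
    """Apply 'a.b.c=value' strings to a nested dict in place and return it.

    Hypothetical helper for illustration; EasyCV's own parsing may differ.
    """
    for item in overrides:
        dotted_key, raw_value = item.split('=', 1)
        try:
            # Turn '0.00001' into a float and '20' into an int
            value = ast.literal_eval(raw_value)
        except (ValueError, SyntaxError):
            # Anything that is not a Python literal (e.g. a path) stays a string
            value = raw_value
        *parents, leaf = dotted_key.split('.')
        node = config
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return config


# A small excerpt of the base config values
config = {
    'ann_file_train': 'data/posec3d/ntu60_xsub_train.pkl',
    'optimizer': {'type': 'SGD', 'lr': 0.1, 'momentum': 0.9},
    'log_config': {'interval': 100},
    'total_epochs': 80,
}

apply_overrides(config, [
    'ann_file_train=data/train.pkl',
    'optimizer.lr=0.00001',
    'log_config.interval=20',
    'total_epochs=1',
])
print(config['optimizer']['lr'], config['log_config']['interval'],
      config['total_epochs'], config['ann_file_train'])
```

The "Config Dict" entry in the training log shows the same effective values (lr 1e-05, interval 20, total_epochs 1). It also shows ann_file_train reaching data.train.data_source.ann_file, a propagation step that this sketch does not model.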

!python -m easycv.tools.train \
configs/video_recognition/stgcn/stgcn_80e_ntu60_xsub_keypoint.py \
--work_dir work_dir/ \
--load_from http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/video/skeleton_based/stgcn/stgcn_80e_ntu60_xsub.pth \
--user_config_params ann_file_train=data/train.pkl ann_file_val=data/val.pkl optimizer.lr=0.00001 log_config.interval=20 total_epochs=1
[2023-03-06 14:24:06,509.509 dsw-229591-7c4869bc45-tvdsm:5999 INFO utils.py:30] NOTICE: PAIDEBUGGER is turned off.
/home/pai/lib/python3.6/site-packages/easycv/utils/setup_env.py:37: UserWarning: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  f'Setting OMP_NUM_THREADS environment variable for each process '
/home/pai/lib/python3.6/site-packages/easycv/utils/setup_env.py:47: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  f'Setting MKL_NUM_THREADS environment variable for each process '
2023-03-06 14:24:09,964 - easycv - INFO - Environment info:
------------------------------------------------------------
sys.platform: linux
Python: 3.6.12 |Anaconda, Inc.| (default, Sep  8 2020, 23:10:56) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.1, V10.1.243
GPU 0: Tesla V100-SXM2-32GB
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.8.2+PAI
PyTorch compiling details: PyTorch built with:
  - GCC 7.5
  - C++ Version: 201402
  - Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v1.7.0 (Git Hash N/A)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.1
  - NVCC architecture flags: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_75,code=compute_75
  - CuDNN 7.6.5
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.1, CUDNN_VERSION=7.6.5, CXX_COMPILER=/usr/lib/ccache/c++, CXX_FLAGS=-D_GLIBCXX_USE_CXX11_ABI=0 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.2, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 
TorchVision: 0.9.2+cu101
OpenCV: 4.4.0
MMCV: 1.6.0
EasyCV: 0.10.0
------------------------------------------------------------
2023-03-06 14:24:09,965 - easycv - INFO - Distributed training: False
2023-03-06 14:24:09,965 - easycv - INFO - Config:
/home/pai/lib/python3.6/site-packages/easycv/configs/base.py
train_cfg = {}
test_cfg = {}
optimizer_config = dict()  # grad_clip, coalesce, bucket_size_mb
# yapf:disable
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])
# yapf:enable
# runtime settings
dist_params = dict(backend='nccl')
cudnn_benchmark = False
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
/home/pai/lib/python3.6/site-packages/easycv/configs/video_recognition/stgcn/stgcn_80e_ntu60_xsub_keypoint.py
_base_ = 'configs/base.py'
CLASSES = [
    'drink water', 'eat meal/snack', 'brushing teeth', 'brushing hair', 'drop',
    'pickup', 'throw', 'sitting down', 'standing up (from sitting position)',
    'clapping', 'reading', 'writing', 'tear up paper', 'wear jacket',
    'take off jacket', 'wear a shoe', 'take off a shoe', 'wear on glasses',
    'take off glasses', 'put on a hat/cap', 'take off a hat/cap', 'cheer up',
    'hand waving', 'kicking something', 'reach into pocket',
    'hopping (one foot jumping)', 'jump up', 'make a phone call/answer phone',
    'playing with phone/tablet', 'typing on a keyboard',
    'pointing to something with finger', 'taking a selfie',
    'check time (from watch)', 'rub two hands together', 'nod head/bow',
    'shake head', 'wipe face', 'salute', 'put the palms together',
    'cross hands in front (say stop)', 'sneeze/cough', 'staggering', 'falling',
    'touch head (headache)', 'touch chest (stomachache/heart pain)',
    'touch back (backache)', 'touch neck (neckache)',
    'nausea or vomiting condition',
    'use a fan (with hand or paper)/feeling warm',
    'punching/slapping other person', 'kicking other person',
    'pushing other person', 'pat on back of other person',
    'point finger at the other person', 'hugging other person',
    'giving something to other person', "touch other person's pocket",
    'handshaking', 'walking towards each other',
    'walking apart from each other'
]
model = dict(
    type='SkeletonGCN',
    backbone=dict(
        type='STGCN',
        in_channels=3,
        edge_importance_weighting=True,
        graph_cfg=dict(layout='coco', strategy='spatial')),
    cls_head=dict(
        type='STGCNHead',
        num_classes=60,
        in_channels=256,
        loss_cls=dict(type='CrossEntropyLoss')),
    train_cfg=None,
    test_cfg=None)
dataset_type = 'VideoDataset'
ann_file_train = 'data/posec3d/ntu60_xsub_train.pkl'
ann_file_val = 'data/posec3d/ntu60_xsub_val.pkl'
train_pipeline = [
    dict(type='PaddingWithLoop', clip_len=300),
    dict(type='PoseDecode'),
    dict(type='FormatGCNInput', input_format='NCTVM'),
    dict(type='PoseNormalize'),
    dict(type='Collect', keys=['keypoint', 'label'], meta_keys=[]),
    dict(type='VideoToTensor', keys=['keypoint'])
]
val_pipeline = [
    dict(type='PaddingWithLoop', clip_len=300),
    dict(type='PoseDecode'),
    dict(type='FormatGCNInput', input_format='NCTVM'),
    dict(type='PoseNormalize'),
    dict(type='Collect', keys=['keypoint', 'label'], meta_keys=[]),
    dict(type='VideoToTensor', keys=['keypoint'])
]
test_pipeline = [
    dict(type='PaddingWithLoop', clip_len=300),
    dict(type='PoseDecode'),
    dict(type='FormatGCNInput', input_format='NCTVM'),
    dict(type='PoseNormalize'),
    dict(type='Collect', keys=['keypoint', 'label'], meta_keys=[]),
    dict(type='VideoToTensor', keys=['keypoint'])
]
data = dict(
    imgs_per_gpu=16,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_source=dict(
            type='PoseDataSourceForVideoRec',
            ann_file=ann_file_train,
            data_prefix='',
        ),
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        imgs_per_gpu=1,
        data_source=dict(
            type='PoseDataSourceForVideoRec',
            ann_file=ann_file_val,
            data_prefix='',
        ),
        pipeline=val_pipeline),
    test=dict(
        type=dataset_type,
        data_source=dict(
            type='PoseDataSourceForVideoRec',
            ann_file=ann_file_val,
            data_prefix='',
        ),
        pipeline=test_pipeline))
# optimizer
optimizer = dict(
    type='SGD', lr=0.1, momentum=0.9, weight_decay=0.0001, nesterov=True)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(policy='step', step=[10, 50])
total_epochs = 80
# eval
eval_config = dict(initial=False, interval=1, gpu_collect=True)
eval_pipelines = [
    dict(
        mode='test',
        data=data['val'],
        dist_eval=True,
        evaluators=[dict(type='ClsEvaluator', topk=(1, 5))],
    )
]
log_config = dict(interval=100, hooks=[dict(type='TextLoggerHook')])
checkpoint_config = dict(interval=1)
export = dict(type='raw')
# export = dict(type='jit')
# export = dict(
#     type='blade',
#     blade_config=dict(
#         enable_fp16=True,
#         fp16_fallback_op_ratio=0.0,
#         customize_op_black_list=[
#             'aten::select', 'aten::index', 'aten::slice', 'aten::view',
#             'aten::upsample', 'aten::clamp', 'aten::clone'
#         ]))
2023-03-06 14:24:09,965 - easycv - INFO - Config Dict:
{"train_cfg": {}, "test_cfg": {}, "optimizer_config": {"grad_clip": null}, "log_config": {"interval": 20, "hooks": [{"type": "TextLoggerHook"}]}, "dist_params": {"backend": "nccl"}, "cudnn_benchmark": false, "log_level": "INFO", "load_from": "http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/video/skeleton_based/stgcn/stgcn_80e_ntu60_xsub.pth", "resume_from": null, "workflow": [["train", 1]], "CLASSES": ["drink water", "eat meal/snack", "brushing teeth", "brushing hair", "drop", "pickup", "throw", "sitting down", "standing up (from sitting position)", "clapping", "reading", "writing", "tear up paper", "wear jacket", "take off jacket", "wear a shoe", "take off a shoe", "wear on glasses", "take off glasses", "put on a hat/cap", "take off a hat/cap", "cheer up", "hand waving", "kicking something", "reach into pocket", "hopping (one foot jumping)", "jump up", "make a phone call/answer phone", "playing with phone/tablet", "typing on a keyboard", "pointing to something with finger", "taking a selfie", "check time (from watch)", "rub two hands together", "nod head/bow", "shake head", "wipe face", "salute", "put the palms together", "cross hands in front (say stop)", "sneeze/cough", "staggering", "falling", "touch head (headache)", "touch chest (stomachache/heart pain)", "touch back (backache)", "touch neck (neckache)", "nausea or vomiting condition", "use a fan (with hand or paper)/feeling warm", "punching/slapping other person", "kicking other person", "pushing other person", "pat on back of other person", "point finger at the other person", "hugging other person", "giving something to other person", "touch other person's pocket", "handshaking", "walking towards each other", "walking apart from each other"], "model": {"type": "SkeletonGCN", "backbone": {"type": "STGCN", "in_channels": 3, "edge_importance_weighting": true, "graph_cfg": {"layout": "coco", "strategy": "spatial"}}, "cls_head": {"type": "STGCNHead", "num_classes": 60, "in_channels": 
256, "loss_cls": {"type": "CrossEntropyLoss"}}, "train_cfg": null, "test_cfg": null}, "dataset_type": "VideoDataset", "ann_file_train": "data/train.pkl", "ann_file_val": "data/val.pkl", "train_pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}], "val_pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}], "test_pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}], "data": {"imgs_per_gpu": 16, "workers_per_gpu": 2, "train": {"type": "VideoDataset", "data_source": {"type": "PoseDataSourceForVideoRec", "ann_file": "data/train.pkl", "data_prefix": ""}, "pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}]}, "val": {"type": "VideoDataset", "imgs_per_gpu": 1, "data_source": {"type": "PoseDataSourceForVideoRec", "ann_file": "data/val.pkl", "data_prefix": ""}, "pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}]}, "test": {"type": "VideoDataset", 
"data_source": {"type": "PoseDataSourceForVideoRec", "ann_file": "data/val.pkl", "data_prefix": ""}, "pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}]}}, "optimizer": {"type": "SGD", "lr": 1e-05, "momentum": 0.9, "weight_decay": 0.0001, "nesterov": true}, "lr_config": {"policy": "step", "step": [10, 50]}, "total_epochs": 1, "eval_config": {"initial": false, "interval": 1, "gpu_collect": true}, "eval_pipelines": [{"mode": "test", "data": {"type": "VideoDataset", "imgs_per_gpu": 1, "data_source": {"type": "PoseDataSourceForVideoRec", "ann_file": "data/val.pkl", "data_prefix": ""}, "pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}]}, "dist_eval": true, "evaluators": [{"type": "ClsEvaluator", "topk": [1, 5]}]}], "checkpoint_config": {"interval": 1}, "export": {"type": "raw"}, "work_dir": "work_dir/", "oss_work_dir": null, "gpus": 1}
2023-03-06 14:24:09,966 - easycv - INFO - GPU INFO : Tesla V100-SXM2-32GB
2023-03-06 14:24:09,967 - easycv - INFO - Set random seed to 1942453656, deterministic: False
/home/pai/lib/python3.6/site-packages/easycv/models/loss/cross_entropy_loss.py:273: UserWarning: Default ``avg_non_ignore`` is False, if you would like to ignore the certain label and average loss over non-ignore labels, which is the same with PyTorch official cross_entropy, set ``avg_non_ignore=True``.
  'Default ``avg_non_ignore`` is False, if you would like to '
SkeletonGCN(
  (backbone): STGCN(
    (data_bn): BatchNorm1d(51, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (st_gcn_networks): ModuleList(
      (0): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(3, 192, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(64, 64, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0))
          (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (relu): ReLU(inplace=True)
      )
      (1): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(64, 192, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(64, 64, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0))
          (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (relu): ReLU(inplace=True)
      )
      (2): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(64, 192, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(64, 64, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0))
          (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (relu): ReLU(inplace=True)
      )
      (3): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(64, 192, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(64, 64, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0))
          (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (relu): ReLU(inplace=True)
      )
      (4): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(128, 128, kernel_size=(9, 1), stride=(2, 1), padding=(4, 0))
          (3): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (residual): Sequential(
          (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 1))
          (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (relu): ReLU(inplace=True)
      )
      (5): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(128, 384, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(128, 128, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0))
          (3): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (relu): ReLU(inplace=True)
      )
      (6): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(128, 384, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(128, 128, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0))
          (3): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (relu): ReLU(inplace=True)
      )
      (7): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(128, 768, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(256, 256, kernel_size=(9, 1), stride=(2, 1), padding=(4, 0))
          (3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (residual): Sequential(
          (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 1))
          (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (relu): ReLU(inplace=True)
      )
      (8): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(256, 768, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(256, 256, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0))
          (3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (relu): ReLU(inplace=True)
      )
      (9): STGCNBlock(
        (gcn): ConvTemporalGraphical(
          (conv): Conv2d(256, 768, kernel_size=(1, 1), stride=(1, 1))
        )
        (tcn): Sequential(
          (0): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (1): ReLU(inplace=True)
          (2): Conv2d(256, 256, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0))
          (3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (4): Dropout(p=0, inplace=True)
        )
        (relu): ReLU(inplace=True)
      )
    )
    (edge_importance): ParameterList(
        (0): Parameter containing: [torch.FloatTensor of size 3x17x17]
        (1): Parameter containing: [torch.FloatTensor of size 3x17x17]
        (2): Parameter containing: [torch.FloatTensor of size 3x17x17]
        (3): Parameter containing: [torch.FloatTensor of size 3x17x17]
        (4): Parameter containing: [torch.FloatTensor of size 3x17x17]
        (5): Parameter containing: [torch.FloatTensor of size 3x17x17]
        (6): Parameter containing: [torch.FloatTensor of size 3x17x17]
        (7): Parameter containing: [torch.FloatTensor of size 3x17x17]
        (8): Parameter containing: [torch.FloatTensor of size 3x17x17]
        (9): Parameter containing: [torch.FloatTensor of size 3x17x17]
    )
  )
  (cls_head): STGCNHead(
    (loss_cls): CrossEntropyLoss(avg_non_ignore=False)
    (pool): AdaptiveAvgPool2d(output_size=(1, 1))
    (fc): Conv2d(256, 60, kernel_size=(1, 1), stride=(1, 1))
  )
)
data shuffle: True
2023-03-06 14:24:10,259 - easycv - INFO - 3000 videos remain after valid thresholding
GPU INFO :  Tesla V100-SXM2-32GB
2023-03-06 14:24:12,039 - easycv - INFO - open validate hook
2023-03-06 14:24:12,076 - easycv - INFO - 600 videos remain after valid thresholding
2023-03-06 14:24:12,077 - easycv - INFO - register EvaluationHook {'initial': False, 'evaluators': [<easycv.core.evaluation.classification_eval.ClsEvaluator object at 0x7fba8351f438>]}
2023-03-06 14:24:12,077 - easycv - INFO - load checkpoint from http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/video/skeleton_based/stgcn/stgcn_80e_ntu60_xsub.pth
load checkpoint from local path: /root/.cache/easycv/stgcn_80e_ntu60_xsub.pth
2023-03-06 14:24:12,106 - easycv - INFO - Start running, host: root@dsw-229591-7c4869bc45-tvdsm, work_dir: /mnt/workspace/work_dir
2023-03-06 14:24:12,106 - easycv - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH   ) StepLrUpdaterHook                  
(NORMAL      ) CheckpointHook                     
(NORMAL      ) EvalHook                           
(NORMAL      ) BestCkptSaverHook                  
(VERY_LOW    ) PreLoggerHook                      
(VERY_LOW    ) TextLoggerHook                     
 -------------------- 
before_train_epoch:
(VERY_HIGH   ) StepLrUpdaterHook                  
(LOW         ) IterTimerHook                      
(VERY_LOW    ) PreLoggerHook                      
(VERY_LOW    ) TextLoggerHook                     
 -------------------- 
before_train_iter:
(VERY_HIGH   ) StepLrUpdaterHook                  
(LOW         ) IterTimerHook                      
 -------------------- 
after_train_iter:
(ABOVE_NORMAL) OptimizerHook                      
(NORMAL      ) CheckpointHook                     
(LOW         ) IterTimerHook                      
(VERY_LOW    ) PreLoggerHook                      
(VERY_LOW    ) TextLoggerHook                     
 -------------------- 
after_train_epoch:
(NORMAL      ) CheckpointHook                     
(NORMAL      ) EvalHook                           
(NORMAL      ) BestCkptSaverHook                  
(VERY_LOW    ) PreLoggerHook                      
(VERY_LOW    ) TextLoggerHook                     
 -------------------- 
before_val_epoch:
(LOW         ) IterTimerHook                      
(VERY_LOW    ) PreLoggerHook                      
(VERY_LOW    ) TextLoggerHook                     
 -------------------- 
before_val_iter:
(LOW         ) IterTimerHook                      
 -------------------- 
after_val_iter:
(LOW         ) IterTimerHook                      
 -------------------- 
after_val_epoch:
(VERY_LOW    ) PreLoggerHook                      
(VERY_LOW    ) TextLoggerHook                     
 -------------------- 
after_run:
(VERY_LOW    ) TextLoggerHook                     
 -------------------- 
2023-03-06 14:24:12,108 - easycv - INFO - workflow: [('train', 1)], max: 1 epochs
2023-03-06 14:24:12,108 - easycv - INFO - Checkpoints will be saved to /mnt/workspace/work_dir by HardDiskBackend.
Cannot get the env variable of GPU_STATUS_FILE, no data report to scheduler. This is not an error. It is because the scheduler of the cluster did not enable this feature.
2023-03-06 14:24:16,489 - easycv - INFO - Epoch [1][20/187] lr: 1.000e-05, eta: 0:00:35, time: 0.215, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0031, loss: 0.0031
2023-03-06 14:24:19,366 - easycv - INFO - Epoch [1][40/187] lr: 1.000e-05, eta: 0:00:26, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0025, loss: 0.0025
2023-03-06 14:24:22,245 - easycv - INFO - Epoch [1][60/187] lr: 1.000e-05, eta: 0:00:21, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0042, loss: 0.0042
2023-03-06 14:24:25,124 - easycv - INFO - Epoch [1][80/187] lr: 1.000e-05, eta: 0:00:17, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0034, loss: 0.0034
2023-03-06 14:24:28,002 - easycv - INFO - Epoch [1][100/187]  lr: 1.000e-05, eta: 0:00:13, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0035, loss: 0.0035
2023-03-06 14:24:30,880 - easycv - INFO - Epoch [1][120/187]  lr: 1.000e-05, eta: 0:00:10, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0037, loss: 0.0037
2023-03-06 14:24:33,760 - easycv - INFO - Epoch [1][140/187]  lr: 1.000e-05, eta: 0:00:07, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0042, loss: 0.0042
2023-03-06 14:24:36,643 - easycv - INFO - Epoch [1][160/187]  lr: 1.000e-05, eta: 0:00:04, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0027, loss: 0.0027
2023-03-06 14:24:39,527 - easycv - INFO - Epoch [1][180/187]  lr: 1.000e-05, eta: 0:00:01, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0027, loss: 0.0027
2023-03-06 14:24:40,490 - easycv - INFO - Saving checkpoint at 1 epochs
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 600/600, 134.7 task/s, elapsed: 4s, ETA:     0s
2023-03-06 14:24:45,105 - easycv - INFO - SaveBest metric_name: ['ClsEvaluator_neck_top1']
2023-03-06 14:24:45,106 - easycv - INFO - End SaveBest metric
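
The per-iteration lines in the training log above follow a fixed layout, so the loss and accuracy values can be pulled out for plotting or comparison. The snippet below is a minimal sketch using only the standard library; `parse_train_log` and the regular expression are illustrative helpers written for this log format, not part of EasyCV.

```python
import re

# Pattern for the per-iteration lines shown in the training log above.
# This is an assumption based on the printed format, not an EasyCV API.
LOG_LINE = re.compile(
    r"Epoch \[(?P<epoch>\d+)\]\[(?P<iter>\d+)/(?P<total>\d+)\]\s+"
    r"lr: (?P<lr>[\d.e+-]+), .*?"
    r"top1_acc: (?P<top1>[\d.]+), top5_acc: (?P<top5>[\d.]+), "
    r"loss_cls: (?P<loss_cls>[\d.]+), loss: (?P<loss>[\d.]+)"
)


def parse_train_log(text):
    """Return one dict per training-iteration line found in `text`."""
    records = []
    for line in text.splitlines():
        m = LOG_LINE.search(line)
        if m is None:
            continue  # skip hook listings, checkpoint messages, etc.
        records.append({
            "epoch": int(m.group("epoch")),
            "iter": int(m.group("iter")),
            "total_iters": int(m.group("total")),
            "lr": float(m.group("lr")),
            "top1_acc": float(m.group("top1")),
            "top5_acc": float(m.group("top5")),
            "loss": float(m.group("loss")),
        })
    return records


# Two iteration lines and one unrelated line copied from the log above.
sample = """\
2023-03-06 14:24:16,489 - easycv - INFO - Epoch [1][20/187] lr: 1.000e-05, eta: 0:00:35, time: 0.215, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0031, loss: 0.0031
2023-03-06 14:24:28,002 - easycv - INFO - Epoch [1][100/187]  lr: 1.000e-05, eta: 0:00:13, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0035, loss: 0.0035
2023-03-06 14:24:40,490 - easycv - INFO - Saving checkpoint at 1 epochs
"""
records = parse_train_log(sample)
print(len(records), records[-1]["iter"], records[-1]["loss"])
```

In a notebook, the same function could be applied to the text of a saved log file in the work directory, if one is written there.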

Model export

# List the .pth checkpoint files produced by training
!ls  work_dir/*pth
work_dir/epoch_1.pth
!python -m easycv.tools.export  \
configs/video_recognition/stgcn/stgcn_80e_ntu60_xsub_keypoint.py \
work_dir/epoch_1.pth \
work_dir/video_stgcn.pt
[2023-03-06 14:26:12,711.711 dsw-229591-7c4869bc45-tvdsm:6113 INFO utils.py:30] NOTICE: PAIDEBUGGER is turned off.
configs/video_recognition/stgcn/stgcn_80e_ntu60_xsub_keypoint.py
/home/pai/lib/python3.6/site-packages/easycv/models/loss/cross_entropy_loss.py:273: UserWarning: Default ``avg_non_ignore`` is False, if you would like to ignore the certain label and average loss over non-ignore labels, which is the same with PyTorch official cross_entropy, set ``avg_non_ignore=True``.
  'Default ``avg_non_ignore`` is False, if you would like to '
load checkpoint from local path: work_dir/epoch_1.pth
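
When training runs for more than one epoch, the work directory holds several `epoch_N.pth` files and the export step needs the most recent one. The following is a minimal sketch of picking that file and assembling the same `python -m easycv.tools.export` call shown above. `latest_checkpoint` and `build_export_cmd` are hypothetical helper names, and the sketch assumes checkpoints keep the `epoch_N.pth` naming seen in the listing.

```python
import re
import sys
import tempfile
from pathlib import Path

# Config path taken from the export command above.
CONFIG = "configs/video_recognition/stgcn/stgcn_80e_ntu60_xsub_keypoint.py"


def latest_checkpoint(work_dir):
    """Return the epoch_N.pth file with the largest N, or None if there is none."""
    root = Path(work_dir)
    if not root.is_dir():
        return None
    best, best_epoch = None, -1
    for path in root.glob("epoch_*.pth"):
        m = re.fullmatch(r"epoch_(\d+)\.pth", path.name)
        # Compare epochs numerically so that epoch_10 ranks above epoch_2.
        if m and int(m.group(1)) > best_epoch:
            best, best_epoch = path, int(m.group(1))
    return best


def build_export_cmd(config, checkpoint, output):
    """Build the argument list for the export call (hypothetical helper)."""
    return [sys.executable, "-m", "easycv.tools.export",
            str(config), str(checkpoint), str(output)]


# Demonstration on a temporary directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("epoch_1.pth", "epoch_2.pth", "epoch_10.pth"):
        (Path(tmp) / name).touch()
    ckpt = latest_checkpoint(tmp)
    cmd = build_export_cmd(CONFIG, ckpt, Path(tmp) / "video_stgcn.pt")
    print(ckpt.name)  # prints epoch_10.pth
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would run the export from Python instead of a `!` shell line.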

End-to-end prediction

The prediction result is saved to output_video.mp4.

!python skeleton_based_demo.py \
--config=configs/video_recognition/stgcn/stgcn_80e_ntu60_xsub_keypoint.py \
--checkpoint=work_dir/video_stgcn.pt \
--det-config=configs/detection/yolox/yolox_s_8xb16_300e_coco.py \
--det-checkpoint=pretrained_models/yolox_s.pt \
--pose-config=configs/pose/hrnet_w48_coco_256x192_udp.py \
--pose-checkpoint=pretrained_models/pose_hrnet.pt \
--bbox-thr=0.9 \
--out_file=output_video.mp4
[2023-03-06 14:26:23,995.995 dsw-229591-7c4869bc45-tvdsm:6140 INFO utils.py:30] NOTICE: PAIDEBUGGER is turned off.
Download video file from remote to local path "{cache_video_path}"...
100%|██████████████████████████████████████| 1.07M/1.07M [00:00<00:00, 5.11MB/s]
load checkpoint from local path: pretrained_models/pose_hrnet.pt
load checkpoint from local path: work_dir/video_stgcn.pt
reparam: 0
load checkpoint from local path: pretrained_models/yolox_s.pt
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 1/1, 4.7 task/s, elapsed: 0s, ETA:     0s
action label: hugging other person
Moviepy - Building video output_video.mp4.
Moviepy - Writing video output_video.mp4
Moviepy - Done !                                                                
Moviepy - video ready output_video.mp4
Write video to output_video.mp4 successfully!
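
The demo prints the predicted class as an `action label:` line. Captured notebook output can interleave that line with the progress bar, because the bar redraws itself with carriage returns. The snippet below is a small sketch for recovering the label from captured text. `extract_action_label` is a hypothetical helper, and the pattern assumes the `action label: <name>` wording shown in the output.

```python
import re


def extract_action_label(stdout_text):
    """Return the text following 'action label:' in the demo output, or None."""
    # Progress bars rewrite the same line with '\r', so treat '\r' as a
    # line break before searching.
    for line in re.split(r"[\r\n]+", stdout_text):
        # Stop at a stray ']' left over from an overwritten progress bar.
        m = re.search(r"action label:\s*(.+?)\s*(?:\]|$)", line)
        if m:
            return m.group(1)
    return None


clean = "ETA:     0s\naction label: hugging other person\n"
interleaved = ("ETA:     0saction label: hugging other person"
               "                 ] 0/1, elapsed: 0s, ETA:")
print(extract_action_label(clean))
print(extract_action_label(interleaved))
```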

Visualize the prediction result:

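import cv2
from IPython.display import clear_output, Image, display

video_path = 'output_video.mp4'
video = cv2.VideoCapture(video_path)
try:
    while True:
        clear_output(wait=True)
        # Read the next frame; ret is False once the video is exhausted
        ret, frame = video.read()
        if not ret:
            break
        # Encode the frame as JPEG so the notebook can display it
        _, buf = cv2.imencode('.jpg', frame)
        display(Image(data=buf.tobytes()))
except KeyboardInterrupt:
    # Allow playback to be interrupted from the notebook
    pass
finally:
    # Release the capture whether playback finished or was interrupted
    video.release()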

image.png

In addition, all of the models above support PAI_Blade inference acceleration. For more models and features, see the open-source repository: https://github.com/alibaba/EasyCV
