Direct Use
Open the "EasyCV Keypoint-Based Video Classification" example and click "Open in DSW" in the upper-right corner.
Keypoint-Based Video Classification with EasyCV: STGCN
Human skeleton keypoints are essential for describing human posture and predicting human behavior, so skeleton keypoint detection underpins many computer vision tasks, such as action classification, abnormal-behavior detection, and autonomous driving. In recent years, advances in deep learning have steadily improved keypoint detection, and it is now widely used across computer vision, with applications concentrated in intelligent video surveillance, patient monitoring, human-computer interaction, virtual reality, human animation, smart homes, intelligent security, and athlete training assistance.
This tutorial introduces a skeleton-keypoint-based action classification solution and provides end-to-end guidance for rapid development with EasyCV on PAI-DSW.
Runtime Requirements
PAI-PyTorch 1.7/1.8 image; GPU types: P100, V100, A100, etc.
Install Dependencies
Note: in the PAI-DSW Docker environment the required dependencies are already installed and step 1 can be skipped; in a local notebook environment, run steps 1 and 2 to set up the environment.
1. Get the torch and CUDA versions, adjust the mmcv install command to match them, and install the corresponding mmcv build.
import torch
import os

os.environ['CUDA'] = 'cu' + torch.version.cuda.replace('.', '')
os.environ['Torch'] = 'torch' + torch.version.__version__.replace('+PAI', '').split('+')[0]
!echo $CUDA
!echo $Torch
# install some python deps
!pip install --upgrade tqdm
!pip install mmcv-full==1.6.0 -f https://download.openmmlab.com/mmcv/dist/${CUDA}/${Torch}/index.html
2. Install the EasyCV package
# quote the requirement so the shell does not treat >= as a redirect
!pip install "pai-easycv>=0.10.0" -i https://pypi.tuna.tsinghua.edu.cn/simple
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Quick Start
You can run the code below to try out the model end to end.
First, copy the code from the skeleton_based_demo.py script and save it locally as skeleton_based_demo.py.
!sudo apt-get update
!sudo apt-get install libgeos-dev
!pip install moviepy
!python skeleton_based_demo.py
Hit:1 http://mirrors.aliyun.com/ubuntu bionic InRelease Hit:2 http://mirrors.aliyun.com/ubuntu bionic-security InRelease Hit:3 http://mirrors.aliyun.com/ubuntu bionic-updates InRelease Hit:4 http://mirrors.aliyun.com/ubuntu bionic-proposed InRelease Hit:5 http://mirrors.aliyun.com/ubuntu bionic-backports InRelease Reading package lists... Done Reading package lists... Done Building dependency tree Reading state information... Done libgeos-dev is already the newest version (3.6.2-1build2). 0 upgraded, 0 newly installed, 0 to remove and 144 not upgraded. Looking in indexes: https://mirrors.aliyun.com/pypi/simple/ Requirement already satisfied: moviepy in /home/pai/lib/python3.6/site-packages (1.0.3) Requirement already satisfied: decorator<5.0,>=4.0.2 in /home/pai/lib/python3.6/site-packages (from moviepy) (4.4.2) Requirement already satisfied: tqdm<5.0,>=4.11.2 in /home/pai/lib/python3.6/site-packages (from moviepy) (4.64.1) Requirement already satisfied: proglog<=1.0.0 in /home/pai/lib/python3.6/site-packages (from moviepy) (0.1.10) Requirement already satisfied: numpy>=1.17.3 in /home/pai/lib/python3.6/site-packages (from moviepy) (1.19.5) Requirement already satisfied: requests<3.0,>=2.8.1 in /home/pai/lib/python3.6/site-packages (from moviepy) (2.27.1) Requirement already satisfied: imageio<3.0,>=2.5 in /home/pai/lib/python3.6/site-packages (from moviepy) (2.9.0) Requirement already satisfied: imageio-ffmpeg>=0.2.0 in /home/pai/lib/python3.6/site-packages (from moviepy) (0.4.8) Requirement already satisfied: pillow in /home/pai/lib/python3.6/site-packages (from imageio<3.0,>=2.5->moviepy) (8.3.2) Requirement already satisfied: idna<4,>=2.5 in /home/pai/lib/python3.6/site-packages (from requests<3.0,>=2.8.1->moviepy) (3.3) Requirement already satisfied: charset-normalizer~=2.0.0 in /home/pai/lib/python3.6/site-packages (from requests<3.0,>=2.8.1->moviepy) (2.0.4) Requirement already satisfied: certifi>=2017.4.17 in /home/pai/lib/python3.6/site-packages (from requests<3.0,>=2.8.1->moviepy) (2021.5.30) Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/pai/lib/python3.6/site-packages (from requests<3.0,>=2.8.1->moviepy) (1.26.8) Requirement already satisfied: importlib-resources in /home/pai/lib/python3.6/site-packages (from tqdm<5.0,>=4.11.2->moviepy) (5.4.0) Requirement already satisfied: zipp>=3.1.0 in /home/pai/lib/python3.6/site-packages (from importlib-resources->tqdm<5.0,>=4.11.2->moviepy) (3.6.0) WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv [2023-03-06 14:06:35,521.521 dsw-229591-7c4869bc45-tvdsm:4953 INFO utils.py:30] NOTICE: PAIDEBUGGER is turned off. Download video file from remote to local path "{cache_video_path}"... 
100%|██████████████████████████████████████| 1.07M/1.07M [00:00<00:00, 5.08MB/s] 100%|████████████████████████████████████████| 243M/243M [00:22<00:00, 11.5MB/s] load checkpoint from local path: /root/.cache/easycv/pose_hrnet_epoch_210_export.pt 100%|██████████████████████████████████████| 11.9M/11.9M [00:00<00:00, 16.4MB/s] load checkpoint from local path: /root/.cache/easycv/stgcn_80e_ntu60_xsub.pth reparam: 0 100%|██████████████████████████████████████| 34.5M/34.5M [00:02<00:00, 13.6MB/s] load checkpoint from local path: /root/.cache/easycv/epoch_300.pt [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 1/1, 3.4 task/s, elapsed: 0s, ETA: 0saction label: hugging other person ] 0/1, elapsed: 0s, ETA:[ ] 0/1, elapsed: 0s, ETA: Moviepy - Building video ./tmp/demo_show.mp4. Moviepy - Writing video ./tmp/demo_show.mp4 Moviepy - Done ! Moviepy - video ready ./tmp/demo_show.mp4 Write video to ./tmp/demo_show.mp4 successfully!
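For reference, the demo wires together the three models pulled down in the log above: a YOLOX human detector, an HRNet pose model, and an STGCN skeleton classifier. Below is a minimal sketch of the per-frame pose-extraction loop, assuming the predictor classes introduced later in this tutorial also accept in-memory frames; demo.mp4 is a hypothetical input path, and the real script additionally handles video download, the STGCN classification step, and rendering.

import cv2
from easycv.predictors import PoseTopDownPredictor

# Pose predictor with a built-in human detector; the model paths are the
# cached checkpoints shown in the download log above.
pose_predictor = PoseTopDownPredictor(
    model_path='/root/.cache/easycv/pose_hrnet_epoch_210_export.pt',
    detection_predictor_config=dict(
        type='YoloXPredictor',
        model_path='/root/.cache/easycv/epoch_300.pt',
    ),
    bbox_thr=0.9,
    cat_id=0,  # person category id
)

video = cv2.VideoCapture('demo.mp4')  # hypothetical local input video
frame_results = []
while True:
    ok, frame = video.read()
    if not ok:
        break
    # Each result holds per-person keypoints of shape (num_person, 17, 3).
    frame_results.append(pose_predictor(frame)[0])
video.release()
# The stacked keypoints are then fed to the STGCN classifier, which emits an
# action label such as the "hugging other person" seen in the log above.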
Visualize the video result:
import cv2
from IPython.display import clear_output, Image, display

video_path = 'tmp/demo_show.mp4'
video = cv2.VideoCapture(video_path)
try:
    while True:
        clear_output(wait=True)
        # read the next frame
        ret, frame = video.read()
        if not ret:
            break
        _, ret = cv2.imencode('.jpg', frame)
        display(Image(data=ret))
except KeyboardInterrupt:
    pass
finally:
    # release the capture whether playback finished or was interrupted
    video.release()
Development Workflow
Detection Model Development
We use a prepared detection model for the demo. If you want to retrain the detection model, see the example: https://pai.console.aliyun.com/?regionId=cn-hangzhou#/dsw-gallery/preview/deepLearning/cv/easycv_detection_YOLOX
Download the model
!mkdir pretrained_models
!wget http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/detection/yolox/yolox_s_bs16_lr002/epoch_300.pt -O pretrained_models/yolox_s.pt
Will not apply HSTS. The HSTS database must be a regular and non-world-writable file. ERROR: could not open HSTS store at '/root/.wget-hsts'. HSTS will be disabled. --2023-03-06 14:09:56-- http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/detection/yolox/yolox_s_bs16_lr002/epoch_300.pt Resolving pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com... 39.98.20.13 Connecting to pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com|39.98.20.13|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 36133977 (34M) [application/octet-stream] Saving to: ‘pretrained_models/yolox_s.pt’ pretrained_models/y 100%[===================>] 34.46M 13.2MB/s in 2.6s 2023-03-06 14:09:58 (13.2 MB/s) - ‘pretrained_models/yolox_s.pt’ saved [36133977/36133977]
Model Inference & Visualization (Optional)
Get the inference results:
from easycv.predictors import YoloXPredictor

det_predictor = YoloXPredictor(model_path='pretrained_models/yolox_s.pt', score_thresh=0.9)
img = 'http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/demos/images/two_person.jpg'
results = det_predictor(img)[0]
print(f'Detection results: {results}')
[2023-03-06 14:11:09,474.474 dsw-229591-7c4869bc45-tvdsm:4613 INFO utils.py:30] NOTICE: PAIDEBUGGER is turned off.
reparam: 0 load checkpoint from local path: pretrained_models/yolox_s.pt [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 1/1, 0.7 task/s, elapsed: 1s, ETA: 0sDetection results: {'detection_boxes': array([[393.29456 , 106.736565, 514.5323 , 381.98608 ], [300.8314 , 125.51828 , 398.19537 , 411.52783 ]], dtype=float32), 'detection_scores': array([0.9170087, 0.9002314], dtype=float32), 'detection_classes': array([0, 0], dtype=int32), 'img_metas': {'filename': 'http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/demos/images/two_person.jpg', 'ori_img_shape': (480, 853, 3), 'img_shape': (360, 640, 3), 'scale_factor': array([0.7502931, 0.75 , 0.7502931, 0.75 ], dtype=float32), 'pad': 0.0, 'img_norm_cfg': {'mean': array([123.675, 116.28 , 103.53 ], dtype=float32), 'std': array([58.395, 57.12 , 57.375], dtype=float32), 'to_rgb': True}}, 'detection_class_names': ['person', 'person'], 'ori_img_shape': [480, 853]}
Visualize the inference results:
import cv2
from IPython.display import clear_output, Image, display
from easycv.file.image import load_image

img_np = load_image(img)
detection_boxes = results['detection_boxes']
for box in detection_boxes:
    left_top = (int(box[0]), int(box[1]))
    right_bottom = (int(box[2]), int(box[3]))
    cv2.rectangle(img_np, left_top, right_bottom, (0, 255, 0), thickness=1)

_, ret = cv2.imencode('.jpg', img_np)
display(Image(data=ret))
Keypoint Model Development
We use a prepared keypoint model for the demo. If you want to retrain the keypoint model, see the example: https://pai.console.aliyun.com/?regionId=cn-hangzhou#/dsw-gallery/preview/deepLearning/cv/easycv_pose_topdown_hrnet
Download the model
!wget http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/pose/top_down_hrnet/pose_hrnet_epoch_210_export.pt -O pretrained_models/pose_hrnet.pt
Will not apply HSTS. The HSTS database must be a regular and non-world-writable file. ERROR: could not open HSTS store at '/root/.wget-hsts'. HSTS will be disabled. --2023-03-06 14:12:14-- http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/pose/top_down_hrnet/pose_hrnet_epoch_210_export.pt Resolving pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com... 39.98.20.13 Connecting to pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com|39.98.20.13|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 255086694 (243M) [application/octet-stream] Saving to: ‘pretrained_models/pose_hrnet.pt’ pretrained_models/p 100%[===================>] 243.27M 11.8MB/s in 20s 2023-03-06 14:12:34 (12.1 MB/s) - ‘pretrained_models/pose_hrnet.pt’ saved [255086694/255086694]
Model Inference & Visualization (Optional)
Get the inference results:
from easycv.predictors import PoseTopDownPredictor

pose_predictor = PoseTopDownPredictor(
    model_path='pretrained_models/pose_hrnet.pt',
    detection_predictor_config=dict(
        type='YoloXPredictor',
        model_path='pretrained_models/yolox_s.pt',
    ),
    bbox_thr=0.9,
    cat_id=0,  # person category id
)
img = 'http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/demos/images/two_person.jpg'
results = pose_predictor(img)[0]
print(f'Pose(Keypoints) results: {results}')
load checkpoint from local path: pretrained_models/pose_hrnet.pt reparam: 0 load checkpoint from local path: pretrained_models/yolox_s.pt [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 1/1, 3.1 task/s, elapsed: 0s, ETA: 0s ] 0/1, elapsed: 0s, ETA:Pose(Keypoints) results: {'keypoints': array([[[446.52234 , 139.24068 , 0.9669522 ], [452.1864 , 134.07887 , 0.9656511 ], [445.32617 , 133.5937 , 0.9783832 ], [469.97046 , 136.4598 , 0.96018004], [448.53107 , 135.00523 , 0.6578802 ], [484.84204 , 174.96892 , 0.9387574 ], [437.69775 , 167.76198 , 0.91668504], [497.69324 , 218.9604 , 0.90464604], [419.41394 , 205.68068 , 0.8883008 ], [500.0896 , 258.17426 , 0.94137996], [403.52203 , 234.57295 , 0.8803861 ], [464.09875 , 259.55505 , 0.8396832 ], [431.8305 , 256.75336 , 0.8154782 ], [461.54712 , 318.659 , 0.92906725], [430.77502 , 316.80615 , 0.9279456 ], [460.7788 , 363.59674 , 0.89747685], [428.1001 , 363.88953 , 0.8918172 ]], [[358.0172 , 155.05956 , 0.97038084], [355.98962 , 149.00755 , 0.92078936], [352.4453 , 149.90215 , 0.97667825], [340.4132 , 152.0797 , 0.606572 ], [334.53314 , 154.22981 , 0.9548439 ], [348.04376 , 183.88353 , 0.90829027], [322.1966 , 191.77852 , 0.9267349 ], [351.15045 , 232.45781 , 0.89447075], [330.02106 , 242.5197 , 0.90907884], [369.1089 , 261.89404 , 0.8053842 ], [350.1405 , 280.6457 , 0.82891107], [353.85364 , 267.27197 , 0.7821094 ], [330.16003 , 274.84985 , 0.8436185 ], [360.4154 , 323.65887 , 0.88485605], [320.79358 , 332.54346 , 0.9094477 ], [370.22504 , 369.16022 , 0.941416 ], [314.28442 , 381.05402 , 0.9005176 ]]], dtype=float32), 'bbox': array([[350.60413 , 106.105804, 557.2877 , 381.6839 , 1. ], [242.01334 , 124.487015, 457.23285 , 411.44635 , 1. ]], dtype=float32)}
Visualize the inference results:
import cv2
from IPython.display import clear_output, Image, display
from easycv.file.image import load_image

img_np = load_image(img)
show_img = pose_predictor.show_result(img_np, results)
_, ret = cv2.imencode('.jpg', show_img)
display(Image(data=ret))
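The per-image pose results above are exactly what the skeleton classifier consumes, one frame at a time. Below is a minimal sketch of packing frame-level keypoints into clip-level arrays, using the (x, y, score) layout shown in the pose output. The names keypoint/keypoint_score follow common skeleton-dataset conventions and are an assumption here; the exact tensor layout fed to STGCN is produced by the FormatGCNInput step in the training config.

import numpy as np

# Pretend we ran pose_predictor on several frames; here we reuse the single
# result from above to keep the sketch self-contained.
frame_results = [results, results]

num_frame = len(frame_results)
num_person = max(r['keypoints'].shape[0] for r in frame_results)

# (num_person, num_frame, 17, 3): (x, y, score) for each of 17 COCO joints.
clip = np.zeros((num_person, num_frame, 17, 3), dtype=np.float32)
for t, r in enumerate(frame_results):
    kpts = r['keypoints']
    clip[:kpts.shape[0], t] = kpts

keypoint, keypoint_score = clip[..., :2], clip[..., 2]
print(keypoint.shape, keypoint_score.shape)  # (2, 2, 17, 2) (2, 2, 17)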
Video Classification Model Development
Data Preparation
We provide a small video keypoint dataset for testing.
!mkdir data
# download the training set
!wget http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/demos/datasets/skeleton_dataset/ntu60_xsub_train_3000_samples.pkl -O data/train.pkl
# download the test set
!wget http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/demos/datasets/skeleton_dataset/ntu60_xsub_val_600_samples.pkl -O data/val.pkl
Will not apply HSTS. The HSTS database must be a regular and non-world-writable file. ERROR: could not open HSTS store at '/root/.wget-hsts'. HSTS will be disabled. --2023-03-06 14:13:44-- http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/demos/datasets/skeleton_dataset/ntu60_xsub_train_3000_samples.pkl Resolving pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com... 39.98.20.13 Connecting to pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com|39.98.20.13|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 36143216 (34M) [application/octet-stream] Saving to: ‘data/train.pkl’ data/train.pkl 100%[===================>] 34.47M 10.8MB/s in 3.2s 2023-03-06 14:13:47 (10.8 MB/s) - ‘data/train.pkl’ saved [36143216/36143216] Will not apply HSTS. The HSTS database must be a regular and non-world-writable file. ERROR: could not open HSTS store at '/root/.wget-hsts'. HSTS will be disabled. --2023-03-06 14:13:48-- http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/demos/datasets/skeleton_dataset/ntu60_xsub_val_600_samples.pkl Resolving pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com... 39.98.20.13 Connecting to pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com|39.98.20.13|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 7748162 (7.4M) [application/octet-stream] Saving to: ‘data/val.pkl’ data/val.pkl 100%[===================>] 7.39M 14.5MB/s in 0.5s 2023-03-06 14:13:48 (14.5 MB/s) - ‘data/val.pkl’ saved [7748162/7748162]
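Optionally, you can peek at what an annotation file contains before training. The sketch below only assumes the pickles deserialize with the standard library; the exact structure (for example, a list of per-clip dicts with fields such as keypoint and label, in NTU60 style) is an assumption to verify against the printed output.

import pickle

with open('data/train.pkl', 'rb') as f:
    annos = pickle.load(f)

print(type(annos), len(annos))
# Inspect one sample; field names and shapes depend on the dataset format.
sample = annos[0]
for key, value in sample.items():
    shape = getattr(value, 'shape', None)
    print(key, shape if shape is not None else value)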
Train the Model
To verify functionality quickly, we reduce the number of epochs and the learning rate and load a pretrained model, so that results are generated fast. If you train on custom data, these parameters will need to be adjusted.
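For reference, here are the defaults that the --user_config_params overrides below replace, copied from the config dump echoed in the training log:

# Relevant defaults in stgcn_80e_ntu60_xsub_keypoint.py:
ann_file_train = 'data/posec3d/ntu60_xsub_train.pkl'   # -> data/train.pkl
ann_file_val = 'data/posec3d/ntu60_xsub_val.pkl'       # -> data/val.pkl
optimizer = dict(type='SGD', lr=0.1, momentum=0.9,
                 weight_decay=0.0001, nesterov=True)   # lr -> 0.00001
log_config = dict(interval=100,
                  hooks=[dict(type='TextLoggerHook')])  # interval -> 20
total_epochs = 80                                       # -> 1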
!python -m easycv.tools.train \
    configs/video_recognition/stgcn/stgcn_80e_ntu60_xsub_keypoint.py \
    --work_dir work_dir/ \
    --load_from http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/video/skeleton_based/stgcn/stgcn_80e_ntu60_xsub.pth \
    --user_config_params ann_file_train=data/train.pkl ann_file_val=data/val.pkl optimizer.lr=0.00001 log_config.interval=20 total_epochs=1
[2023-03-06 14:24:06,509.509 dsw-229591-7c4869bc45-tvdsm:5999 INFO utils.py:30] NOTICE: PAIDEBUGGER is turned off. /home/pai/lib/python3.6/site-packages/easycv/utils/setup_env.py:37: UserWarning: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. f'Setting OMP_NUM_THREADS environment variable for each process ' /home/pai/lib/python3.6/site-packages/easycv/utils/setup_env.py:47: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. f'Setting MKL_NUM_THREADS environment variable for each process ' 2023-03-06 14:24:09,964 - easycv - INFO - Environment info: ------------------------------------------------------------ sys.platform: linux Python: 3.6.12 |Anaconda, Inc.| (default, Sep 8 2020, 23:10:56) [GCC 7.3.0] CUDA available: True CUDA_HOME: /usr/local/cuda NVCC: Cuda compilation tools, release 10.1, V10.1.243 GPU 0: Tesla V100-SXM2-32GB GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 PyTorch: 1.8.2+PAI PyTorch compiling details: PyTorch built with: - GCC 7.5 - C++ Version: 201402 - Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications - Intel(R) MKL-DNN v1.7.0 (Git Hash N/A) - OpenMP 201511 (a.k.a. OpenMP 4.5) - NNPACK is enabled - CPU capability usage: AVX2 - CUDA Runtime 10.1 - NVCC architecture flags: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_75,code=compute_75 - CuDNN 7.6.5 - Magma 2.5.2 - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.1, CUDNN_VERSION=7.6.5, CXX_COMPILER=/usr/lib/ccache/c++, CXX_FLAGS=-D_GLIBCXX_USE_CXX11_ABI=0 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.2, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, TorchVision: 0.9.2+cu101 OpenCV: 4.4.0 MMCV: 1.6.0 EasyCV: 0.10.0 ------------------------------------------------------------ 2023-03-06 14:24:09,965 - easycv - INFO - Distributed training: False 2023-03-06 14:24:09,965 - easycv - INFO - Config: /home/pai/lib/python3.6/site-packages/easycv/configs/base.py train_cfg = {} test_cfg = {} optimizer_config = dict() # grad_clip, coalesce, bucket_size_mb # yapf:disable log_config = 
dict( interval=50, hooks=[ dict(type='TextLoggerHook'), # dict(type='TensorboardLoggerHook') ]) # yapf:enable # runtime settings dist_params = dict(backend='nccl') cudnn_benchmark = False log_level = 'INFO' load_from = None resume_from = None workflow = [('train', 1)] /home/pai/lib/python3.6/site-packages/easycv/configs/video_recognition/stgcn/stgcn_80e_ntu60_xsub_keypoint.py _base_ = 'configs/base.py' CLASSES = [ 'drink water', 'eat meal/snack', 'brushing teeth', 'brushing hair', 'drop', 'pickup', 'throw', 'sitting down', 'standing up (from sitting position)', 'clapping', 'reading', 'writing', 'tear up paper', 'wear jacket', 'take off jacket', 'wear a shoe', 'take off a shoe', 'wear on glasses', 'take off glasses', 'put on a hat/cap', 'take off a hat/cap', 'cheer up', 'hand waving', 'kicking something', 'reach into pocket', 'hopping (one foot jumping)', 'jump up', 'make a phone call/answer phone', 'playing with phone/tablet', 'typing on a keyboard', 'pointing to something with finger', 'taking a selfie', 'check time (from watch)', 'rub two hands together', 'nod head/bow', 'shake head', 'wipe face', 'salute', 'put the palms together', 'cross hands in front (say stop)', 'sneeze/cough', 'staggering', 'falling', 'touch head (headache)', 'touch chest (stomachache/heart pain)', 'touch back (backache)', 'touch neck (neckache)', 'nausea or vomiting condition', 'use a fan (with hand or paper)/feeling warm', 'punching/slapping other person', 'kicking other person', 'pushing other person', 'pat on back of other person', 'point finger at the other person', 'hugging other person', 'giving something to other person', "touch other person's pocket", 'handshaking', 'walking towards each other', 'walking apart from each other' ] model = dict( type='SkeletonGCN', backbone=dict( type='STGCN', in_channels=3, edge_importance_weighting=True, graph_cfg=dict(layout='coco', strategy='spatial')), cls_head=dict( type='STGCNHead', num_classes=60, in_channels=256, loss_cls=dict(type='CrossEntropyLoss')), train_cfg=None, test_cfg=None) dataset_type = 'VideoDataset' ann_file_train = 'data/posec3d/ntu60_xsub_train.pkl' ann_file_val = 'data/posec3d/ntu60_xsub_val.pkl' train_pipeline = [ dict(type='PaddingWithLoop', clip_len=300), dict(type='PoseDecode'), dict(type='FormatGCNInput', input_format='NCTVM'), dict(type='PoseNormalize'), dict(type='Collect', keys=['keypoint', 'label'], meta_keys=[]), dict(type='VideoToTensor', keys=['keypoint']) ] val_pipeline = [ dict(type='PaddingWithLoop', clip_len=300), dict(type='PoseDecode'), dict(type='FormatGCNInput', input_format='NCTVM'), dict(type='PoseNormalize'), dict(type='Collect', keys=['keypoint', 'label'], meta_keys=[]), dict(type='VideoToTensor', keys=['keypoint']) ] test_pipeline = [ dict(type='PaddingWithLoop', clip_len=300), dict(type='PoseDecode'), dict(type='FormatGCNInput', input_format='NCTVM'), dict(type='PoseNormalize'), dict(type='Collect', keys=['keypoint', 'label'], meta_keys=[]), dict(type='VideoToTensor', keys=['keypoint']) ] data = dict( imgs_per_gpu=16, workers_per_gpu=2, train=dict( type=dataset_type, data_source=dict( type='PoseDataSourceForVideoRec', ann_file=ann_file_train, data_prefix='', ), pipeline=train_pipeline), val=dict( type=dataset_type, imgs_per_gpu=1, data_source=dict( type='PoseDataSourceForVideoRec', ann_file=ann_file_val, data_prefix='', ), pipeline=val_pipeline), test=dict( type=dataset_type, data_source=dict( type='PoseDataSourceForVideoRec', ann_file=ann_file_val, data_prefix='', ), pipeline=test_pipeline)) # optimizer optimizer = dict( 
type='SGD', lr=0.1, momentum=0.9, weight_decay=0.0001, nesterov=True) optimizer_config = dict(grad_clip=None) # learning policy lr_config = dict(policy='step', step=[10, 50]) total_epochs = 80 # eval eval_config = dict(initial=False, interval=1, gpu_collect=True) eval_pipelines = [ dict( mode='test', data=data['val'], dist_eval=True, evaluators=[dict(type='ClsEvaluator', topk=(1, 5))], ) ] log_config = dict(interval=100, hooks=[dict(type='TextLoggerHook')]) checkpoint_config = dict(interval=1) export = dict(type='raw') # export = dict(type='jit') # export = dict( # type='blade', # blade_config=dict( # enable_fp16=True, # fp16_fallback_op_ratio=0.0, # customize_op_black_list=[ # 'aten::select', 'aten::index', 'aten::slice', 'aten::view', # 'aten::upsample', 'aten::clamp', 'aten::clone' # ])) 2023-03-06 14:24:09,965 - easycv - INFO - Config Dict: {"train_cfg": {}, "test_cfg": {}, "optimizer_config": {"grad_clip": null}, "log_config": {"interval": 20, "hooks": [{"type": "TextLoggerHook"}]}, "dist_params": {"backend": "nccl"}, "cudnn_benchmark": false, "log_level": "INFO", "load_from": "http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/video/skeleton_based/stgcn/stgcn_80e_ntu60_xsub.pth", "resume_from": null, "workflow": [["train", 1]], "CLASSES": ["drink water", "eat meal/snack", "brushing teeth", "brushing hair", "drop", "pickup", "throw", "sitting down", "standing up (from sitting position)", "clapping", "reading", "writing", "tear up paper", "wear jacket", "take off jacket", "wear a shoe", "take off a shoe", "wear on glasses", "take off glasses", "put on a hat/cap", "take off a hat/cap", "cheer up", "hand waving", "kicking something", "reach into pocket", "hopping (one foot jumping)", "jump up", "make a phone call/answer phone", "playing with phone/tablet", "typing on a keyboard", "pointing to something with finger", "taking a selfie", "check time (from watch)", "rub two hands together", "nod head/bow", "shake head", "wipe face", "salute", "put the palms together", "cross hands in front (say stop)", "sneeze/cough", "staggering", "falling", "touch head (headache)", "touch chest (stomachache/heart pain)", "touch back (backache)", "touch neck (neckache)", "nausea or vomiting condition", "use a fan (with hand or paper)/feeling warm", "punching/slapping other person", "kicking other person", "pushing other person", "pat on back of other person", "point finger at the other person", "hugging other person", "giving something to other person", "touch other person's pocket", "handshaking", "walking towards each other", "walking apart from each other"], "model": {"type": "SkeletonGCN", "backbone": {"type": "STGCN", "in_channels": 3, "edge_importance_weighting": true, "graph_cfg": {"layout": "coco", "strategy": "spatial"}}, "cls_head": {"type": "STGCNHead", "num_classes": 60, "in_channels": 256, "loss_cls": {"type": "CrossEntropyLoss"}}, "train_cfg": null, "test_cfg": null}, "dataset_type": "VideoDataset", "ann_file_train": "data/train.pkl", "ann_file_val": "data/val.pkl", "train_pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}], "val_pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], 
"meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}], "test_pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}], "data": {"imgs_per_gpu": 16, "workers_per_gpu": 2, "train": {"type": "VideoDataset", "data_source": {"type": "PoseDataSourceForVideoRec", "ann_file": "data/train.pkl", "data_prefix": ""}, "pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}]}, "val": {"type": "VideoDataset", "imgs_per_gpu": 1, "data_source": {"type": "PoseDataSourceForVideoRec", "ann_file": "data/val.pkl", "data_prefix": ""}, "pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}]}, "test": {"type": "VideoDataset", "data_source": {"type": "PoseDataSourceForVideoRec", "ann_file": "data/val.pkl", "data_prefix": ""}, "pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}]}}, "optimizer": {"type": "SGD", "lr": 1e-05, "momentum": 0.9, "weight_decay": 0.0001, "nesterov": true}, "lr_config": {"policy": "step", "step": [10, 50]}, "total_epochs": 1, "eval_config": {"initial": false, "interval": 1, "gpu_collect": true}, "eval_pipelines": [{"mode": "test", "data": {"type": "VideoDataset", "imgs_per_gpu": 1, "data_source": {"type": "PoseDataSourceForVideoRec", "ann_file": "data/val.pkl", "data_prefix": ""}, "pipeline": [{"type": "PaddingWithLoop", "clip_len": 300}, {"type": "PoseDecode"}, {"type": "FormatGCNInput", "input_format": "NCTVM"}, {"type": "PoseNormalize"}, {"type": "Collect", "keys": ["keypoint", "label"], "meta_keys": []}, {"type": "VideoToTensor", "keys": ["keypoint"]}]}, "dist_eval": true, "evaluators": [{"type": "ClsEvaluator", "topk": [1, 5]}]}], "checkpoint_config": {"interval": 1}, "export": {"type": "raw"}, "work_dir": "work_dir/", "oss_work_dir": null, "gpus": 1} 2023-03-06 14:24:09,966 - easycv - INFO - GPU INFO : Tesla V100-SXM2-32GB 2023-03-06 14:24:09,967 - easycv - INFO - Set random seed to 1942453656, deterministic: False /home/pai/lib/python3.6/site-packages/easycv/models/loss/cross_entropy_loss.py:273: UserWarning: Default ``avg_non_ignore`` is False, if you would like to ignore the certain label and average loss over non-ignore labels, which is the same with PyTorch official cross_entropy, set ``avg_non_ignore=True``. 
'Default ``avg_non_ignore`` is False, if you would like to ' SkeletonGCN( (backbone): STGCN( (data_bn): BatchNorm1d(51, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (st_gcn_networks): ModuleList( (0): STGCNBlock( (gcn): ConvTemporalGraphical( (conv): Conv2d(3, 192, kernel_size=(1, 1), stride=(1, 1)) ) (tcn): Sequential( (0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (1): ReLU(inplace=True) (2): Conv2d(64, 64, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0)) (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (4): Dropout(p=0, inplace=True) ) (relu): ReLU(inplace=True) ) (1): STGCNBlock( (gcn): ConvTemporalGraphical( (conv): Conv2d(64, 192, kernel_size=(1, 1), stride=(1, 1)) ) (tcn): Sequential( (0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (1): ReLU(inplace=True) (2): Conv2d(64, 64, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0)) (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (4): Dropout(p=0, inplace=True) ) (relu): ReLU(inplace=True) ) (2): STGCNBlock( (gcn): ConvTemporalGraphical( (conv): Conv2d(64, 192, kernel_size=(1, 1), stride=(1, 1)) ) (tcn): Sequential( (0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (1): ReLU(inplace=True) (2): Conv2d(64, 64, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0)) (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (4): Dropout(p=0, inplace=True) ) (relu): ReLU(inplace=True) ) (3): STGCNBlock( (gcn): ConvTemporalGraphical( (conv): Conv2d(64, 192, kernel_size=(1, 1), stride=(1, 1)) ) (tcn): Sequential( (0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (1): ReLU(inplace=True) (2): Conv2d(64, 64, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0)) (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (4): Dropout(p=0, inplace=True) ) (relu): ReLU(inplace=True) ) (4): STGCNBlock( (gcn): ConvTemporalGraphical( (conv): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1)) ) (tcn): Sequential( (0): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (1): ReLU(inplace=True) (2): Conv2d(128, 128, kernel_size=(9, 1), stride=(2, 1), padding=(4, 0)) (3): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (4): Dropout(p=0, inplace=True) ) (residual): Sequential( (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 1)) (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (relu): ReLU(inplace=True) ) (5): STGCNBlock( (gcn): ConvTemporalGraphical( (conv): Conv2d(128, 384, kernel_size=(1, 1), stride=(1, 1)) ) (tcn): Sequential( (0): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (1): ReLU(inplace=True) (2): Conv2d(128, 128, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0)) (3): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (4): Dropout(p=0, inplace=True) ) (relu): ReLU(inplace=True) ) (6): STGCNBlock( (gcn): ConvTemporalGraphical( (conv): Conv2d(128, 384, kernel_size=(1, 1), stride=(1, 1)) ) (tcn): Sequential( (0): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (1): ReLU(inplace=True) (2): Conv2d(128, 128, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0)) (3): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (4): Dropout(p=0, inplace=True) ) 
(relu): ReLU(inplace=True) ) (7): STGCNBlock( (gcn): ConvTemporalGraphical( (conv): Conv2d(128, 768, kernel_size=(1, 1), stride=(1, 1)) ) (tcn): Sequential( (0): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (1): ReLU(inplace=True) (2): Conv2d(256, 256, kernel_size=(9, 1), stride=(2, 1), padding=(4, 0)) (3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (4): Dropout(p=0, inplace=True) ) (residual): Sequential( (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 1)) (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) (relu): ReLU(inplace=True) ) (8): STGCNBlock( (gcn): ConvTemporalGraphical( (conv): Conv2d(256, 768, kernel_size=(1, 1), stride=(1, 1)) ) (tcn): Sequential( (0): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (1): ReLU(inplace=True) (2): Conv2d(256, 256, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0)) (3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (4): Dropout(p=0, inplace=True) ) (relu): ReLU(inplace=True) ) (9): STGCNBlock( (gcn): ConvTemporalGraphical( (conv): Conv2d(256, 768, kernel_size=(1, 1), stride=(1, 1)) ) (tcn): Sequential( (0): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (1): ReLU(inplace=True) (2): Conv2d(256, 256, kernel_size=(9, 1), stride=(1, 1), padding=(4, 0)) (3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (4): Dropout(p=0, inplace=True) ) (relu): ReLU(inplace=True) ) ) (edge_importance): ParameterList( (0): Parameter containing: [torch.FloatTensor of size 3x17x17] (1): Parameter containing: [torch.FloatTensor of size 3x17x17] (2): Parameter containing: [torch.FloatTensor of size 3x17x17] (3): Parameter containing: [torch.FloatTensor of size 3x17x17] (4): Parameter containing: [torch.FloatTensor of size 3x17x17] (5): Parameter containing: [torch.FloatTensor of size 3x17x17] (6): Parameter containing: [torch.FloatTensor of size 3x17x17] (7): Parameter containing: [torch.FloatTensor of size 3x17x17] (8): Parameter containing: [torch.FloatTensor of size 3x17x17] (9): Parameter containing: [torch.FloatTensor of size 3x17x17] ) ) (cls_head): STGCNHead( (loss_cls): CrossEntropyLoss(avg_non_ignore=False) (pool): AdaptiveAvgPool2d(output_size=(1, 1)) (fc): Conv2d(256, 60, kernel_size=(1, 1), stride=(1, 1)) ) ) data shuffle: True 2023-03-06 14:24:10,259 - easycv - INFO - 3000 videos remain after valid thresholding GPU INFO : Tesla V100-SXM2-32GB 2023-03-06 14:24:12,039 - easycv - INFO - open validate hook 2023-03-06 14:24:12,076 - easycv - INFO - 600 videos remain after valid thresholding 2023-03-06 14:24:12,077 - easycv - INFO - register EvaluationHook {'initial': False, 'evaluators': [<easycv.core.evaluation.classification_eval.ClsEvaluator object at 0x7fba8351f438>]} 2023-03-06 14:24:12,077 - easycv - INFO - load checkpoint from http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/EasyCV/modelzoo/video/skeleton_based/stgcn/stgcn_80e_ntu60_xsub.pth load checkpoint from local path: /root/.cache/easycv/stgcn_80e_ntu60_xsub.pth 2023-03-06 14:24:12,106 - easycv - INFO - Start running, host: root@dsw-229591-7c4869bc45-tvdsm, work_dir: /mnt/workspace/work_dir 2023-03-06 14:24:12,106 - easycv - INFO - Hooks will be executed in the following order: before_run: (VERY_HIGH ) StepLrUpdaterHook (NORMAL ) CheckpointHook (NORMAL ) EvalHook (NORMAL ) BestCkptSaverHook (VERY_LOW ) PreLoggerHook (VERY_LOW ) TextLoggerHook 
-------------------- before_train_epoch: (VERY_HIGH ) StepLrUpdaterHook (LOW ) IterTimerHook (VERY_LOW ) PreLoggerHook (VERY_LOW ) TextLoggerHook -------------------- before_train_iter: (VERY_HIGH ) StepLrUpdaterHook (LOW ) IterTimerHook -------------------- after_train_iter: (ABOVE_NORMAL) OptimizerHook (NORMAL ) CheckpointHook (LOW ) IterTimerHook (VERY_LOW ) PreLoggerHook (VERY_LOW ) TextLoggerHook -------------------- after_train_epoch: (NORMAL ) CheckpointHook (NORMAL ) EvalHook (NORMAL ) BestCkptSaverHook (VERY_LOW ) PreLoggerHook (VERY_LOW ) TextLoggerHook -------------------- before_val_epoch: (LOW ) IterTimerHook (VERY_LOW ) PreLoggerHook (VERY_LOW ) TextLoggerHook -------------------- before_val_iter: (LOW ) IterTimerHook -------------------- after_val_iter: (LOW ) IterTimerHook -------------------- after_val_epoch: (VERY_LOW ) PreLoggerHook (VERY_LOW ) TextLoggerHook -------------------- after_run: (VERY_LOW ) TextLoggerHook -------------------- 2023-03-06 14:24:12,108 - easycv - INFO - workflow: [('train', 1)], max: 1 epochs 2023-03-06 14:24:12,108 - easycv - INFO - Checkpoints will be saved to /mnt/workspace/work_dir by HardDiskBackend. Cannot get the env variable of GPU_STATUS_FILE, no data report to scheduler. This is not an error. It is because the scheduler of the cluster did not enable this feature. 2023-03-06 14:24:16,489 - easycv - INFO - Epoch [1][20/187] lr: 1.000e-05, eta: 0:00:35, time: 0.215, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0031, loss: 0.0031 2023-03-06 14:24:19,366 - easycv - INFO - Epoch [1][40/187] lr: 1.000e-05, eta: 0:00:26, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0025, loss: 0.0025 2023-03-06 14:24:22,245 - easycv - INFO - Epoch [1][60/187] lr: 1.000e-05, eta: 0:00:21, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0042, loss: 0.0042 2023-03-06 14:24:25,124 - easycv - INFO - Epoch [1][80/187] lr: 1.000e-05, eta: 0:00:17, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0034, loss: 0.0034 2023-03-06 14:24:28,002 - easycv - INFO - Epoch [1][100/187] lr: 1.000e-05, eta: 0:00:13, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0035, loss: 0.0035 2023-03-06 14:24:30,880 - easycv - INFO - Epoch [1][120/187] lr: 1.000e-05, eta: 0:00:10, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0037, loss: 0.0037 2023-03-06 14:24:33,760 - easycv - INFO - Epoch [1][140/187] lr: 1.000e-05, eta: 0:00:07, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0042, loss: 0.0042 2023-03-06 14:24:36,643 - easycv - INFO - Epoch [1][160/187] lr: 1.000e-05, eta: 0:00:04, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0027, loss: 0.0027 2023-03-06 14:24:39,527 - easycv - INFO - Epoch [1][180/187] lr: 1.000e-05, eta: 0:00:01, time: 0.144, data_time: 0.006, memory: 3436, top1_acc: 1.0000, top5_acc: 1.0000, loss_cls: 0.0027, loss: 0.0027 2023-03-06 14:24:40,490 - easycv - INFO - Saving checkpoint at 1 epochs [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 600/600, 134.7 task/s, elapsed: 4s, ETA: 0s 2023-03-06 14:24:45,105 - easycv - INFO - SaveBest metric_name: ['ClsEvaluator_neck_top1'] 2023-03-06 14:24:45,106 - easycv - INFO - End SaveBest metric 2023-03-06 14:24:45,106 - easycv - INFO - Epoch(val) [1][187] prob_top1: 87.0000, prob_top5: 98.0000
Model export
# List the checkpoint (.pth) files produced by training
!ls work_dir/*pth
work_dir/epoch_1.pth
!python -m easycv.tools.export \
    configs/video_recognition/stgcn/stgcn_80e_ntu60_xsub_keypoint.py \
    work_dir/epoch_1.pth \
    work_dir/video_stgcn.pt
[2023-03-06 14:26:12,711.711 dsw-229591-7c4869bc45-tvdsm:6113 INFO utils.py:30] NOTICE: PAIDEBUGGER is turned off.
configs/video_recognition/stgcn/stgcn_80e_ntu60_xsub_keypoint.py
/home/pai/lib/python3.6/site-packages/easycv/models/loss/cross_entropy_loss.py:273: UserWarning: Default ``avg_non_ignore`` is False, if you would like to ignore the certain label and average loss over non-ignore labels, which is the same with PyTorch official cross_entropy, set ``avg_non_ignore=True``.
  'Default ``avg_non_ignore`` is False, if you would like to '
load checkpoint from local path: work_dir/epoch_1.pth
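Optionally, you can sanity-check that the exported file loads before using it for prediction. The snippet below is a minimal sketch (not part of the official tutorial): depending on the export configuration, the artifact may be a plain checkpoint dict or a TorchScript archive, so both loading paths are tried.

import torch

# Sanity-check the exported model file (a sketch; the exact container
# format depends on the EasyCV export config)
export_path = 'work_dir/video_stgcn.pt'
try:
    obj = torch.load(export_path, map_location='cpu')
    print('Loaded with torch.load:', type(obj))
except Exception:
    # TorchScript archives must be loaded with torch.jit.load instead
    obj = torch.jit.load(export_path, map_location='cpu')
    print('Loaded TorchScript module:', type(obj))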
End-to-end prediction
The prediction result is saved to the file output_video.mp4.
!python skeleton_based_demo.py \
    --config=configs/video_recognition/stgcn/stgcn_80e_ntu60_xsub_keypoint.py \
    --checkpoint=work_dir/video_stgcn.pt \
    --det-config=configs/detection/yolox/yolox_s_8xb16_300e_coco.py \
    --det-checkpoint=pretrained_models/yolox_s.pt \
    --pose-config=configs/pose/hrnet_w48_coco_256x192_udp.py \
    --pose-checkpoint=pretrained_models/pose_hrnet.pt \
    --bbox-thr=0.9 \
    --out_file=output_video.mp4
[2023-03-06 14:26:23,995.995 dsw-229591-7c4869bc45-tvdsm:6140 INFO utils.py:30] NOTICE: PAIDEBUGGER is turned off.
Download video file from remote to local path "{cache_video_path}"...
100%|██████████████████████████████████████| 1.07M/1.07M [00:00<00:00, 5.11MB/s]
load checkpoint from local path: pretrained_models/pose_hrnet.pt
load checkpoint from local path: work_dir/video_stgcn.pt
reparam: 0
load checkpoint from local path: pretrained_models/yolox_s.pt
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 1/1, 4.7 task/s, elapsed: 0s, ETA: 0s
action label: hugging other person
Moviepy - Building video output_video.mp4.
Moviepy - Writing video output_video.mp4
Moviepy - Done !
Moviepy - video ready output_video.mp4
Write video to output_video.mp4 successfully!
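Before visualizing, you can optionally confirm the output video's size, frame rate, and frame count (a small sketch using standard OpenCV video properties):

import cv2

# Inspect basic properties of the generated video
cap = cv2.VideoCapture('output_video.mp4')
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = cap.get(cv2.CAP_PROP_FPS)
n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
print(f'{width}x{height} @ {fps:.1f} fps, {n_frames} frames')
cap.release()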
Visualize the prediction result:
import cv2
from IPython.display import clear_output, Image, display

video_path = 'output_video.mp4'
video = cv2.VideoCapture(video_path)
try:
    while True:
        clear_output(wait=True)
        # Read the next frame; stop when the video ends
        ret, frame = video.read()
        if not ret:
            break
        # Encode the frame as JPEG and display it inline in the notebook
        _, encoded = cv2.imencode('.jpg', frame)
        display(Image(data=encoded.tobytes()))
except KeyboardInterrupt:
    # Allow interrupting playback from the notebook
    pass
finally:
    video.release()
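If the frame-by-frame loop above is too slow, a simpler alternative is IPython's built-in Video display class. With embed=True the file is inlined into the notebook output, so it also plays when the kernel runs on a remote DSW instance:

from IPython.display import Video

# Embed output_video.mp4 directly in the notebook output
Video('output_video.mp4', embed=True)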
In addition, all of the models above support PAI-Blade inference acceleration. For more models and features, see the open-source project: https://github.com/alibaba/EasyCV