Image Classification Inference with OpenVINO and PaddlePaddle

Summary: Image classification inference with OpenVINO and PaddlePaddle

This is an example of running inference with the MobileNetV3 Large PaddlePaddle model and OpenVINO.


1. Import the required libraries


# model download
from pathlib import Path
import os
import urllib.request
import tarfile
# inference
from openvino.runtime import Core
# preprocessing
import cv2
import numpy as np
from openvino.preprocess import PrePostProcessor, ResizeAlgorithm
from openvino.runtime import Layout, Type, AsyncInferQueue, PartialShape
# results visualization
import time
import json
from IPython.display import Image
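
These imports assume the OpenVINO and OpenCV Python packages are already installed. If not, a typical setup is sketched below; the package names are the standard PyPI ones, and openvino-dev also provides the benchmark_app tool used at the end (versions may differ):

# Hedged setup sketch: install the packages this example relies on
# !pip install openvino-dev opencv-python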


2. Download the MobileNetV3_large_x1_0 model


Download the pretrained model

Source: github.com/PaddlePaddl…

mobilenet_url = "https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/inference/MobileNetV3_large_x1_0_infer.tar"
mobilenetv3_model_path = Path("model/MobileNetV3_large_x1_0_infer/inference.pdmodel")
if mobilenetv3_model_path.is_file(): 
    print("Model MobileNetV3_large_x1_0 already exists")
else:
    # Download the model from the server, and untar it.
    print("Downloading the MobileNetV3_large_x1_0_infer model (20Mb)... May take a while...")
    # create a directory 
    os.makedirs("model")
    urllib.request.urlretrieve(mobilenet_url, "model/MobileNetV3_large_x1_0_infer.tar")
    print("Model Downloaded")
    # extractall() returns None, so verify the extraction by checking for the model file
    with tarfile.open("model/MobileNetV3_large_x1_0_infer.tar") as file:
        file.extractall("model")
    if mobilenetv3_model_path.is_file():
        print(f"Model extracted to {mobilenetv3_model_path}.")
    else:
        print("Error extracting the model. Please check the network.")
Model MobileNetV3_large_x1_0 already exists


3. Define the callback function for postprocessing


def callback(infer_request, i) -> None:
    """
    Callback function for postprocessing.
    :param infer_request: the completed InferRequest object
    :param i: the iteration index of the inference
    :returns: None
    """
    imagenet_classes = json.loads(open("utils/imagenet_class_index.json").read())
    predictions = next(iter(infer_request.results.values()))
    indices = np.argsort(-predictions[0])
    if i == 0:
        # Calculate the first inference latency
        latency = time.time() - start
        print(f"latency: {latency}")
        # Print the top-5 class names and their probabilities
        for n in range(5):
            print(
                "class name: {}, probability: {:.5f}"
                .format(imagenet_classes[str(list(indices)[n])][1], predictions[0][list(indices)[n]])
            )


4. Read the model file


# Initialize OpenVINO Runtime with Core()
ie = Core()
# MobileNetV3_large_x1_0
model = ie.read_model(mobilenetv3_model_path)
# get the input and output layer information
input_layer = model.input(0)
output_layer = model.output(0)
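
To sanity-check what was just read, the layer names and shapes can be printed. This is an optional sketch; any_name and partial_shape are standard attributes of OpenVINO runtime I/O objects:

# Optional sanity check: inspect the model's input/output names and shapes
print(f"input name: {input_layer.any_name}, shape: {input_layer.partial_shape}")
print(f"output name: {output_layer.any_name}, shape: {output_layer.partial_shape}")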


5. Integrate the preprocessing steps


If your input data does not fit perfectly in the model input tensor, additional operations/steps are needed to transform the data into the format the model expects. These operations are known as “preprocessing”. Preprocessing steps are integrated into the execution graph and performed on the selected device(s) (CPU/GPU/VPU/etc.) rather than always executed on the CPU. This improves utilization of the selected device(s).

Overview of Preprocessing API: docs.openvino.ai/latest/open…

filename = "../001-hello-world/data/coco.jpg"
test_image = cv2.imread(filename) 
test_image = np.expand_dims(test_image, 0) / 255
_, h, w, _ = test_image.shape
# Adjust model input shape to improve the performance
model.reshape({input_layer.any_name: PartialShape([1, 3, 224, 224])})
ppp = PrePostProcessor(model)
# Set input tensor information:
# - input() provides information about a single model input
# - layout of data is "NHWC"
# - set static spatial dimensions to input tensor to resize from
ppp.input().tensor() \
    .set_spatial_static_shape(h, w) \
    .set_layout(Layout("NHWC")) 
# Here we assume the model has "NCHW" layout for input
ppp.input().model().set_layout(Layout("NCHW"))
# Do preprocessing:
# - apply linear resize from the tensor spatial dims to the model spatial dims
# - subtract the mean from each channel
# - divide each channel by the appropriate scale value
ppp.input().preprocess() \
    .resize(ResizeAlgorithm.RESIZE_LINEAR, 224, 224) \
    .mean([0.485, 0.456, 0.406]) \
    .scale([0.229, 0.224, 0.225])
# Set output tensor information:
# - precision of tensor is supposed to be 'f32'
ppp.output().tensor().set_element_type(Type.f32)
# Apply preprocessing to modify the original 'model'
model = ppp.build()
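
At this point the preprocessing steps are embedded in the execution graph of model. Optionally, the combined model can be saved to disk so the preprocessing does not have to be rebuilt on the next run; a hedged sketch, assuming the serialize helper available in openvino.runtime 2022.1:

# Optional sketch: save the model together with its embedded preprocessing
from openvino.runtime import serialize
serialize(model, "model/mobilenetv3_ppp.xml", "model/mobilenetv3_ppp.bin")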


6. Run inference


Use “AUTO” as the device name to delegate device selection to OpenVINO. The Auto device plugin internally recognizes and selects devices from among Intel CPU and GPU depending on the device capabilities and the characteristics of the model(s) (for example, precision). Then it assigns inference requests to the best device. AUTO starts inference immediately on the CPU and then transparently shifts to the GPU (or VPU) once it is ready, dramatically reducing time to first inference.
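
AUTO can also be given an explicit, ordered device priority list. A hedged sketch (the GPU entry assumes an Intel GPU is actually present in the system):

# Optional sketch: let AUTO choose from an explicit priority list, GPU first
# compiled_model = ie.compile_model(model=model, device_name="AUTO:GPU,CPU")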

# Check the available devices in your system
devices = ie.available_devices
for device in devices:
    device_name = ie.get_property(device_name=device, name="FULL_DEVICE_NAME")
    print(f"{device}: {device_name}")
# Load model to a device selected by AUTO from the available devices list
compiled_model = ie.compile_model(model=model, device_name="AUTO")
# Create infer request queue
infer_queue = AsyncInferQueue(compiled_model)
infer_queue.set_callback(callback)
start = time.time()
# Do inference
infer_queue.start_async({input_layer.any_name: test_image}, 0)
infer_queue.wait_all()
Image(filename=filename) 
CPU: Intel(R) Core(TM) i5-9400F CPU @ 2.90GHz
latency: 0.010724067687988281
class name: Labrador_retriever, probability: 0.59148
class name: flat-coated_retriever, probability: 0.11678
class name: Staffordshire_bullterrier, probability: 0.04089
class name: Newfoundland, probability: 0.02689
class name: Tibetan_mastiff, probability: 0.01735


7. Latency and Throughput


Throughput and latency are the two most widely used performance metrics.

docs.openvino.ai/latest/open…
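
For intuition (an illustrative sketch, not part of the original walkthrough): with a single synchronous stream, throughput is roughly the inverse of the average latency, while asynchronous execution with several in-flight requests can push throughput well above 1/latency.

# Illustrative only: single-stream throughput is roughly 1 / average latency
avg_latency_s = 0.00641  # e.g. the 6.41 ms AVG latency reported by benchmark_app below
print(f"approx. single-stream FPS: {1 / avg_latency_s:.1f}")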


"LATENCY"测试


loop = 100
# AUTO sets device config based on hints
compiled_model = ie.compile_model(model=model, device_name="AUTO", config={"PERFORMANCE_HINT": "LATENCY"})
infer_queue = AsyncInferQueue(compiled_model)
# use the AsyncInferQueue Python API to boost performance in async mode
infer_queue.set_callback(callback)
start = time.time()
# run inference 100 times to get the average FPS
for i in range(loop):
    infer_queue.start_async({input_layer.any_name: test_image}, i)
infer_queue.wait_all()
end = time.time()
# Calculate the average FPS
fps = loop / (end - start)
print(f"fps: {fps}")
latency: 0.009800195693969727
class name: Labrador_retriever, probability: 0.59148
class name: flat-coated_retriever, probability: 0.11678
class name: Staffordshire_bullterrier, probability: 0.04089
class name: Newfoundland, probability: 0.02689
class name: Tibetan_mastiff, probability: 0.01735
fps: 97.71454464338757


"TRHOUGHPUT"测试


It is possible to define application-specific performance settings with a config key, letting the device adjust to achieve better "THROUGHPUT" performance.

# AUTO sets device config based on hints
compiled_model = ie.compile_model(model=model, device_name="AUTO", config={"PERFORMANCE_HINT": "THROUGHPUT"})
infer_queue = AsyncInferQueue(compiled_model)
infer_queue.set_callback(callback)
start = time.time()
for i in range(loop):
    infer_queue.start_async({input_layer.any_name: test_image}, i)
infer_queue.wait_all()
end = time.time()
# Calculate the average FPS
fps = loop / (end - start)
print(f"fps: {fps}")
latency: 0.01672220230102539
class name: Labrador_retriever, probability: 0.59148
class name: flat-coated_retriever, probability: 0.11678
class name: Staffordshire_bullterrier, probability: 0.04089
class name: Newfoundland, probability: 0.02689
class name: Tibetan_mastiff, probability: 0.01735
fps: 147.6019414195786
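
Finally, OpenVINO's benchmark_app command-line tool (installed with the openvino-dev package) provides a more systematic measurement of latency and throughput:
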
!benchmark_app -m $mobilenetv3_model_path -data_shape [1,3,224,224] -hint "latency"
[Step 1/11] Parsing and validating input arguments
[ WARNING ]  -nstreams default value is determined automatically for a device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README. 
[Step 2/11] Loading OpenVINO
[ INFO ] OpenVINO:
         API version............. 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ] Device info
         CPU
         openvino_intel_cpu_plugin version 2022.1
         Build................... 2022.1.0-7019-cdb9bec7210-releases/2022/1
[Step 3/11] Setting device configuration
[Step 4/11] Reading network files
[ INFO ] Read model took 66.91 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: ?
[Step 6/11] Configuring input of the model
[ INFO ] Model input 'inputs' precision u8, dimensions ([N,C,H,W]): ? 3 224 224
[ INFO ] Model output 'save_infer_model/scale_0.tmp_1' precision f32, dimensions ([...]): ? 1000
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 192.82 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] DEVICE: CPU
[ INFO ]   AVAILABLE_DEVICES  , ['']
[ INFO ]   RANGE_FOR_ASYNC_INFER_REQUESTS  , (1, 1, 1)
[ INFO ]   RANGE_FOR_STREAMS  , (1, 6)
[ INFO ]   FULL_DEVICE_NAME  , Intel(R) Core(TM) i5-9400F CPU @ 2.90GHz
[ INFO ]   OPTIMIZATION_CAPABILITIES  , ['FP32', 'FP16', 'INT8', 'BIN', 'EXPORT_IMPORT']
[ INFO ]   CACHE_DIR  , 
[ INFO ]   NUM_STREAMS  , 1
[ INFO ]   INFERENCE_NUM_THREADS  , 0
[ INFO ]   PERF_COUNT  , False
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS  , 0
[Step 9/11] Creating infer requests and preparing input data
[ INFO ] Create 1 infer requests took 0.00 ms
[ WARNING ] No input files were given for input 'inputs'!. This input will be filled with random values!
[ INFO ] Fill input 'inputs' with random values 
[Step 10/11] Measuring performance (Start inference asynchronously, 1 inference requests, inference only: False, limits: 60000 ms duration)
[ INFO ] Benchmarking in full mode (inputs filling are included in measurement loop).
[ INFO ] First inference took 24.59 ms
[Step 11/11] Dumping statistics report
Count:          9202 iterations
Duration:       60015.90 ms
Latency:
    AVG:        6.41 ms
    MIN:        3.74 ms
    MAX:        14.93 ms
Throughput: 153.33 FPS
[Step 1/11] Parsing and validating input arguments
[ WARNING ]  -nstreams default value is determined automatically for a device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README. 
[Step 2/11] Loading OpenVINO
[ INFO ] OpenVINO:
         API version............. 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ] Device info
         CPU
         openvino_intel_cpu_plugin version 2022.1
         Build................... 2022.1.0-7019-cdb9bec7210-releases/2022/1
[Step 3/11] Setting device configuration
[Step 4/11] Reading network files
[ INFO ] Read model took 82.23 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: ?
[Step 6/11] Configuring input of the model
[ INFO ] Model input 'inputs' precision u8, dimensions ([N,C,H,W]): ? 3 224 224
[ INFO ] Model output 'save_infer_model/scale_0.tmp_1' precision f32, dimensions ([...]): ? 1000
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 235.02 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] DEVICE: CPU
[ INFO ]   AVAILABLE_DEVICES  , ['']
[ INFO ]   RANGE_FOR_ASYNC_INFER_REQUESTS  , (1, 1, 1)
[ INFO ]   RANGE_FOR_STREAMS  , (1, 6)
[ INFO ]   FULL_DEVICE_NAME  , Intel(R) Core(TM) i5-9400F CPU @ 2.90GHz
[ INFO ]   OPTIMIZATION_CAPABILITIES  , ['FP32', 'FP16', 'INT8', 'BIN', 'EXPORT_IMPORT']
[ INFO ]   CACHE_DIR  , 
[ INFO ]   NUM_STREAMS  , 1
[ INFO ]   INFERENCE_NUM_THREADS  , 0
[ INFO ]   PERF_COUNT  , False
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS  , 0
[Step 9/11] Creating infer requests and preparing input data
[ INFO ] Create 1 infer requests took 0.00 ms
[ WARNING ] No input files were given for input 'inputs'!. This input will be filled with random values!
[ INFO ] Fill input 'inputs' with random values 
[Step 10/11] Measuring performance (Start inference asynchronously, 1 inference requests, inference only: False, limits: 60000 ms duration)
[ INFO ] Benchmarking in full mode (inputs filling are included in measurement loop).
[ INFO ] First inference took 27.23 ms
[Step 11/11] Dumping statistics report
Count:          9518 iterations
Duration:       60005.32 ms
Latency:
    AVG:        6.19 ms
    MIN:        3.64 ms
    MAX:        28.95 ms
Throughput: 158.62 FPS

