YOLOV5 v6.1更新 | TensorRT+TPU+OpenVINO+TFJS+TFLite等平台一键导出和部署

简介: YOLOV5 v6.1更新 | TensorRT+TPU+OpenVINO+TFJS+TFLite等平台一键导出和部署

1重大更新


  • TensorRT支持:TensorFlow, Keras, TFLite, TF.js模型导出现在完全集成使用python export.py -include saved_model pb TFLite tfjs
  • TensorFlow Edge TPU:新的更小的YOLOv5n(1.9M params)模型低于YOLOv5s(7.5M params),导出到2.1 MB INT8大小,理想的超轻边缘解决方案。
  • OpenVINO支持:YOLOv5 ONNX模型现在兼容OpenCV DNN和ONNX运行。
  • Export Benchmarks:使用python utils/ Benchmark.py导出所有YOLOv5格式(mAP和速度)。目前在CPU上运行,未来的更新将实现GPU支持。
  • 架构:无更改。
  • 超参数:小更改。Yaml LRF从0.2降至0.1。
  • 训练:默认学习速率(LR)调度器更新了一个周期的余弦替换为一个周期的线性,以改善结果。

新版模型导出

1、onnx

def export_onnx(model, im, file, opset, train, dynamic, simplify, prefix=colorstr('ONNX:')):
    # YOLOv5 ONNX export
    try:
        check_requirements(('onnx',))
        import onnx
        LOGGER.info(f'\n{prefix} starting export with onnx {onnx.__version__}...')
        f = file.with_suffix('.onnx')
        torch.onnx.export(model, im, f, verbose=False, opset_version=opset,
                          training=torch.onnx.TrainingMode.TRAINING if train else torch.onnx.TrainingMode.EVAL,
                          do_constant_folding=not train,
                          input_names=['images'],
                          output_names=['output'],
                          dynamic_axes={'images': {0: 'batch', 2: 'height', 3: 'width'},  # shape(1,3,640,640)
                                        'output': {0: 'batch', 1: 'anchors'}  # shape(1,25200,85)
                                        } if dynamic else None)
        # Checks
        model_onnx = onnx.load(f)  # load onnx model
        onnx.checker.check_model(model_onnx)  # check onnx model
        # LOGGER.info(onnx.helper.printable_graph(model_onnx.graph))  # print
        # Simplify
        if simplify:
            try:
                check_requirements(('onnx-simplifier',))
                import onnxsim
                LOGGER.info(f'{prefix} simplifying with onnx-simplifier {onnxsim.__version__}...')
                model_onnx, check = onnxsim.simplify(
                    model_onnx,
                    dynamic_input_shape=dynamic,
                    input_shapes={'images': list(im.shape)} if dynamic else None)
                assert check, 'assert check failed'
                onnx.save(model_onnx, f)
            except Exception as e:
                LOGGER.info(f'{prefix} simplifier failure: {e}')
        LOGGER.info(f'{prefix} export success, saved as {f} ({file_size(f):.1f} MB)')
        return f
    except Exception as e:
        LOGGER.info(f'{prefix} export failure: {e}')

2、 openvino

def export_openvino(model, im, file, prefix=colorstr('OpenVINO:')):
    # YOLOv5 OpenVINO export
    try:
        check_requirements(('openvino-dev',))  # requires openvino-dev: https://pypi.org/project/openvino-dev/
        import openvino.inference_engine as ie
        LOGGER.info(f'\n{prefix} starting export with openvino {ie.__version__}...')
        f = str(file).replace('.pt', '_openvino_model' + os.sep)
        cmd = f"mo --input_model {file.with_suffix('.onnx')} --output_dir {f}"
        subprocess.check_output(cmd, shell=True)
        LOGGER.info(f'{prefix} export success, saved as {f} ({file_size(f):.1f} MB)')
        return f
    except Exception as e:
        LOGGER.info(f'\n{prefix} export failure: {e}')

3、 coreml

def export_coreml(model, im, file, prefix=colorstr('CoreML:')):
    # YOLOv5 CoreML export
    try:
        check_requirements(('coremltools',))
        import coremltools as ct
        LOGGER.info(f'\n{prefix} starting export with coremltools {ct.__version__}...')
        f = file.with_suffix('.mlmodel')
        ts = torch.jit.trace(model, im, strict=False)  # TorchScript model
        ct_model = ct.convert(ts, inputs=[ct.ImageType('image', shape=im.shape, scale=1 / 255, bias=[0, 0, 0])])
        ct_model.save(f)
        LOGGER.info(f'{prefix} export success, saved as {f} ({file_size(f):.1f} MB)')
        return ct_model, f
    except Exception as e:
        LOGGER.info(f'\n{prefix} export failure: {e}')
        return None, None

4、TensorRT

def export_engine(model, im, file, train, half, simplify, workspace=4, verbose=False, prefix=colorstr('TensorRT:')):
    # YOLOv5 TensorRT export https://developer.nvidia.com/tensorrt
    try:
        check_requirements(('tensorrt',))
        import tensorrt as trt
        if trt.__version__[0] == '7':  # TensorRT 7 handling https://github.com/ultralytics/yolov5/issues/6012
            grid = model.model[-1].anchor_grid
            model.model[-1].anchor_grid = [a[..., :1, :1, :] for a in grid]
            export_onnx(model, im, file, 12, train, False, simplify)  # opset 12
            model.model[-1].anchor_grid = grid
        else:  # TensorRT >= 8
            check_version(trt.__version__, '8.0.0', hard=True)  # require tensorrt>=8.0.0
            export_onnx(model, im, file, 13, train, False, simplify)  # opset 13
        onnx = file.with_suffix('.onnx')
        LOGGER.info(f'\n{prefix} starting export with TensorRT {trt.__version__}...')
        assert im.device.type != 'cpu', 'export running on CPU but must be on GPU, i.e. `python export.py --device 0`'
        assert onnx.exists(), f'failed to export ONNX file: {onnx}'
        f = file.with_suffix('.engine')  # TensorRT engine file
        logger = trt.Logger(trt.Logger.INFO)
        if verbose:
            logger.min_severity = trt.Logger.Severity.VERBOSE
        builder = trt.Builder(logger)
        config = builder.create_builder_config()
        config.max_workspace_size = workspace * 1 << 30
        flag = (1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
        network = builder.create_network(flag)
        parser = trt.OnnxParser(network, logger)
        if not parser.parse_from_file(str(onnx)):
            raise RuntimeError(f'failed to load ONNX file: {onnx}')
        inputs = [network.get_input(i) for i in range(network.num_inputs)]
        outputs = [network.get_output(i) for i in range(network.num_outputs)]
        LOGGER.info(f'{prefix} Network Description:')
        for inp in inputs:
            LOGGER.info(f'{prefix}\tinput "{inp.name}" with shape {inp.shape} and dtype {inp.dtype}')
        for out in outputs:
            LOGGER.info(f'{prefix}\toutput "{out.name}" with shape {out.shape} and dtype {out.dtype}')
        half &= builder.platform_has_fast_fp16
        LOGGER.info(f'{prefix} building FP{16 if half else 32} engine in {f}')
        if half:
            config.set_flag(trt.BuilderFlag.FP16)
        with builder.build_engine(network, config) as engine, open(f, 'wb') as t:
            t.write(engine.serialize())
        LOGGER.info(f'{prefix} export success, saved as {f} ({file_size(f):.1f} MB)')
        return f
    except Exception as e:
        LOGGER.info(f'\n{prefix} export failure: {e}')

5、TensorFlow

def export_pb(keras_model, im, file, prefix=colorstr('TensorFlow GraphDef:')):
    # YOLOv5 TensorFlow GraphDef *.pb export https://github.com/leimao/Frozen_Graph_TensorFlow
    try:
        import tensorflow as tf
        from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2
        LOGGER.info(f'\n{prefix} starting export with tensorflow {tf.__version__}...')
        f = file.with_suffix('.pb')
        m = tf.function(lambda x: keras_model(x))  # full model
        m = m.get_concrete_function(tf.TensorSpec(keras_model.inputs[0].shape, keras_model.inputs[0].dtype))
        frozen_func = convert_variables_to_constants_v2(m)
        frozen_func.graph.as_graph_def()
        tf.io.write_graph(graph_or_graph_def=frozen_func.graph, logdir=str(f.parent), name=f.name, as_text=False)
        LOGGER.info(f'{prefix} export success, saved as {f} ({file_size(f):.1f} MB)')
        return f
    except Exception as e:
        LOGGER.info(f'\n{prefix} export failure: {e}')

6、TensorFlow-Lite

def export_tflite(keras_model, im, file, int8, data, ncalib, prefix=colorstr('TensorFlow Lite:')):
    # YOLOv5 TensorFlow Lite export
    try:
        import tensorflow as tf
        LOGGER.info(f'\n{prefix} starting export with tensorflow {tf.__version__}...')
        batch_size, ch, *imgsz = list(im.shape)  # BCHW
        f = str(file).replace('.pt', '-fp16.tflite')
        converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
        converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS]
        converter.target_spec.supported_types = [tf.float16]
        converter.optimizations = [tf.lite.Optimize.DEFAULT]
        if int8:
            from models.tf import representative_dataset_gen
            dataset = LoadImages(check_dataset(data)['train'], img_size=imgsz, auto=False)  # representative data
            converter.representative_dataset = lambda: representative_dataset_gen(dataset, ncalib)
            converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
            converter.target_spec.supported_types = []
            converter.inference_input_type = tf.uint8  # or tf.int8
            converter.inference_output_type = tf.uint8  # or tf.int8
            converter.experimental_new_quantizer = False
            f = str(file).replace('.pt', '-int8.tflite')
        tflite_model = converter.convert()
        open(f, "wb").write(tflite_model)
        LOGGER.info(f'{prefix} export success, saved as {f} ({file_size(f):.1f} MB)')
        return f
    except Exception as e:
        LOGGER.info(f'\n{prefix} export failure: {e}')

7、Egde TPU

def export_edgetpu(keras_model, im, file, prefix=colorstr('Edge TPU:')):
    # YOLOv5 Edge TPU export https://coral.ai/docs/edgetpu/models-intro/
    try:
        cmd = 'edgetpu_compiler --version'
        help_url = 'https://coral.ai/docs/edgetpu/compiler/'
        assert platform.system() == 'Linux', f'export only supported on Linux. See {help_url}'
        if subprocess.run(cmd + ' >/dev/null', shell=True).returncode != 0:
            LOGGER.info(f'\n{prefix} export requires Edge TPU compiler. Attempting install from {help_url}')
            sudo = subprocess.run('sudo --version >/dev/null', shell=True).returncode == 0  # sudo installed on system
            for c in ['curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -',
                      'echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list',
                      'sudo apt-get update',
                      'sudo apt-get install edgetpu-compiler']:
                subprocess.run(c if sudo else c.replace('sudo ', ''), shell=True, check=True)
        ver = subprocess.run(cmd, shell=True, capture_output=True, check=True).stdout.decode().split()[-1]
        LOGGER.info(f'\n{prefix} starting export with Edge TPU compiler {ver}...')
        f = str(file).replace('.pt', '-int8_edgetpu.tflite')  # Edge TPU model
        f_tfl = str(file).replace('.pt', '-int8.tflite')  # TFLite model
        cmd = f"edgetpu_compiler -s {f_tfl}"
        subprocess.run(cmd, shell=True, check=True)
        LOGGER.info(f'{prefix} export success, saved as {f} ({file_size(f):.1f} MB)')
        return f
    except Exception as e:
        LOGGER.info(f'\n{prefix} export failure: {e}')

8、TensorFlow.js

def export_tfjs(keras_model, im, file, prefix=colorstr('TensorFlow.js:')):
    # YOLOv5 TensorFlow.js export
    try:
        check_requirements(('tensorflowjs',))
        import re
        import tensorflowjs as tfjs
        LOGGER.info(f'\n{prefix} starting export with tensorflowjs {tfjs.__version__}...')
        f = str(file).replace('.pt', '_web_model')  # js dir
        f_pb = file.with_suffix('.pb')  # *.pb path
        f_json = f + '/model.json'  # *.json path
        cmd = f'tensorflowjs_converter --input_format=tf_frozen_model ' \
              f'--output_node_names="Identity,Identity_1,Identity_2,Identity_3" {f_pb} {f}'
        subprocess.run(cmd, shell=True)
        json = open(f_json).read()
        with open(f_json, 'w') as j:  # sort JSON Identity_* in ascending order
            subst = re.sub(
                r'{"outputs": {"Identity.?.?": {"name": "Identity.?.?"}, '
                r'"Identity.?.?": {"name": "Identity.?.?"}, '
                r'"Identity.?.?": {"name": "Identity.?.?"}, '
                r'"Identity.?.?": {"name": "Identity.?.?"}}}',
                r'{"outputs": {"Identity": {"name": "Identity"}, '
                r'"Identity_1": {"name": "Identity_1"}, '
                r'"Identity_2": {"name": "Identity_2"}, '
                r'"Identity_3": {"name": "Identity_3"}}}',
                json)
            j.write(subst)
        LOGGER.info(f'{prefix} export success, saved as {f} ({file_size(f):.1f} MB)')
        return f
    except Exception as e:
        LOGGER.info(f'\n{prefix} export failure: {e}')

新版检测推理

python path/to/detect.py --weights yolov5s.pt                 # PyTorch
                                   yolov5s.torchscript        # TorchScript
                                   yolov5s.onnx               # ONNX Runtime or OpenCV DNN with --dnn
                                   yolov5s.xml                # OpenVINO
                                   yolov5s.engine             # TensorRT
                                   yolov5s.mlmodel            # CoreML (MacOS-only)
                                   yolov5s_saved_model        # TensorFlow SavedModel
                                   yolov5s.pb                 # TensorFlow GraphDef
                                   yolov5s.tflite             # TensorFlow Lite
                                   yolov5s_edgetpu.tflite     # TensorFlow Edge TPU


2最新结果



3与6.0版本的精度对比



4参考


[1].https://github.com/ultralytics/yolov5

相关文章
|
3月前
|
机器学习/深度学习 算法 PyTorch
PINN物理信息神经网络多变量时序预测研究(Matlab代码实现)
PINN物理信息神经网络多变量时序预测研究(Matlab代码实现)
514 0
|
Ubuntu Linux 编译器
Linux/Ubuntu下使用VS Code配置C/C++项目环境调用OpenCV
通过以上步骤,您已经成功在Ubuntu系统下的VS Code中配置了C/C++项目环境,并能够调用OpenCV库进行开发。请确保每一步都按照您的系统实际情况进行适当调整。
2349 3
|
安全 网络安全 开发者
Docker学习笔记(一):Docker命令总结
本文介绍了Docker的基本概念、优点以及常用命令。Docker是一个开源的应用容器引擎,它通过容器打包应用程序及其依赖项,实现快速部署和迁移。主要优点包括轻量级、可移植性、易于管理、安全性和开源性。文章还区分了镜像和容器的概念,并提供了构建镜像、查看容器、运行容器、停止和删除容器等常用Docker命令的示例。
359 0
|
算法 测试技术 AI芯片
CPU反超NPU,llama.cpp生成速度翻5倍!LLM端侧部署新范式T-MAC开源
【9月更文挑战第7天】微软研究院提出了一种名为T-MAC的创新方法,旨在解决大型语言模型在资源受限的边缘设备上高效部署的问题。T-MAC通过查表法在CPU上实现低比特LLM的高效推理,支持混合精度矩阵乘法,无需解量化。其通过位级查表实现统一且可扩展的解决方案,优化数据布局和重用率,显著提升了单线程和多线程下的mpGEMV及mpGEMM性能,并在端到端推理吞吐量和能效方面表现出色。然而,表量化和快速聚合技术可能引入近似和数值误差,影响模型准确性。论文详见:[链接](https://www.arxiv.org/pdf/2407.00088)。
708 10
|
XML 编解码 数据挖掘
YOLOv5源码逐行超详细注释与解读(5)——配置文件yolov5s.yaml
YOLOv5源码逐行超详细注释与解读(5)——配置文件yolov5s.yaml
3211 1
YOLOv5源码逐行超详细注释与解读(5)——配置文件yolov5s.yaml
|
缓存 搜索推荐 前端开发
《VitePress 简易速速上手小册》第6章 性能优化与 SEO(2024 最新版)
《VitePress 简易速速上手小册》第6章 性能优化与 SEO(2024 最新版)
643 1
|
机器学习/深度学习 并行计算 自动驾驶
手把手教学!TensorRT部署实战:YOLOv5的ONNX模型部署
手把手教学!TensorRT部署实战:YOLOv5的ONNX模型部署
手把手教学!TensorRT部署实战:YOLOv5的ONNX模型部署
|
Python C++ PyTorch
PyTorch 2.2 中文官方教程(十一)(2)
PyTorch 2.2 中文官方教程(十一)
258 1
|
计算机视觉 C++
Mac系统下vscode调试opencv环境搭建
Mac系统下vscode调试opencv环境搭建
354 1
|
并行计算 安全 Linux
如何设置环境变量KMP_DUPLICATE_LIB_OK=TRUE
【5月更文挑战第25天】如何设置环境变量KMP_DUPLICATE_LIB_OK=TRUE
1367 0