TensorFlow 1.x Deep Learning Cookbook: 6-10 (3) https://developer.aliyun.com/article/1426772
How to do it...
We proceed with the analysis as follows. First, install Bazel via Homebrew:
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew install bazel
bazel version
brew upgrade bazel
- Clone the TensorFlow distribution from GitHub:
git clone https://github.com/TensorFlow/TensorFlow.git
- Build the benchmarking tool, which profiles the graph itself:
cd ~/TensorFlow/
bazel build -c opt TensorFlow/tools/benchmark:benchmark_model

INFO: Found 1 target...
Target //TensorFlow/tools/benchmark:benchmark_model up-to-date:
  bazel-bin/TensorFlow/tools/benchmark/benchmark_model
INFO: Elapsed time: 0.493s, Critical Path: 0.01s
- Benchmark the model by running the following command on the desktop:
bazel-bin/TensorFlow/tools/benchmark/benchmark_model --graph=/Users/gulli/graphs/TensorFlow_inception_graph.pb --show_run_order=false --show_time=false --show_memory=false --show_summary=true --show_flops=true

Graph: [/Users/gulli/graphs/TensorFlow_inception_graph.pb]
Input layers: [input:0]
Input shapes: [1,224,224,3]
Input types: [float]
Output layers: [output:0]
Num runs: [1000]
Inter-inference delay (seconds): [-1.0]
Inter-benchmark delay (seconds): [-1.0]
Num threads: [-1]
Benchmark name: []
Output prefix: []
Show sizes: [0]
Warmup runs: [2]
Loading TensorFlow.
Got config, 0 devices
Running benchmark for max 2 iterations, max -1 seconds without detailed stat logging, with -1s sleep between inferences
count=2 first=279182 curr=41827 min=41827 max=279182 avg=160504 std=118677
Running benchmark for max 1000 iterations, max 10 seconds without detailed stat logging, with -1s sleep between inferences
count=259 first=39945 curr=44189 min=36539 max=51743 avg=38651.1 std=1886
Running benchmark for max 1000 iterations, max 10 seconds with detailed stat logging, with -1s sleep between inferences
count=241 first=40794 curr=39178 min=37634 max=153345 avg=41644.8 std=8092
Average inference timings in us: Warmup: 160504, no stats: 38651, with stats: 41644
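Each `count=...` line above is a flat sequence of key=value pairs, so it is easy to post-process. Here is a minimal sketch; the `parse_benchmark_line` helper is ours, not part of TensorFlow's tooling:

```python
import re

def parse_benchmark_line(line):
    """Parse one 'count=... avg=...' stats line emitted by
    benchmark_model into a dict of floats (timings are microseconds)."""
    return {key: float(value)
            for key, value in re.findall(r"(\w+)=([\d.]+)", line)}

stats = parse_benchmark_line(
    "count=259 first=39945 curr=44189 min=36539 max=51743 avg=38651.1 std=1886")
print(stats["avg"])  # average inference time in microseconds
```

This makes it straightforward to collect the averages of several runs into a spreadsheet or plot.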
- Benchmark the model on a target Android device with a 64-bit ARM processor by running the following commands. Note that the commands push the initial graph onto the device and then open a shell in which the benchmark can be executed:
bazel build -c opt --config=android_arm64 TensorFlow/tools/benchmark:benchmark_model
adb push bazel-bin/TensorFlow/tools/benchmark/benchmark_model /data/local/tmp
adb push /tmp/TensorFlow_inception_graph.pb /data/local/tmp/
adb push ~gulli/graphs/inception5h/TensorFlow_inception_graph.pb /data/local/tmp/
/Users/gulli/graphs/inception5h/TensorFlow_inception_graph.pb: 1 file pushed. 83.2 MB/s (53884595 bytes in 0.618s)
adb shell
generic_x86:/ $ /data/local/tmp/benchmark_model --graph=/data/local/tmp/TensorFlow_inception_graph.pb --show_run_order=false --show_time=false --show_memory=false --show_summary=true
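As a quick sanity check on the `adb push` output above, the reported 83.2 MB/s follows directly from the byte count and elapsed time that adb prints (adb measures in mebibytes per second):

```python
# Figures taken from the 'adb push' line above.
bytes_pushed = 53884595
seconds = 0.618

# adb's "MB/s" is bytes per second divided by 1024*1024.
rate_mib_s = bytes_pushed / seconds / (1024 * 1024)
print(round(rate_mib_s, 1))  # 83.2
```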
How it works...
As expected, the model spends a significant amount of time in Conv2D operations; overall, this accounts for about 77.5% of the average time on my desktop. If you run this on a mobile device, it is critical to keep an eye on the time spent in each layer of the neural network and make sure it stays under control. Another aspect to consider is the memory footprint; in this case, the desktop execution uses about 10 MB.
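The 77.5% figure is simply Conv2D time divided by total time across the per-op summary. A toy illustration of that arithmetic follows; the per-op timings below are invented for the example (chosen to sum to roughly the benchmarked average), not the real profiler output:

```python
# Hypothetical per-op timings (microseconds) in the spirit of the
# benchmark_model summary table; illustrative numbers only.
op_times_us = {"Conv2D": 29950, "MatMul": 4200, "MaxPool": 1900,
               "BiasAdd": 1400, "Relu": 1200}

total_us = sum(op_times_us.values())
conv_share = op_times_us["Conv2D"] / total_us
print(f"Conv2D share: {conv_share:.1%}")  # Conv2D share: 77.5%
```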
Transforming TensorFlow graphs for mobile devices
In this recipe, we will learn how to transform a TensorFlow graph so that all training-only nodes are removed. This reduces the size of the graph, making it more suitable for mobile devices.
What is a graph transform tool? According to https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/graph_transforms/README.md “When you have finished training a model and want to deploy it in production, you’ll often want to modify it to better run in its final environment. For example if you’re targeting a phone you might want to shrink the file size by quantizing the weights, or optimize away batch normalization or other training-only features. The Graph Transform framework offers a suite of tools for modifying computational graphs, and a framework to make it easy to write your own modifications”.
Getting ready
We will use Bazel to build the various components of TensorFlow, so the first step is to make sure that both Bazel and TensorFlow are installed.
How to do it...
Here is how we transform the graph. First, install Bazel via Homebrew:
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew install bazel
bazel version
brew upgrade bazel
- Clone the TensorFlow distribution from GitHub:
git clone https://github.com/TensorFlow/TensorFlow.git
- Build the graph transform tool that summarizes the graph itself:
bazel run TensorFlow/tools/graph_transforms:summarize_graph -- --in_graph=/Users/gulli/graphs/inception5h/TensorFlow_inception_graph.pb

WARNING: /Users/gulli/TensorFlow/TensorFlow/core/BUILD:1783:1: in includes attribute of cc_library rule //TensorFlow/core:framework_headers_lib: '../../external/nsync/public' resolves to 'external/nsync/public' not below the relative path of its package 'TensorFlow/core'. This will be an error in the future. Since this rule was created by the macro 'cc_header_only_library', the error might have been caused by the macro implementation in /Users/gulli/TensorFlow/TensorFlow/TensorFlow.bzl:1054:30.
INFO: Found 1 target...
Target //TensorFlow/tools/graph_transforms:summarize_graph up-to-date:
  bazel-bin/TensorFlow/tools/graph_transforms/summarize_graph
INFO: Elapsed time: 0.395s, Critical Path: 0.01s
INFO: Running command line: bazel-bin/TensorFlow/tools/graph_transforms/summarize_graph '--in_graph=/Users/gulli/graphs/inception5h/TensorFlow_inception_graph.pb'
Found 1 possible inputs: (name=input, type=float(1), shape=[])
No variables spotted.
Found 3 possible outputs: (name=output, op=Identity) (name=output1, op=Identity) (name=output2, op=Identity)
Found 13462015 (13.46M) const parameters, 0 (0) variable parameters, and 0 control_edges
370 nodes assigned to device '/cpu:0'
Op types used: 142 Const, 64 BiasAdd, 61 Relu, 59 Conv2D, 13 MaxPool, 9 Concat, 5 Reshape, 5 MatMul, 3 Softmax, 3 Identity, 3 AvgPool, 2 LRN, 1 Placeholder
To use with TensorFlow/tools/benchmark:benchmark_model try these arguments:
bazel run TensorFlow/tools/benchmark:benchmark_model -- --graph=/Users/gulli/graphs/inception5h/TensorFlow_inception_graph.pb --show_flops --input_layer=input --input_layer_type=float --input_layer_shape= --output_layer=output,output1,output2
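The const-parameter count reported by `summarize_graph` also predicts the file size: 13.46M float32 weights at 4 bytes each come to roughly 51 MiB, consistent with the ~53.9 MB `.pb` file pushed via adb in the previous recipe (the extra bytes are the graph structure itself). A quick back-of-the-envelope check:

```python
# 13,462,015 const parameters, taken from the summarize_graph output.
num_params = 13_462_015
bytes_per_float32 = 4

approx_bytes = num_params * bytes_per_float32
print(approx_bytes)                             # 53848060
print(round(approx_bytes / (1024 * 1024), 1))   # 51.4  (MiB)
```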
- Strip out all the nodes used for training, which are not needed when the graph is used for inference on a mobile device:
bazel run TensorFlow/tools/graph_transforms:transform_graph -- --in_graph=/Users/gulli/graphs/inception5h/TensorFlow_inception_graph.pb --out_graph=/tmp/optimized_inception_graph.pb --transforms="strip_unused_nodes fold_constants(ignore_errors=true) fold_batch_norms fold_old_batch_norms"

WARNING: /Users/gulli/TensorFlow/TensorFlow/core/BUILD:1783:1: in includes attribute of cc_library rule //TensorFlow/core:framework_headers_lib: '../../external/nsync/public' resolves to 'external/nsync/public' not below the relative path of its package 'TensorFlow/core'. This will be an error in the future. Since this rule was created by the macro 'cc_header_only_library', the error might have been caused by the macro implementation in /Users/gulli/TensorFlow/TensorFlow/TensorFlow.bzl:1054:30.
INFO: Found 1 target...
Target //TensorFlow/tools/graph_transforms:transform_graph up-to-date:
  bazel-bin/TensorFlow/tools/graph_transforms/transform_graph
INFO: Elapsed time: 0.578s, Critical Path: 0.01s
INFO: Running command line: bazel-bin/TensorFlow/tools/graph_transforms/transform_graph '--in_graph=/Users/gulli/graphs/inception5h/TensorFlow_inception_graph.pb' '--out_graph=/tmp/optimized_inception_graph.pb' '--transforms=strip_unused_nodes fold_constants(ignore_errors=true) fold_batch_norms fold_old_batch_norms'
2017-10-15 22:26:59.357129: I TensorFlow/tools/graph_transforms/transform_graph.cc:264] Applying strip_unused_nodes
2017-10-15 22:26:59.367997: I TensorFlow/tools/graph_transforms/transform_graph.cc:264] Applying fold_constants
2017-10-15 22:26:59.387800: I TensorFlow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.2 AVX AVX2 FMA
2017-10-15 22:26:59.388676: E TensorFlow/tools/graph_transforms/transform_graph.cc:279] fold_constants: Ignoring error Must specify at least one target to fetch or execute.
2017-10-15 22:26:59.388695: I TensorFlow/tools/graph_transforms/transform_graph.cc:264] Applying fold_batch_norms
2017-10-15 22:26:59.388721: I TensorFlow/tools/graph_transforms/transform_graph.cc:264] Applying fold_old_batch_norms
How it works...
To create a lighter model that can be loaded on a device, we used the strip_unused_nodes rule applied by the Graph Transform Tool, which removes all nodes that are not needed. This operation deletes all the operations used for learning,
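The effect of strip_unused_nodes can be sketched as a backward reachability pass from the output nodes: anything not reachable (for example, gradient and optimizer nodes) is dropped. The toy graph below is illustrative only, not a real TensorFlow GraphDef:

```python
def strip_unused(graph, outputs):
    """Keep only the nodes reachable backwards from `outputs`.

    `graph` maps each node name to the list of its input node names.
    """
    keep, stack = set(), list(outputs)
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(graph.get(node, []))  # walk towards the inputs
    return {name: ins for name, ins in graph.items() if name in keep}

# Tiny illustrative graph: training-only nodes hang off 'conv'.
graph = {"input": [], "conv": ["input"], "output": ["conv"],
         "gradients": ["conv"], "optimizer": ["gradients"]}
print(sorted(strip_unused(graph, ["output"])))  # ['conv', 'input', 'output']
```

The real tool works on GraphDef protos and also rewrites dangling input references, but the core idea is this reachability pruning.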