TensorFlow 1.x Deep Learning Cookbook: 6-10 (3) https://developer.aliyun.com/article/1426772
How to do it...
We proceed with the analysis as follows. First, install Bazel via Homebrew:
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew install bazel
bazel version
brew upgrade bazel
- Clone the TensorFlow distribution from GitHub:
git clone https://github.com/TensorFlow/TensorFlow.git
- Build the benchmarking tool, which profiles the graph itself:
cd ~/TensorFlow/
bazel build -c opt TensorFlow/tools/benchmark:benchmark_model

INFO: Found 1 target...
Target //TensorFlow/tools/benchmark:benchmark_model up-to-date:
  bazel-bin/TensorFlow/tools/benchmark/benchmark_model
INFO: Elapsed time: 0.493s, Critical Path: 0.01s
- Benchmark the model by running the following command on the desktop:
bazel-bin/TensorFlow/tools/benchmark/benchmark_model --graph=/Users/gulli/graphs/TensorFlow_inception_graph.pb --show_run_order=false --show_time=false --show_memory=false --show_summary=true --show_flops=true

Graph: [/Users/gulli/graphs/TensorFlow_inception_graph.pb]
Input layers: [input:0]
Input shapes: [1,224,224,3]
Input types: [float]
Output layers: [output:0]
Num runs: [1000]
Inter-inference delay (seconds): [-1.0]
Inter-benchmark delay (seconds): [-1.0]
Num threads: [-1]
Benchmark name: []
Output prefix: []
Show sizes: [0]
Warmup runs: [2]
Loading TensorFlow.
Got config, 0 devices
Running benchmark for max 2 iterations, max -1 seconds without detailed stat logging, with -1s sleep between inferences
count=2 first=279182 curr=41827 min=41827 max=279182 avg=160504 std=118677
Running benchmark for max 1000 iterations, max 10 seconds without detailed stat logging, with -1s sleep between inferences
count=259 first=39945 curr=44189 min=36539 max=51743 avg=38651.1 std=1886
Running benchmark for max 1000 iterations, max 10 seconds with detailed stat logging, with -1s sleep between inferences
count=241 first=40794 curr=39178 min=37634 max=153345 avg=41644.8 std=8092
Average inference timings in us: Warmup: 160504, no stats: 38651, with stats: 41644
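Each `count=...` line above is a flat sequence of key=value pairs, so it is easy to post-process. Here is a minimal sketch; the `parse_benchmark_line` helper is ours, not part of TensorFlow's tooling:

```python
import re

def parse_benchmark_line(line):
    """Parse one 'count=... avg=...' stats line emitted by
    benchmark_model into a dict of floats (timings are microseconds)."""
    return {key: float(value)
            for key, value in re.findall(r"(\w+)=([\d.]+)", line)}

stats = parse_benchmark_line(
    "count=259 first=39945 curr=44189 min=36539 max=51743 avg=38651.1 std=1886")
print(stats["avg"])  # average inference time in microseconds
```

This makes it straightforward to collect the averages of several runs into a spreadsheet or plot.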
- Benchmark the model on a target Android device with a 64-bit ARM processor by running the following commands. Note that the commands push the initial graph onto the device and then open a shell in which the benchmark can be executed:
bazel build -c opt --config=android_arm64 TensorFlow/tools/benchmark:benchmark_model
adb push bazel-bin/TensorFlow/tools/benchmark/benchmark_model /data/local/tmp
adb push /tmp/TensorFlow_inception_graph.pb /data/local/tmp/
adb push ~gulli/graphs/inception5h/TensorFlow_inception_graph.pb /data/local/tmp/
/Users/gulli/graphs/inception5h/TensorFlow_inception_graph.pb: 1 file pushed. 83.2 MB/s (53884595 bytes in 0.618s)
adb shell
generic_x86:/ $ /data/local/tmp/benchmark_model --graph=/data/local/tmp/TensorFlow_inception_graph.pb --show_run_order=false --show_time=false --show_memory=false --show_summary=true
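As a quick sanity check on the `adb push` output above, the reported 83.2 MB/s follows directly from the byte count and elapsed time that adb prints (adb measures in mebibytes per second):

```python
# Figures taken from the 'adb push' line above.
bytes_pushed = 53884595
seconds = 0.618

# adb's "MB/s" is bytes per second divided by 1024*1024.
rate_mib_s = bytes_pushed / seconds / (1024 * 1024)
print(round(rate_mib_s, 1))  # 83.2
```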
How it works...
As expected, the model spends a significant amount of time in Conv2D operations; overall, this accounts for about 77.5% of the average time on my desktop. If you run this on a mobile device, it is critical to keep an eye on the time spent in each layer of the neural network and make sure it stays under control. Another aspect to consider is the memory footprint; in this case, the desktop execution uses about 10 MB.
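The 77.5% figure is simply Conv2D time divided by total time across the per-op summary. A toy illustration of that arithmetic follows; the per-op timings below are invented for the example (chosen to sum to roughly the benchmarked average), not the real profiler output:

```python
# Hypothetical per-op timings (microseconds) in the spirit of the
# benchmark_model summary table; illustrative numbers only.
op_times_us = {"Conv2D": 29950, "MatMul": 4200, "MaxPool": 1900,
               "BiasAdd": 1400, "Relu": 1200}

total_us = sum(op_times_us.values())
conv_share = op_times_us["Conv2D"] / total_us
print(f"Conv2D share: {conv_share:.1%}")  # Conv2D share: 77.5%
```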
Transforming TensorFlow graphs for mobile devices
In this recipe, we will learn how to transform a TensorFlow graph so that all training-only nodes are removed. This reduces the size of the graph, making it more suitable for mobile devices.
What is a graph transform tool? According to https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/graph_transforms/README.md “When you have finished training a model and want to deploy it in production, you’ll often want to modify it to better run in its final environment. For example if you’re targeting a phone you might want to shrink the file size by quantizing the weights, or optimize away batch normalization or other training-only features. The Graph Transform framework offers a suite of tools for modifying computational graphs, and a framework to make it easy to write your own modifications”.
Getting ready
We will use Bazel to build the various components of TensorFlow, so the first step is to make sure that both Bazel and TensorFlow are installed.
How to do it...
Here is how we transform the graph. First, install Bazel via Homebrew:
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew install bazel
bazel version
brew upgrade bazel
- Clone the TensorFlow distribution from GitHub:
git clone https://github.com/TensorFlow/TensorFlow.git
- Build the graph transform tool that summarizes the graph itself:
bazel run TensorFlow/tools/graph_transforms:summarize_graph -- --in_graph=/Users/gulli/graphs/inception5h/TensorFlow_inception_graph.pb

WARNING: /Users/gulli/TensorFlow/TensorFlow/core/BUILD:1783:1: in includes attribute of cc_library rule //TensorFlow/core:framework_headers_lib: '../../external/nsync/public' resolves to 'external/nsync/public' not below the relative path of its package 'TensorFlow/core'. This will be an error in the future. Since this rule was created by the macro 'cc_header_only_library', the error might have been caused by the macro implementation in /Users/gulli/TensorFlow/TensorFlow/TensorFlow.bzl:1054:30.
INFO: Found 1 target...
Target //TensorFlow/tools/graph_transforms:summarize_graph up-to-date:
  bazel-bin/TensorFlow/tools/graph_transforms/summarize_graph
INFO: Elapsed time: 0.395s, Critical Path: 0.01s
INFO: Running command line: bazel-bin/TensorFlow/tools/graph_transforms/summarize_graph '--in_graph=/Users/gulli/graphs/inception5h/TensorFlow_inception_graph.pb'
Found 1 possible inputs: (name=input, type=float(1), shape=[])
No variables spotted.
Found 3 possible outputs: (name=output, op=Identity) (name=output1, op=Identity) (name=output2, op=Identity)
Found 13462015 (13.46M) const parameters, 0 (0) variable parameters, and 0 control_edges
370 nodes assigned to device '/cpu:0'
Op types used: 142 Const, 64 BiasAdd, 61 Relu, 59 Conv2D, 13 MaxPool, 9 Concat, 5 Reshape, 5 MatMul, 3 Softmax, 3 Identity, 3 AvgPool, 2 LRN, 1 Placeholder
To use with TensorFlow/tools/benchmark:benchmark_model try these arguments:
bazel run TensorFlow/tools/benchmark:benchmark_model -- --graph=/Users/gulli/graphs/inception5h/TensorFlow_inception_graph.pb --show_flops --input_layer=input --input_layer_type=float --input_layer_shape= --output_layer=output,output1,output2
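The const-parameter count reported by `summarize_graph` also predicts the file size: 13.46M float32 weights at 4 bytes each come to roughly 51 MiB, consistent with the ~53.9 MB `.pb` file pushed via adb in the previous recipe (the extra bytes are the graph structure itself). A quick back-of-the-envelope check:

```python
# 13,462,015 const parameters, taken from the summarize_graph output.
num_params = 13_462_015
bytes_per_float32 = 4

approx_bytes = num_params * bytes_per_float32
print(approx_bytes)                             # 53848060
print(round(approx_bytes / (1024 * 1024), 1))   # 51.4  (MiB)
```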
- Strip out all the nodes used for training, which are not needed when the graph is used for inference on a mobile device:
bazel run TensorFlow/tools/graph_transforms:transform_graph -- --in_graph=/Users/gulli/graphs/inception5h/TensorFlow_inception_graph.pb --out_graph=/tmp/optimized_inception_graph.pb --transforms="strip_unused_nodes fold_constants(ignore_errors=true) fold_batch_norms fold_old_batch_norms"

WARNING: /Users/gulli/TensorFlow/TensorFlow/core/BUILD:1783:1: in includes attribute of cc_library rule //TensorFlow/core:framework_headers_lib: '../../external/nsync/public' resolves to 'external/nsync/public' not below the relative path of its package 'TensorFlow/core'. This will be an error in the future. Since this rule was created by the macro 'cc_header_only_library', the error might have been caused by the macro implementation in /Users/gulli/TensorFlow/TensorFlow/TensorFlow.bzl:1054:30.
INFO: Found 1 target...
Target //TensorFlow/tools/graph_transforms:transform_graph up-to-date:
  bazel-bin/TensorFlow/tools/graph_transforms/transform_graph
INFO: Elapsed time: 0.578s, Critical Path: 0.01s
INFO: Running command line: bazel-bin/TensorFlow/tools/graph_transforms/transform_graph '--in_graph=/Users/gulli/graphs/inception5h/TensorFlow_inception_graph.pb' '--out_graph=/tmp/optimized_inception_graph.pb' '--transforms=strip_unused_nodes fold_constants(ignore_errors=true) fold_batch_norms fold_old_batch_norms'
2017-10-15 22:26:59.357129: I TensorFlow/tools/graph_transforms/transform_graph.cc:264] Applying strip_unused_nodes
2017-10-15 22:26:59.367997: I TensorFlow/tools/graph_transforms/transform_graph.cc:264] Applying fold_constants
2017-10-15 22:26:59.387800: I TensorFlow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.2 AVX AVX2 FMA
2017-10-15 22:26:59.388676: E TensorFlow/tools/graph_transforms/transform_graph.cc:279] fold_constants: Ignoring error Must specify at least one target to fetch or execute.
2017-10-15 22:26:59.388695: I TensorFlow/tools/graph_transforms/transform_graph.cc:264] Applying fold_batch_norms
2017-10-15 22:26:59.388721: I TensorFlow/tools/graph_transforms/transform_graph.cc:264] Applying fold_old_batch_norms
How it works...
To create a lighter model that can be loaded on a device, we used the strip_unused_nodes rule applied by the Graph Transform Tool, which removes all nodes that are not needed. This operation deletes all the operations used for learning,
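The effect of strip_unused_nodes can be sketched as a backward reachability pass from the output nodes: anything not reachable (for example, gradient and optimizer nodes) is dropped. The toy graph below is illustrative only, not a real TensorFlow GraphDef:

```python
def strip_unused(graph, outputs):
    """Keep only the nodes reachable backwards from `outputs`.

    `graph` maps each node name to the list of its input node names.
    """
    keep, stack = set(), list(outputs)
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(graph.get(node, []))  # walk towards the inputs
    return {name: ins for name, ins in graph.items() if name in keep}

# Tiny illustrative graph: training-only nodes hang off 'conv'.
graph = {"input": [], "conv": ["input"], "output": ["conv"],
         "gradients": ["conv"], "optimizer": ["gradients"]}
print(sorted(strip_unused(graph, ["output"])))  # ['conv', 'input', 'output']
```

The real tool works on GraphDef protos and also rewrites dangling input references, but the core idea is this reachability pruning.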