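Introduction to TensorFlow

TensorFlow is a machine learning framework whose overall architecture is split into three main parts: Client, Master, and Worker. This decoupled design makes it highly flexible and easy to deploy on a cluster of machines.

TensorFlow's Code Architecture

The overall architecture of TensorFlow is shown below (image from the official website).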

Client

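The Client is the part that algorithm engineers work with directly. It is available in several languages, including Python, C++, and Java. Its main responsibilities are:

Define the computation as a computation graph. Machine learning frameworks mainly follow one of two programming models: imperative and declarative. The imperative model is the way we usually write programs. The declarative model resembles RxJava: you first build a data pipeline, and data is fed in and processed only when an event triggers it. TensorFlow's graph mode uses the declarative model: algorithm engineers use the Client API to build a computation graph.
Provide the Session interface for executing the computation graph.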
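
To make the "declare first, run later" idea concrete, here is a minimal pure-Python sketch. It is not TensorFlow's API: the `Node`, `placeholder`, `constant`, `add`, `mul`, and `Session` names are hypothetical stand-ins that only mimic the shape of the Client workflow described above.

```python
class Node:
    """One node of a toy computation graph: an op name plus its input nodes."""

    def __init__(self, op, inputs=(), value=None, name=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.value = value  # only used by "const" nodes
        self.name = name    # only used by "placeholder" nodes


def placeholder(name):
    return Node("placeholder", name=name)


def constant(value):
    return Node("const", value=value)


def add(a, b):
    return Node("add", (a, b))


def mul(a, b):
    return Node("mul", (a, b))


class Session:
    """Evaluates a fetched node; nothing is computed before run() is called."""

    def run(self, fetch, feed_dict=None):
        feed = feed_dict or {}

        def evaluate(node):
            if node.op == "placeholder":
                return feed[node.name]
            if node.op == "const":
                return node.value
            args = [evaluate(i) for i in node.inputs]
            if node.op == "add":
                return args[0] + args[1]
            if node.op == "mul":
                return args[0] * args[1]
            raise ValueError("unknown op: " + node.op)

        return evaluate(fetch)


# Step 1: declare the graph y = x * 2 + 1. No arithmetic happens here.
x = placeholder("x")
y = add(mul(x, constant(2)), constant(1))

# Step 2: feed data and execute through the session.
sess = Session()
print(sess.run(y, feed_dict={"x": 3}))  # prints 7
```

In TensorFlow 1.x the corresponding calls are `tf.placeholder`, `tf.constant`, and `tf.Session().run(fetches, feed_dict=...)`; the real `feed_dict` is keyed by tensors rather than by the string names used in this toy version.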

Distributed Master

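Split the computation graph into smaller subgraphs.
Further partition each subgraph into smaller graph pieces so that they can run in parallel in different processes and even on different devices.
Distribute the graph pieces to the Workers.
Trigger each Worker to execute the pieces assigned to it.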
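
The partitioning step can be pictured with a small sketch. This is a toy model under simplifying assumptions, not the Distributed Master's actual algorithm: every node already carries a device assignment, and the graph, node names, and `partition` helper are invented for illustration.

```python
from collections import defaultdict

# A toy graph: node name -> (device, list of input node names).
# The nodes and their device assignments are made up for this sketch.
CPU = "/job:worker/task:0/device:CPU:0"
GPU = "/job:worker/task:1/device:GPU:0"
graph = {
    "x":      (CPU, []),
    "w":      (GPU, []),
    "matmul": (GPU, ["x", "w"]),
    "relu":   (GPU, ["matmul"]),
    "loss":   (CPU, ["relu"]),
}


def partition(graph):
    """Group nodes by device and list the edges that cross devices."""
    pieces = defaultdict(list)
    cross_edges = []
    for name, (device, inputs) in graph.items():
        pieces[device].append(name)
        for src in inputs:
            src_device = graph[src][0]
            if src_device != device:
                # A real runtime would place a send/recv pair on this edge.
                cross_edges.append((src, name))
    return dict(pieces), cross_edges


pieces, cross_edges = partition(graph)
for device, nodes in pieces.items():
    print(device, nodes)
print("cross-device edges:", cross_edges)
```

Each entry of `pieces` corresponds to what one Worker would receive, and each cross-device edge marks a place where Workers have to exchange a tensor.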

Worker Services

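Invoke TensorFlow kernels to execute graph pieces according to the available hardware.
Exchange data with other Workers by sending and receiving computation results.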

Kernel Implementations

Provide fine-grained, self-contained computation units (operations), such as addition, subtraction, and string splitting.
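
One way to picture how a Worker finds the right kernel is a lookup table keyed by operation and device type. The sketch below is a hypothetical Python analogue (`KERNELS`, `register_kernel`, and `run_op` are invented names); in TensorFlow itself, kernels are C++ classes registered per op and device type.

```python
# Registry mapping (op name, device type) to a kernel function.
KERNELS = {}


def register_kernel(op, device):
    """Decorator that records a function as the kernel for (op, device)."""
    def decorator(fn):
        KERNELS[(op, device)] = fn
        return fn
    return decorator


@register_kernel("Add", "CPU")
def add_cpu(a, b):
    return a + b


@register_kernel("Sub", "CPU")
def sub_cpu(a, b):
    return a - b


@register_kernel("StringSplit", "CPU")
def string_split_cpu(text, sep):
    return text.split(sep)


def run_op(op, device, *args):
    """Look up the kernel for this op on this device and call it."""
    try:
        kernel = KERNELS[(op, device)]
    except KeyError:
        raise NotImplementedError("no %s kernel registered for %s" % (device, op))
    return kernel(*args)


print(run_op("Add", "CPU", 2, 3))
print(run_op("StringSplit", "CPU", "a,b,c", ","))
```

Because the table is keyed by device as well as by op, a build that leaves out a kernel simply fails the lookup, which is the situation a trimmed mobile build can run into.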

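TensorFlow on Mobile

Running a model directly on the device has several advantages: it saves bandwidth, responds promptly, keeps working regardless of network quality or connectivity, and is more secure because no data has to be transferred. There is therefore real demand for on-device inference. Running TensorFlow on mobile or other embedded devices involves different priorities from running it in the cloud: lower power consumption, higher speed, and smaller size matter most. For mobile devices there are currently two solutions, TensorFlow Mobile and TensorFlow Lite. TensorFlow Mobile came out earlier and is relatively stable, but it was not heavily optimized for mobile in areas such as performance. It is no longer recommended and is expected to be deprecated in early 2019.
According to the official website, the main differences between TensorFlow Mobile and TensorFlow Lite are:

TensorFlow Lite is the evolution of TensorFlow Mobile. In most cases, TensorFlow Lite has a smaller binary size, fewer dependencies, and better performance.
TensorFlow Lite is still under development, so some features may be missing. The official team has, however, committed to stepping up development.
TensorFlow Lite supports a limited set of ops; TensorFlow Mobile's coverage is broader by comparison.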

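Differences as Seen from the Source Code

That is the official description, but it remains rather vague. What exactly does TensorFlow Mobile strip out, and which ops does it support? How does TensorFlow Lite actually differ in implementation? To answer these questions, we have to read the source code.

Overview of the TensorFlow Source Tree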

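The tensorflow/core directory contains the code of TensorFlow's core modules.
public: header files for the public API, used by external callers, mainly session.h and tensor_c_api.h.
client: implementation files for the API.
platform: OS-specific interfaces, such as file system and env.
protobuf: all .proto files, used to serialize structures for data transfer.
common_runtime: the common runtime, including session, executor, threadpool, rendezvous, memory management, and device placement algorithms.
distributed_runtime: the distributed execution module, such as rpc session, rpc master, rpc worker, and graph manager.
framework: basic building blocks, such as log, memory, and tensor.
graph: computation graph operations, such as construct, partition, optimize, and execute.
kernels: core op implementations, such as matmul, conv2d, argmax, and batch_norm.
lib: common base libraries, such as gif, gtl (Google template library), hash, and histogram.
ops: basic op definitions, gradient ops, IO-related ops, and control flow and data flow operations.

The tensorflow/stream_executor directory is a parallel computing framework developed by Google's StreamExecutor team.
The tensorflow/contrib directory holds contributor-developed code. Its android subdirectory contains the Android version of TensorFlow Mobile, and its lite subdirectory contains the TensorFlow Lite source.
The tensorflow/python directory contains the Python API client scripts.
The tensorflow/tensorboard directory is the visualization and analysis tool; it can visualize models and also monitor how model parameters change.
The third_party directory holds TensorFlow's third-party dependencies.
eigen3: the Eigen matrix library, called by TensorFlow's basic ops.
gpus: wrappers around the CUDA and cuDNN libraries.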

What Does TensorFlow Mobile Strip Out?

TensorFlow is built with Bazel, so we can examine the build files to see where the variants differ.
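
The BUILD files quoted in this article lean heavily on `select()`, which picks a different list of sources or dependencies depending on the build configuration. The following Python sketch is a simplified model of that behavior, assuming a single active configuration; `resolve_select` is a hypothetical helper, and real Bazel additionally handles several matching conditions. The dependency lists are abbreviated from the `c_api` target in tensorflow/c/BUILD.

```python
DEFAULT = "//conditions:default"


def resolve_select(branches, active_config):
    """Return the branch for the active config, else the default branch."""
    if active_config in branches:
        return branches[active_config]
    if DEFAULT in branches:
        return branches[DEFAULT]
    raise KeyError("no branch matches " + repr(active_config))


# Abbreviated from the c_api target in tensorflow/c/BUILD.
c_api_deps = {
    "//tensorflow:android": [
        ":c_api_internal",
        "//tensorflow/core:android_tensorflow_lib_lite",
    ],
    DEFAULT: [
        ":c_api_internal",
        "//tensorflow/core:core_cpu",
        "//tensorflow/core:framework",
        "//tensorflow/core:lib",
    ],
}

print(resolve_select(c_api_deps, "//tensorflow:android"))
print(resolve_select(c_api_deps, "//tensorflow:darwin"))
```

An Android build therefore pulls in the single `android_tensorflow_lib_lite` library where every other platform gets the full set of core libraries.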

TensorFlow's Default Build Configuration

===== /tensorflow/BUILD ===== 
tf_cc_shared_object(
    name = "libtensorflow.so",
    linkopts = select({
        "//tensorflow:darwin": [
            "-Wl,-exported_symbols_list",  # This line must be directly followed by the exported_symbols.lds file
            "$(location //tensorflow/c:exported_symbols.lds)",
            "-Wl,-install_name,@rpath/libtensorflow.so",
        ],
        "//tensorflow:windows": [],
        "//conditions:default": [
            "-z defs",
            "-Wl,--version-script",  #  This line must be directly followed by the version_script.lds file
            "$(location //tensorflow/c:version_script.lds)",
        ],
    }),
    visibility = ["//visibility:public"],
    deps = [
        "//tensorflow/c:c_api",
        "//tensorflow/c:c_api_experimental",
        "//tensorflow/c:exported_symbols.lds",
        "//tensorflow/c:version_script.lds",
        "//tensorflow/c/eager:c_api",
        "//tensorflow/core:tensorflow",
    ],
)

===== /tensorflow/c/BUILD ===== 
tf_cuda_library(
    name = "c_api",
    srcs = [
        "c_api.cc",
        "c_api_function.cc",
    ],
    hdrs = [
        "c_api.h",
    ],
    copts = tf_copts(),
    visibility = ["//visibility:public"],
    deps = select({
        "//tensorflow:android": [
            ":c_api_internal",
            "//tensorflow/core:android_tensorflow_lib_lite",
        ],
        "//conditions:default": [
            ":c_api_internal",
            "//tensorflow/cc/saved_model:loader",
            "//tensorflow/cc:gradients",
            "//tensorflow/cc:ops",
            "//tensorflow/cc:grad_ops",
            "//tensorflow/cc:scope_internal",
            "//tensorflow/cc:while_loop",
            "//tensorflow/core:core_cpu",
            "//tensorflow/core:core_cpu_internal",
            "//tensorflow/core:framework",
            "//tensorflow/core:op_gen_lib",
            "//tensorflow/core:protos_all_cc",
            "//tensorflow/core:lib",
            "//tensorflow/core:lib_internal",
        ],
    }) + select({
        "//tensorflow:with_xla_support": [
            "//tensorflow/compiler/tf2xla:xla_compiler",
            "//tensorflow/compiler/jit",
        ],
        "//conditions:default": [],
    }),
)


tf_cuda_library(
    name = "c_api_experimental",
    srcs = [
        "c_api_experimental.cc",
    ],
    hdrs = [
        "c_api_experimental.h",
    ],
    copts = tf_copts(),
    visibility = ["//visibility:public"],
    deps = [
        ":c_api",
        ":c_api_internal",
        "//tensorflow/c/eager:c_api",
        "//tensorflow/compiler/jit/legacy_flags:mark_for_compilation_pass_flags",
        "//tensorflow/contrib/tpu:all_ops",
        "//tensorflow/core:core_cpu",
        "//tensorflow/core:framework",
        "//tensorflow/core:lib",
        "//tensorflow/core:lib_platform",
        "//tensorflow/core:protos_all_cc",
    ],
)


===== /tensorflow/c/eager/BUILD ===== 
tf_cuda_library(
    name = "c_api",
    srcs = [
        "c_api.cc",
        "c_api_debug.cc",
        "c_api_internal.h",
    ],
    hdrs = ["c_api.h"],
    copts = tf_copts() + tfe_xla_copts(),
    visibility = ["//visibility:public"],
    deps = select({
        "//tensorflow:android": [
            "//tensorflow/core:android_tensorflow_lib_lite",
        ],
        "//conditions:default": [
            "//tensorflow/c:c_api",
            "//tensorflow/c:c_api_internal",
            "//tensorflow/core:core_cpu",
            "//tensorflow/core/common_runtime/eager:attr_builder",
            "//tensorflow/core/common_runtime/eager:context",
            "//tensorflow/core/common_runtime/eager:eager_executor",
            "//tensorflow/core/common_runtime/eager:execute",
            "//tensorflow/core/common_runtime/eager:kernel_and_device",
            "//tensorflow/core/common_runtime/eager:tensor_handle",
            "//tensorflow/core/common_runtime/eager:copy_to_device_node",
            "//tensorflow/core:core_cpu_internal",
            "//tensorflow/core:framework",
            "//tensorflow/core:framework_internal",
            "//tensorflow/core:lib",
            "//tensorflow/core:lib_internal",
            "//tensorflow/core:protos_all_cc",
        ],
    }) + select({
        "//tensorflow:with_xla_support": [
            "//tensorflow/compiler/tf2xla:xla_compiler",
            "//tensorflow/compiler/jit",
            "//tensorflow/compiler/jit:xla_device",
        ],
        "//conditions:default": [],
    }) + [
        "//tensorflow/core/common_runtime/eager:eager_operation",
        "//tensorflow/core/distributed_runtime/eager:eager_client",
        "//tensorflow/core/distributed_runtime/rpc/eager:grpc_eager_client",
        "//tensorflow/core/distributed_runtime/rpc:grpc_channel",
        "//tensorflow/core/distributed_runtime/rpc:grpc_server_lib",
        "//tensorflow/core/distributed_runtime/rpc:grpc_worker_cache",
        "//tensorflow/core/distributed_runtime/rpc:grpc_worker_service",
        "//tensorflow/core/distributed_runtime/rpc:rpc_rendezvous_mgr",
        "//tensorflow/core/distributed_runtime:remote_device",
        "//tensorflow/core/distributed_runtime:server_lib",
        "//tensorflow/core/distributed_runtime:worker_env",
        "//tensorflow/core:gpu_runtime",
    ],
)

===== /tensorflow/core/BUILD ===== 
cc_library(
    name = "tensorflow",
    visibility = ["//visibility:public"],
    deps = [
        ":tensorflow_opensource",
        "//tensorflow/core/platform/default/build_config:tensorflow_platform_specific",
    ],
)


tf_cuda_library(
    name = "tensorflow_opensource",
    copts = tf_copts(),
    visibility = ["//visibility:public"],
    deps = [
        ":all_kernels",
        ":core",
        ":direct_session",
        ":example_parser_configuration",
        ":gpu_runtime",
        ":lib",
    ],
)


cc_library(
    name = "all_kernels",
    visibility = ["//visibility:public"],
    deps = if_dynamic_kernels(
        [],
        otherwise = [":all_kernels_statically_linked"],
    ),
)


# This is a link-only library to provide a DirectSession
# implementation of the Session interface.
tf_cuda_library(
    name = "direct_session",
    copts = tf_copts(),
    linkstatic = 1,
    visibility = ["//visibility:public"],
    deps = [
        ":direct_session_internal",
    ],
    alwayslink = 1,
)

filegroup(
    name = "example_parser_configuration_testdata",
    srcs = [
        "example/testdata/parse_example_graph_def.pbtxt",
    ],
)

cc_library(
    name = "core",
    visibility = ["//visibility:public"],
    deps = [
        ":core_cpu",
        ":gpu_runtime",
        ":sycl_runtime",
    ],
)


cc_library(
    name = "lib",
    hdrs = [
        "lib/bfloat16/bfloat16.h",
        "lib/core/arena.h",
        "lib/core/bitmap.h",
        "lib/core/bits.h",
        "lib/core/casts.h",
        "lib/core/coding.h",
        "lib/core/errors.h",
        "lib/core/notification.h",
        "lib/core/raw_coding.h",
        "lib/core/status.h",
        "lib/core/stringpiece.h",
        "lib/core/threadpool.h",
        "lib/gtl/array_slice.h",
        "lib/gtl/cleanup.h",
        "lib/gtl/compactptrset.h",
        "lib/gtl/flatmap.h",
        "lib/gtl/flatset.h",
        "lib/gtl/inlined_vector.h",
        "lib/gtl/optional.h",
        "lib/gtl/priority_queue_util.h",
        "lib/hash/crc32c.h",
        "lib/hash/hash.h",
        "lib/histogram/histogram.h",
        "lib/io/buffered_inputstream.h",
        "lib/io/compression.h",
        "lib/io/inputstream_interface.h",
        "lib/io/path.h",
        "lib/io/proto_encode_helper.h",
        "lib/io/random_inputstream.h",
        "lib/io/record_reader.h",
        "lib/io/record_writer.h",
        "lib/io/table.h",
        "lib/io/table_builder.h",
        "lib/io/table_options.h",
        "lib/math/math_util.h",
        "lib/monitoring/collected_metrics.h",
        "lib/monitoring/collection_registry.h",
        "lib/monitoring/counter.h",
        "lib/monitoring/gauge.h",
        "lib/monitoring/metric_def.h",
        "lib/monitoring/sampler.h",
        "lib/random/distribution_sampler.h",
        "lib/random/philox_random.h",
        "lib/random/random_distributions.h",
        "lib/random/simple_philox.h",
        "lib/strings/numbers.h",
        "lib/strings/proto_serialization.h",
        "lib/strings/str_util.h",
        "lib/strings/strcat.h",
        "lib/strings/stringprintf.h",
        ":platform_base_hdrs",
        ":platform_env_hdrs",
        ":platform_file_system_hdrs",
        ":platform_other_hdrs",
        ":platform_port_hdrs",
        ":platform_protobuf_hdrs",
    ],
    visibility = ["//visibility:public"],
    deps = [
        ":lib_internal",
        "@com_google_absl//absl/container:inlined_vector",
        "@com_google_absl//absl/strings",
        "@com_google_absl//absl/types:optional",
    ],
)

# This includes implementations of all kernels built into TensorFlow.
cc_library(
    name = "all_kernels_statically_linked",
    visibility = ["//visibility:private"],
    deps = [
        "//tensorflow/core/kernels:array",
        "//tensorflow/core/kernels:audio",
        "//tensorflow/core/kernels:batch_kernels",
        "//tensorflow/core/kernels:bincount_op",
        "//tensorflow/core/kernels:boosted_trees_ops",
        "//tensorflow/core/kernels:candidate_sampler_ops",
        "//tensorflow/core/kernels:checkpoint_ops",
        "//tensorflow/core/kernels:collective_ops",
        "//tensorflow/core/kernels:control_flow_ops",
        "//tensorflow/core/kernels:ctc_ops",
        "//tensorflow/core/kernels:cudnn_rnn_kernels",
        "//tensorflow/core/kernels:data_flow",
        "//tensorflow/core/kernels:dataset_ops",
        "//tensorflow/core/kernels:decode_proto_op",
        "//tensorflow/core/kernels:encode_proto_op",
        "//tensorflow/core/kernels:fake_quant_ops",
        "//tensorflow/core/kernels:function_ops",
        "//tensorflow/core/kernels:functional_ops",
        "//tensorflow/core/kernels:grappler",
        "//tensorflow/core/kernels:histogram_op",
        "//tensorflow/core/kernels:image",
        "//tensorflow/core/kernels:io",
        "//tensorflow/core/kernels:linalg",
        "//tensorflow/core/kernels:list_kernels",
        "//tensorflow/core/kernels:lookup",
        "//tensorflow/core/kernels:logging",
        "//tensorflow/core/kernels:manip",
        "//tensorflow/core/kernels:math",
        "//tensorflow/core/kernels:multinomial_op",
        "//tensorflow/core/kernels:nn",
        "//tensorflow/core/kernels:parameterized_truncated_normal_op",
        "//tensorflow/core/kernels:parsing",
        "//tensorflow/core/kernels:partitioned_function_ops",
        "//tensorflow/core/kernels:random_ops",
        "//tensorflow/core/kernels:random_poisson_op",
        "//tensorflow/core/kernels:remote_fused_graph_ops",
        "//tensorflow/core/kernels:required",
        "//tensorflow/core/kernels:resource_variable_ops",
        "//tensorflow/core/kernels:rpc_op",
        "//tensorflow/core/kernels:scoped_allocator_ops",
        "//tensorflow/core/kernels:sdca_ops",
        "//tensorflow/core/kernels:searchsorted_op",
        "//tensorflow/core/kernels:set_kernels",
        "//tensorflow/core/kernels:sparse",
        "//tensorflow/core/kernels:state",
        "//tensorflow/core/kernels:stateless_random_ops",
        "//tensorflow/core/kernels:string",
        "//tensorflow/core/kernels:summary_kernels",
        "//tensorflow/core/kernels:training_ops",
        "//tensorflow/core/kernels:word2vec_kernels",
    ] + tf_additional_cloud_kernel_deps() + if_not_windows([
        "//tensorflow/core/kernels:fact_op",
        "//tensorflow/core/kernels:array_not_windows",
        "//tensorflow/core/kernels:math_not_windows",
        "//tensorflow/core/kernels:quantized_ops",
        "//tensorflow/core/kernels/neon:neon_depthwise_conv_op",
    ]) + if_mkl([
        "//tensorflow/core/kernels:mkl_concat_op",
        "//tensorflow/core/kernels:mkl_conv_op",
        "//tensorflow/core/kernels:mkl_cwise_ops_common",
        "//tensorflow/core/kernels:mkl_fused_batch_norm_op",
        "//tensorflow/core/kernels:mkl_identity_op",
        "//tensorflow/core/kernels:mkl_input_conversion_op",
        "//tensorflow/core/kernels:mkl_lrn_op",
        "//tensorflow/core/kernels:mkl_pooling_ops",
        "//tensorflow/core/kernels:mkl_relu_op",
        "//tensorflow/core/kernels:mkl_reshape_op",
        "//tensorflow/core/kernels:mkl_slice_op",
        "//tensorflow/core/kernels:mkl_softmax_op",
        "//tensorflow/core/kernels:mkl_transpose_op",
        "//tensorflow/core/kernels:mkl_tfconv_op",
        "//tensorflow/core/kernels:mkl_aggregate_ops",
    ]) + if_cuda([
        "//tensorflow/core/grappler/optimizers:gpu_swapping_kernels",
        "//tensorflow/core/grappler/optimizers:gpu_swapping_ops",
    ]),
)

TensorFlow Mobile's Build Configuration

===== tensorflow/contrib/android/BUILD =====
cc_binary(
    name = "libtensorflow_inference.so",
    srcs = [],
    copts = tf_copts() + [
        "-ffunction-sections",
        "-fdata-sections",
    ],
    linkopts = if_android([
        "-landroid",
        "-latomic",
        "-ldl",
        "-llog",
        "-lm",
        "-z defs",
        "-s",
        "-Wl,--gc-sections",
        "-Wl,--version-script",  # This line must be directly followed by LINKER_SCRIPT.
        "$(location {})".format(LINKER_SCRIPT),
    ]),
    linkshared = 1,
    linkstatic = 1,
    tags = [
        "manual",
        "notap",
    ],
    deps = [
        ":android_tensorflow_inference_jni",
        "//tensorflow/core:android_tensorflow_lib",
        LINKER_SCRIPT,
    ],
)


cc_library(
    name = "android_tensorflow_inference_jni",
    srcs = if_android([":android_tensorflow_inference_jni_srcs"]),
    copts = tf_copts(),
    visibility = ["//visibility:public"],
    deps = [
        "//tensorflow/core:android_tensorflow_lib_lite",
        "//tensorflow/java/src/main/native",
    ],
    alwayslink = 1,
)


===== tensorflow/core/BUILD ===== 
cc_library(
    name = "android_tensorflow_lib",
    srcs = if_android([":android_op_registrations_and_gradients"]),
    copts = tf_copts(),
    tags = [
        "manual",
        "notap",
    ],
    visibility = ["//visibility:public"],
    deps = [
        ":android_tensorflow_lib_lite",
        ":protos_all_cc_impl",
        "//tensorflow/core/kernels:android_tensorflow_kernels",
        "//third_party/eigen3",
        "@protobuf_archive//:protobuf",
    ],
    alwayslink = 1,
)


cc_library(
    name = "android_tensorflow_lib_lite",
    srcs = if_android(["//tensorflow/core:android_srcs"]),
    copts = tf_copts(android_optimization_level_override = None),
    linkopts = ["-lz"],
    tags = [
        "manual",
        "notap",
    ],
    visibility = ["//visibility:public"],
    deps = [
        ":mobile_additional_lib_deps",
        ":protos_all_cc_impl",
        ":stats_calculator_portable",
        "//third_party/eigen3",
        "@double_conversion//:double-conversion",
        "@nsync//:nsync_cpp",
        "@protobuf_archive//:protobuf",
    ],
    alwayslink = 1,
)

alias(
    name = "android_srcs",
    actual = ":mobile_srcs",
    visibility = ["//visibility:public"],
)

filegroup(
    name = "mobile_srcs",
    srcs = [
        ":mobile_srcs_no_runtime",
        ":mobile_srcs_only_runtime",
    ],
    visibility = ["//visibility:public"],
)

# Core sources for Android builds.
filegroup(
    name = "mobile_srcs_no_runtime",
    srcs = [
        ":protos_all_proto_text_srcs",
        ":error_codes_proto_text_srcs",
        "//tensorflow/core/platform/default/build_config:android_srcs",
    ] + glob(
        [
            "client/**/*.cc",
            "framework/**/*.h",
            "framework/**/*.cc",
            "lib/**/*.h",
            "lib/**/*.cc",
            "platform/**/*.h",
            "platform/**/*.cc",
            "public/**/*.h",
            "util/**/*.h",
            "util/**/*.cc",
        ],
        exclude = [
            "**/*test.*",
            "**/*testutil*",
            "**/*testlib*",
            "**/*main.cc",
            "debug/**/*",
            "framework/op_gen_*",
            "lib/jpeg/**/*",
            "lib/png/**/*",
            "lib/gif/**/*",
            "util/events_writer.*",
            "util/stats_calculator.*",
            "util/reporter.*",
            "platform/**/cuda_libdevice_path.*",
            "platform/default/test_benchmark.*",
            "platform/cuda.h",
            "platform/google/**/*",
            "platform/hadoop/**/*",
            "platform/gif.h",
            "platform/jpeg.h",
            "platform/png.h",
            "platform/stream_executor.*",
            "platform/windows/**/*",
            "user_ops/**/*.cu.cc",
            "util/ctc/*.h",
            "util/ctc/*.cc",
            "util/tensor_bundle/*.h",
            "util/tensor_bundle/*.cc",
            "common_runtime/gpu/**/*",
            "common_runtime/eager/*",
            "common_runtime/gpu_device_factory.*",
        ],
    ),
    visibility = ["//visibility:public"],
)

filegroup(
    name = "mobile_srcs_only_runtime",
    srcs = [
        "//tensorflow/core/kernels:android_srcs",
        "//tensorflow/core/util/ctc:android_srcs",
        "//tensorflow/core/util/tensor_bundle:android_srcs",
    ] + glob(
        [
            "common_runtime/**/*.h",
            "common_runtime/**/*.cc",
            "graph/**/*.h",
            "graph/**/*.cc",
        ],
        exclude = [
            "**/*test.*",
            "**/*testutil*",
            "**/*testlib*",
            "**/*main.cc",
            "common_runtime/gpu/**/*",
            "common_runtime/eager/*",
            "common_runtime/gpu_device_factory.*",
            "graph/dot.*",
        ],
    ),
    visibility = ["//visibility:public"],
)

cc_library(
    name = "stats_calculator_portable",
    srcs = [
        "util/stat_summarizer_options.h",
        "util/stats_calculator.cc",
    ],
    hdrs = [
        "util/stats_calculator.h",
    ],
    copts = tf_copts(),
)

cc_library(
    name = "mobile_additional_lib_deps",
    deps = tf_additional_lib_deps() + [
        "@com_google_absl//absl/strings",
    ],
)


===== tensorflow/core/kernels/BUILD ===== 
cc_library(
    name = "android_tensorflow_kernels",
    srcs = select({
        "//tensorflow:android": [
            "//tensorflow/core/kernels:android_core_ops",
            "//tensorflow/core/kernels:android_extended_ops",
        ],
        "//conditions:default": [],
    }),
    copts = tf_copts(),
    linkopts = select({
        "//tensorflow:android": [
            "-ldl",
        ],
        "//conditions:default": [],
    }),
    tags = [
        "manual",
        "notap",
    ],
    visibility = ["//visibility:public"],
    deps = [
        "//tensorflow/core:android_tensorflow_lib_lite",
        "//tensorflow/core:protos_all_cc_impl",
        "//third_party/eigen3",
        "//third_party/fft2d:fft2d_headers",
        "@fft2d",
        "@gemmlowp",
        "@protobuf_archive//:protobuf",
    ],
    alwayslink = 1,
)


# Core kernels we want on Android. Only a subset of kernels to keep
# base library small.
filegroup(
    name = "android_core_ops",
    srcs = [
        "aggregate_ops.cc",
        "aggregate_ops.h",
        "aggregate_ops_cpu.h",
        "assign_op.h",
        "bias_op.cc",
        "bias_op.h",
        "bounds_check.h",
        "cast_op.cc",
        "cast_op.h",
        "cast_op_impl.h",
        "cast_op_impl_bfloat.cc",
        "cast_op_impl_bool.cc",
        "cast_op_impl_complex128.cc",
        "cast_op_impl_complex64.cc",
        "cast_op_impl_double.cc",
        "cast_op_impl_float.cc",
        "cast_op_impl_half.cc",
        "cast_op_impl_int16.cc",
        "cast_op_impl_int32.cc",
        "cast_op_impl_int64.cc",
        "cast_op_impl_int8.cc",
        "cast_op_impl_uint16.cc",
        "cast_op_impl_uint32.cc",
        "cast_op_impl_uint64.cc",
        "cast_op_impl_uint8.cc",
        "concat_lib.h",
        "concat_lib_cpu.cc",
        "concat_lib_cpu.h",
        "concat_op.cc",
        "constant_op.cc",
        "constant_op.h",
        "cwise_ops.h",
        "cwise_ops_common.cc",
        "cwise_ops_common.h",
        "cwise_ops_gradients.h",
        "dense_update_functor.cc",
        "dense_update_functor.h",
        "dense_update_ops.cc",
        "example_parsing_ops.cc",
        "fill_functor.cc",
        "fill_functor.h",
        "function_ops.cc",
        "function_ops.h",
        "gather_functor.h",
        "gather_nd_op.cc",
        "gather_nd_op.h",
        "gather_nd_op_cpu_impl.h",
        "gather_nd_op_cpu_impl_0.cc",
        "gather_nd_op_cpu_impl_1.cc",
        "gather_nd_op_cpu_impl_2.cc",
        "gather_nd_op_cpu_impl_3.cc",
        "gather_nd_op_cpu_impl_4.cc",
        "gather_nd_op_cpu_impl_5.cc",
        "gather_nd_op_cpu_impl_6.cc",
        "gather_nd_op_cpu_impl_7.cc",
        "gather_op.cc",
        "identity_n_op.cc",
        "identity_n_op.h",
        "identity_op.cc",
        "identity_op.h",
        "immutable_constant_op.cc",
        "immutable_constant_op.h",
        "matmul_op.cc",
        "matmul_op.h",
        "no_op.cc",
        "no_op.h",
        "non_max_suppression_op.cc",
        "non_max_suppression_op.h",
        "one_hot_op.cc",
        "one_hot_op.h",
        "ops_util.h",
        "pack_op.cc",
        "pooling_ops_common.h",
        "reshape_op.cc",
        "reshape_op.h",
        "reverse_sequence_op.cc",
        "reverse_sequence_op.h",
        "sendrecv_ops.cc",
        "sendrecv_ops.h",
        "sequence_ops.cc",
        "shape_ops.cc",
        "shape_ops.h",
        "slice_op.cc",
        "slice_op.h",
        "slice_op_cpu_impl.h",
        "slice_op_cpu_impl_1.cc",
        "slice_op_cpu_impl_2.cc",
        "slice_op_cpu_impl_3.cc",
        "slice_op_cpu_impl_4.cc",
        "slice_op_cpu_impl_5.cc",
        "slice_op_cpu_impl_6.cc",
        "slice_op_cpu_impl_7.cc",
        "softmax_op.cc",
        "softmax_op_functor.h",
        "split_lib.h",
        "split_lib_cpu.cc",
        "split_op.cc",
        "split_v_op.cc",
        "strided_slice_op.cc",
        "strided_slice_op.h",
        "strided_slice_op_impl.h",
        "strided_slice_op_inst_0.cc",
        "strided_slice_op_inst_1.cc",
        "strided_slice_op_inst_2.cc",
        "strided_slice_op_inst_3.cc",
        "strided_slice_op_inst_4.cc",
        "strided_slice_op_inst_5.cc",
        "strided_slice_op_inst_6.cc",
        "strided_slice_op_inst_7.cc",
        "unpack_op.cc",
        "variable_ops.cc",
        "variable_ops.h",
    ],
)

# Other kernels we may want on Android.
#
# The kernels can be consumed as a whole or in two groups for
# supporting separate compilation. Note that the split into groups
# is entirely for improving compilation time, and not for
# organizational reasons; you should not depend on any
# of those groups independently.
filegroup(
    name = "android_extended_ops",
    srcs = [
        ":android_extended_ops_group1",
        ":android_extended_ops_group2",
        ":android_quantized_ops",
    ],
    visibility = ["//visibility:public"],
)

filegroup(
    name = "android_extended_ops_headers",
    srcs = [
        "argmax_op.h",
        "avgpooling_op.h",
        "batch_matmul_op_impl.h",
        "batch_norm_op.h",
        "control_flow_ops.h",
        "conv_2d.h",
        "conv_ops.h",
        "data_format_ops.h",
        "depthtospace_op.h",
        "depthwise_conv_op.h",
        "fake_quant_ops_functor.h",
        "fused_batch_norm_op.h",
        "gemm_functors.h",
        "image_resizer_state.h",
        "initializable_lookup_table.h",
        "lookup_table_init_op.h",
        "lookup_table_op.h",
        "lookup_util.h",
        "maxpooling_op.h",
        "mfcc.h",
        "mfcc_dct.h",
        "mfcc_mel_filterbank.h",
        "mirror_pad_op.h",
        "mirror_pad_op_cpu_impl.h",
        "pad_op.h",
        "random_op.h",
        "reduction_ops.h",
        "reduction_ops_common.h",
        "relu_op.h",
        "relu_op_functor.h",
        "reshape_util.h",
        "resize_bilinear_op.h",
        "resize_nearest_neighbor_op.h",
        "reverse_op.h",
        "save_restore_tensor.h",
        "segment_reduction_ops.h",
        "softplus_op.h",
        "softsign_op.h",
        "spacetobatch_functor.h",
        "spacetodepth_op.h",
        "spectrogram.h",
        "string_util.h",
        "tensor_array.h",
        "tile_functor.h",
        "tile_ops_cpu_impl.h",
        "tile_ops_impl.h",
        "topk_op.h",
        "training_op_helpers.h",
        "training_ops.h",
        "transpose_functor.h",
        "transpose_op.h",
        "where_op.h",
        "xent_op.h",
    ],
)

filegroup(
    name = "android_extended_ops_group1",
    srcs = [
        "argmax_op.cc",
        "avgpooling_op.cc",
        "batch_matmul_op_real.cc",
        "batch_norm_op.cc",
        "bcast_ops.cc",
        "check_numerics_op.cc",
        "control_flow_ops.cc",
        "conv_2d.h",
        "conv_grad_filter_ops.cc",
        "conv_grad_input_ops.cc",
        "conv_grad_ops.cc",
        "conv_grad_ops.h",
        "conv_ops.cc",
        "conv_ops_fused.cc",
        "conv_ops_using_gemm.cc",
        "crop_and_resize_op.cc",
        "crop_and_resize_op.h",
        "cwise_op_abs.cc",
        "cwise_op_add_1.cc",
        "cwise_op_add_2.cc",
        "cwise_op_bitwise_and.cc",
        "cwise_op_bitwise_or.cc",
        "cwise_op_bitwise_xor.cc",
        "cwise_op_div.cc",
        "cwise_op_equal_to_1.cc",
        "cwise_op_equal_to_2.cc",
        "cwise_op_not_equal_to_1.cc",
        "cwise_op_not_equal_to_2.cc",
        "cwise_op_exp.cc",
        "cwise_op_floor.cc",
        "cwise_op_floor_div.cc",
        "cwise_op_floor_mod.cc",
        "cwise_op_greater.cc",
        "cwise_op_greater_equal.cc",
        "cwise_op_invert.cc",
        "cwise_op_isfinite.cc",
        "cwise_op_isnan.cc",
        "cwise_op_left_shift.cc",
        "cwise_op_less.cc",
        "cwise_op_less_equal.cc",
        "cwise_op_log.cc",
        "cwise_op_logical_and.cc",
        "cwise_op_logical_not.cc",
        "cwise_op_logical_or.cc",
        "cwise_op_maximum.cc",
        "cwise_op_minimum.cc",
        "cwise_op_mul_1.cc",
        "cwise_op_mul_2.cc",
        "cwise_op_neg.cc",
        "cwise_op_pow.cc",
        "cwise_op_reciprocal.cc",
        "cwise_op_right_shift.cc",
        "cwise_op_round.cc",
        "cwise_op_rsqrt.cc",
        "cwise_op_select.cc",
        "cwise_op_sigmoid.cc",
        "cwise_op_sign.cc",
        "cwise_op_sqrt.cc",
        "cwise_op_square.cc",
        "cwise_op_squared_difference.cc",
        "cwise_op_sub.cc",
        "cwise_op_tanh.cc",
        "cwise_op_xlogy.cc",
        "cwise_op_xdivy.cc",
        "data_format_ops.cc",
        "decode_wav_op.cc",
        "deep_conv2d.cc",
        "deep_conv2d.h",
        "depthwise_conv_op.cc",
        "dynamic_partition_op.cc",
        "encode_wav_op.cc",
        "fake_quant_ops.cc",
        "fifo_queue.cc",
        "fifo_queue_op.cc",
        "fused_batch_norm_op.cc",
        "listdiff_op.cc",
        "population_count_op.cc",
        "population_count_op.h",
        "winograd_transform.h",
        ":android_extended_ops_headers",
    ] + select({
        ":xsmm_convolutions": [
            "xsmm_conv2d.h",
            "xsmm_conv2d.cc",
        ],
        "//conditions:default": [],
    }),
)

filegroup(
    name = "android_extended_ops_group2",
    srcs = [
        "batchtospace_op.cc",
        "ctc_decoder_ops.cc",
        "decode_bmp_op.cc",
        "depthtospace_op.cc",
        "dynamic_stitch_op.cc",
        "in_topk_op.cc",
        "initializable_lookup_table.cc",
        "logging_ops.cc",
        "lookup_table_init_op.cc",
        "lookup_table_op.cc",
        "lookup_util.cc",
        "lrn_op.cc",
        "maxpooling_op.cc",
        "mfcc.cc",
        "mfcc_dct.cc",
        "mfcc_mel_filterbank.cc",
        "mfcc_op.cc",
        "mirror_pad_op.cc",
        "mirror_pad_op_cpu_impl_1.cc",
        "mirror_pad_op_cpu_impl_2.cc",
        "mirror_pad_op_cpu_impl_3.cc",
        "mirror_pad_op_cpu_impl_4.cc",
        "mirror_pad_op_cpu_impl_5.cc",
        "pad_op.cc",
        "padding_fifo_queue.cc",
        "padding_fifo_queue_op.cc",
        "queue_base.cc",
        "queue_op.cc",
        "queue_ops.cc",
        "random_op.cc",
        "reduction_ops_all.cc",
        "reduction_ops_any.cc",
        "reduction_ops_common.cc",
        "reduction_ops_max.cc",
        "reduction_ops_mean.cc",
        "reduction_ops_min.cc",
        "reduction_ops_prod.cc",
        "reduction_ops_sum.cc",
        "relu_op.cc",
        "reshape_util.cc",
        "resize_bilinear_op.cc",
        "resize_nearest_neighbor_op.cc",
        "restore_op.cc",
        "reverse_op.cc",
        "save_op.cc",
        "save_restore_tensor.cc",
        "save_restore_v2_ops.cc",
        "segment_reduction_ops.cc",
        "session_ops.cc",
        "softplus_op.cc",
        "softsign_op.cc",
        "spacetobatch_functor.cc",
        "spacetobatch_op.cc",
        "spacetodepth_op.cc",
        "sparse_fill_empty_rows_op.cc",
        "sparse_reshape_op.cc",
        "sparse_to_dense_op.cc",
        "spectrogram.cc",
        "spectrogram_op.cc",
        "stack_ops.cc",
        "string_join_op.cc",
        "string_util.cc",
        "summary_op.cc",
        "tensor_array.cc",
        "tensor_array_ops.cc",
        "tile_functor_cpu.cc",
        "tile_ops.cc",
        "tile_ops_cpu_impl_1.cc",
        "tile_ops_cpu_impl_2.cc",
        "tile_ops_cpu_impl_3.cc",
        "tile_ops_cpu_impl_4.cc",
        "tile_ops_cpu_impl_5.cc",
        "tile_ops_cpu_impl_6.cc",
        "tile_ops_cpu_impl_7.cc",
        "topk_op.cc",
        "training_op_helpers.cc",
        "training_ops.cc",
        "transpose_functor_cpu.cc",
        "transpose_op.cc",
        "unique_op.cc",
        "where_op.cc",
        "xent_op.cc",
        ":android_extended_ops_headers",
    ],
)

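Through its build configuration, TensorFlow Mobile trims the full TensorFlow code base, keeping the core functionality while dropping code it does not need, for example the distributed execution logic, the Windows compatibility code, and the GPU computation logic.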
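
The trimming is mostly done by the `glob(include, exclude = ...)` calls in the filegroups above. The sketch below approximates that filtering in Python. The patterns are copied from `mobile_srcs_only_runtime`, but the file list is illustrative and `glob_to_regex` is a simplified stand-in for Bazel's matcher, not its actual implementation.

```python
import re


def glob_to_regex(pattern):
    """Translate a Bazel-style glob into a regex (simplified approximation)."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")   # zero or more directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")      # anything except a path separator
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def select_sources(files, include, exclude):
    """Keep files matching any include pattern and no exclude pattern."""
    inc = [glob_to_regex(p) for p in include]
    exc = [glob_to_regex(p) for p in exclude]
    return [
        f for f in files
        if any(r.match(f) for r in inc) and not any(r.match(f) for r in exc)
    ]


# Patterns copied from mobile_srcs_only_runtime; the file list is illustrative.
include = ["common_runtime/**/*.h", "common_runtime/**/*.cc",
           "graph/**/*.h", "graph/**/*.cc"]
exclude = ["**/*test.*", "**/*testutil*", "**/*testlib*", "**/*main.cc",
           "common_runtime/gpu/**/*", "common_runtime/eager/*",
           "common_runtime/gpu_device_factory.*", "graph/dot.*"]
files = [
    "common_runtime/executor.cc",
    "common_runtime/executor_test.cc",
    "common_runtime/gpu/gpu_device.cc",
    "common_runtime/eager/context.cc",
    "graph/graph.cc",
    "graph/dot.cc",
]

print(select_sources(files, include, exclude))
```

Note that distributed_runtime never appears in any include pattern of the mobile filegroups, so the distributed code is left out without needing an explicit exclude.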

Is TensorFlow Mobile's Op Support Complete?

TensorFlow Mobile does not include every op, only the essential core ones; see android_core_ops and android_extended_ops above for details.
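
A practical consequence is that a model should be checked against the ops available in the mobile build before it is shipped. The following sketch shows the idea with plain Python data. Both the supported-op set and the model's node list are hypothetical examples, not an inventory of what TensorFlow Mobile actually registers.

```python
def missing_ops(graph_nodes, supported_ops):
    """Return the sorted op types used by the graph but absent from the runtime."""
    used = {op for _name, op in graph_nodes}
    return sorted(used - set(supported_ops))


# Hypothetical set of ops compiled into a trimmed runtime.
supported_ops = {"Placeholder", "Const", "MatMul", "BiasAdd", "Relu", "Softmax"}

# Hypothetical model, given as (node name, op type) pairs.
model_nodes = [
    ("input", "Placeholder"),
    ("weights", "Const"),
    ("dense", "MatMul"),
    ("activation", "Relu"),
    ("postprocess", "SomeCustomOp"),
]

print(missing_ops(model_nodes, supported_ops))  # prints ['SomeCustomOp']
```

With a real TensorFlow GraphDef, the set of used op types can be collected from the `op` field of each entry in `graph_def.node`.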

How Does TensorFlow Lite Differ in Implementation?

The TensorFlow Lite source lives in the tensorflow/contrib/lite directory. Its core build logic is as follows:

===== tensorflow/contrib/lite/BUILD =====
cc_library(
    name = "framework",
    srcs = [
        "allocation.cc",
        "graph_info.cc",
        "interpreter.cc",
        "model.cc",
        "mutable_op_resolver.cc",
        "optional_debug_tools.cc",
        "stderr_reporter.cc",
    ] + select({
        "//tensorflow:android": [
            "nnapi_delegate.cc",
            "mmap_allocation.cc",
        ],
        "//tensorflow:windows": [
            "nnapi_delegate_disabled.cc",
            "mmap_allocation_disabled.cc",
        ],
        "//conditions:default": [
            "nnapi_delegate_disabled.cc",
            "mmap_allocation.cc",
        ],
    }),
    hdrs = [
        "allocation.h",
        "context.h",
        "context_util.h",
        "error_reporter.h",
        "graph_info.h",
        "interpreter.h",
        "model.h",
        "mutable_op_resolver.h",
        "nnapi_delegate.h",
        "op_resolver.h",
        "optional_debug_tools.h",
        "stderr_reporter.h",
    ],
    copts = tflite_copts(),
    linkopts = [
    ] + select({
        "//tensorflow:android": [
            "-llog",
        ],
        "//conditions:default": [
        ],
    }),
    deps = [
        ":arena_planner",
        ":graph_info",
        ":memory_planner",
        ":schema_fbs_version",
        ":simple_memory_arena",
        ":string",
        ":util",
        "//tensorflow/contrib/lite/c:c_api_internal",
        "//tensorflow/contrib/lite/core/api",
        "//tensorflow/contrib/lite/kernels:eigen_support",
        "//tensorflow/contrib/lite/kernels:gemm_support",
        "//tensorflow/contrib/lite/nnapi:nnapi_lib",
        "//tensorflow/contrib/lite/profiling:profiler",
        "//tensorflow/contrib/lite/schema:schema_fbs",
    ],
)

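Whereas TensorFlow Mobile is a trimmed-down version of the full TensorFlow, TensorFlow Lite is essentially a reimplementation. Internally, even the most basic data structures of the TensorFlow core, such as ops and Context, are new. Externally, the model file format changed from Protocol Buffers (PB) to FlatBuffers, the binary size was greatly reduced, down to about 300 KB, and a converter is provided to turn an ordinary TensorFlow model into the format TensorFlow Lite requires. From every angle, then, TensorFlow Lite is a new implementation.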
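
As a rough illustration of the converter workflow, the sketch below uses the `tf.lite` Python API found in later TensorFlow releases. In releases from the period this article describes, the code lived under the contrib namespace, so the names may differ there. The two helper functions and their parameters are hypothetical wrappers, and TensorFlow is imported lazily so the definitions load even where it is not installed.

```python
def convert_saved_model(saved_model_dir, tflite_path):
    """Convert a SavedModel directory to a .tflite FlatBuffers file."""
    import tensorflow as tf  # imported lazily; requires TensorFlow to be installed

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    tflite_model = converter.convert()  # bytes of the FlatBuffers model
    with open(tflite_path, "wb") as f:
        f.write(tflite_model)
    return tflite_path


def run_tflite(tflite_path, input_array):
    """Run one inference with the TensorFlow Lite interpreter."""
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=tflite_path)
    interpreter.allocate_tensors()
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]
    interpreter.set_tensor(input_index, input_array)
    interpreter.invoke()
    return interpreter.get_tensor(output_index)
```

The input array passed to `run_tflite` must match the shape and dtype reported by `get_input_details()`; the sketch assumes a model with a single input and a single output.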

References

TensorFlow Architecture
TensorFlow Mobile VS TensorFlow Lite
TensorFlow Code Analysis (TensorFlow代码解析)
TensorFlow Lite
