Dwight Crow

Reputation: 368

Benchmarking Quantization on Android

I've been benchmarking TensorFlow models on an Exynos 7420 with benchmark_model. I'd like to speed-test quantization per Pete Warden's blog, but I haven't been able to compile benchmark_model with the quantization deps yet, as they break a number of things.
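For reference, this is roughly how I run the binary on the device once it builds. The graph path and layer names below are the Inception examples from Pete Warden's blog, not necessarily your model; adjust them for whatever graph you push:

```shell
# Push the compiled benchmark binary to the device
adb push bazel-bin/tensorflow/tools/benchmark/benchmark_model /data/local/tmp
adb shell chmod +x /data/local/tmp/benchmark_model

# Run it against a frozen graph already on the device
adb shell /data/local/tmp/benchmark_model \
  --graph=/data/local/tmp/tensorflow_inception_graph.pb \
  --input_layer="input:0" \
  --input_layer_shape="1,224,224,3" \
  --input_layer_type=float \
  --output_layer="output:0" \
  --num_runs=50
```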

I've followed the guidelines listed in this Stack Overflow thread:

//tensorflow/tools/benchmark/BUILD cc_binary

    deps = [":benchmark_model_lib",
            "//tensorflow/contrib/quantization/kernels:quantized_ops",
            ],

//tensorflow/contrib/quantization/kernels/BUILD:

    deps = [
        "//tensorflow/contrib/quantization:cc_array_ops",
        "//tensorflow/contrib/quantization:cc_math_ops",
        "//tensorflow/contrib/quantization:cc_nn_ops",
        #"//tensorflow/core",
        #"//tensorflow/core:framework",
        #"//tensorflow/core:lib",
        #"//tensorflow/core/kernels:concat_lib_hdrs",
        #"//tensorflow/core/kernels:conv_ops",
        #"//tensorflow/core/kernels:eigen_helpers",
        #"//tensorflow/core/kernels:ops_util",
        #"//tensorflow/core/kernels:pooling_ops",
        "//third_party/eigen3",
        "@gemmlowp//:eight_bit_int_gemm",
    ],

Then run:

    bazel build -c opt --cxxopt='-std=gnu++11' --crosstool_top=//external:android/crosstool --cpu=armeabi-v7a --host_crosstool_top=@bazel_tools//tools/cpp:toolchain tensorflow/tools/benchmark:benchmark_model --verbose_failures

Which (following all the other instructions in the linked post) succeeds, except that it fails to link against pthread.

I've tried removing -lpthread from tf_copts() in tensorflow/tensorflow.bzl, and similarly in tensorflow/tools/proto_text/BUILD and tensorflow/cc/BUILD.

def tf_copts():
  return (["-fno-exceptions", "-DEIGEN_AVOID_STL_ARRAY"] +
          if_cuda(["-DGOOGLE_CUDA=1"]) +
          if_android_arm(["-mfpu=neon"]) +
          select({"//tensorflow:android": [
                    "-std=c++11",
                    "-DMIN_LOG_LEVEL=0",
                    "-DTF_LEAN_BINARY",
                    "-O2",
                  ],
                  "//tensorflow:darwin": [],
                  "//tensorflow:ios": ["-std=c++11",],
                  #"//conditions:default": ["-lpthread"]}))
                  "//conditions:default": []}))

I'm still getting the link error:

external/androidndk/ndk/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/../lib/gcc/arm-linux-androideabi/4.9/../../../../arm-linux-androideabi/bin/ld: error: cannot find -lpthread
collect2: error: ld returned 1 exit status

Any help is much appreciated; I'm pretty stuck.

Env:

Upvotes: 1

Views: 640

Answers (1)

Dwight Crow

Reputation: 368

Transcribing a GitHub answer from Andrew Harp on the TF team. Thanks!

The above changes were all unnecessary. You can get quantization working for benchmark_model (or any target that depends on android_tensorflow_lib) with the following:

  1. git pull --recurse-submodules (to fetch the @gemmlowp libs; a fresh git clone --recursive also works)
  2. Apply the following edit to //tensorflow/core/BUILD:

diff --git a/tensorflow/core/BUILD b/tensorflow/core/BUILD
--- a/tensorflow/core/BUILD
+++ b/tensorflow/core/BUILD
@@ -713,8 +713,11 @@ cc_library(
 # binary size (by packaging a reduced operator set) is a concern.
 cc_library(
     name = "android_tensorflow_lib",
-    srcs = if_android([":android_op_registrations_and_gradients"]),
-    copts = tf_copts(),
+    srcs = if_android([":android_op_registrations_and_gradients",
+                       "//tensorflow/contrib/quantization:android_ops",
+                       "//tensorflow/contrib/quantization/kernels:android_ops",
+                       "@gemmlowp//:eight_bit_int_gemm_sources"]),
+    copts = tf_copts() + ["-Iexternal/gemmlowp"],
     linkopts = ["-lz"],
     tags = [
         "manual",

Just tested; it works great. Interestingly, quantization produces graphs 1/4 the size, but inference runs 4-5x slower than with unquantized graphs - it seems the quantized ops are still being optimized.
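For context on the size numbers: a minimal sketch in plain NumPy (not TensorFlow's actual kernels) of the linear eight-bit scheme these ops use conceptually. Each float32 weight is mapped to a uint8 plus a shared (min, scale) pair, which is where the roughly 1/4 serialized size comes from; the rounding error is bounded by one quantization step.

```python
import numpy as np

def quantize(x):
    # Map floats onto 0..255 using the array's own (min, max) range
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255.0
    q = np.round((x - lo) / scale).astype(np.uint8)
    return q, lo, scale

def dequantize(q, lo, scale):
    # Recover approximate floats from the uint8 codes
    return q.astype(np.float32) * scale + lo

rng = np.random.default_rng(0)
w = rng.standard_normal(1000).astype(np.float32)
q, lo, scale = quantize(w)

print(w.nbytes // q.nbytes)  # 4: uint8 storage is a quarter of float32
print(float(np.abs(dequantize(q, lo, scale) - w).max()) <= scale)  # True
```

The dequantize/requantize conversions at the boundaries of each quantized op are pure overhead on top of the eight-bit GEMM, which is one plausible reason the quantized graph can run slower until the ops are better optimized.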

Upvotes: 2
