Reputation: 368
I've been benchmarking TensorFlow models on an Exynos 7420 with benchmark_model. I'd like to speed-test quantization per Pete Warden's blog, but I have not been able to compile benchmark_model with the quantization deps yet, as they break a number of things.
I've followed the guidelines listed in this Stack Overflow thread:
In //tensorflow/tools/benchmark/BUILD, cc_binary:
deps = [
    ":benchmark_model_lib",
    "//tensorflow/contrib/quantization/kernels:quantized_ops",
],
In //tensorflow/contrib/quantization/kernels/BUILD:
deps = [
"//tensorflow/contrib/quantization:cc_array_ops",
"//tensorflow/contrib/quantization:cc_math_ops",
"//tensorflow/contrib/quantization:cc_nn_ops",
#"//tensorflow/core",
#"//tensorflow/core:framework",
#"//tensorflow/core:lib",
#"//tensorflow/core/kernels:concat_lib_hdrs",
#"//tensorflow/core/kernels:conv_ops",
#"//tensorflow/core/kernels:eigen_helpers",
#"//tensorflow/core/kernels:ops_util",
#"//tensorflow/core/kernels:pooling_ops",
"//third_party/eigen3",
"@gemmlowp//:eight_bit_int_gemm",
],
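Putting the two edits together, the intended benchmark rule would look roughly like this (a sketch assembled from the snippets above; target names come from the linked post and are not verified against a current tree):

```python
# tensorflow/tools/benchmark/BUILD (sketch of the modified cc_binary)
cc_binary(
    name = "benchmark_model",
    deps = [
        ":benchmark_model_lib",
        # Pull in the quantized kernels so the quantized ops get registered:
        "//tensorflow/contrib/quantization/kernels:quantized_ops",
    ],
)
```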
Then run:
bazel build -c opt --cxxopt='-std=gnu++11' --crosstool_top=//external:android/crosstool --cpu=armeabi-v7a --host_crosstool_top=@bazel_tools//tools/cpp:toolchain tensorflow/tools/benchmark:benchmark_model --verbose_failures
This (following all other instructions in the linked post) succeeds, except that it fails to link against pthread.
I've tried removing -lpthread from tf_copts() in tensorflow/tensorflow.bzl, and similarly in tensorflow/tools/proto_text/BUILD and tensorflow/cc/BUILD.
def tf_copts():
return (["-fno-exceptions", "-DEIGEN_AVOID_STL_ARRAY"] +
if_cuda(["-DGOOGLE_CUDA=1"]) +
if_android_arm(["-mfpu=neon"]) +
select({"//tensorflow:android": [
"-std=c++11",
"-DMIN_LOG_LEVEL=0",
"-DTF_LEAN_BINARY",
"-O2",
],
"//tensorflow:darwin": [],
"//tensorflow:ios": ["-std=c++11",],
#"//conditions:default": ["-lpthread"]}))
"//conditions:default": []}))
Still getting the link error:
external/androidndk/ndk/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/../lib/gcc/arm-linux-androideabi/4.9/../../../../arm-linux-androideabi/bin/ld: error: cannot find -lpthread
collect2: error: ld returned 1 exit status
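Since the error persists after those edits, it can help to confirm where -lpthread is still being injected. A minimal sketch, assuming you run it from the repo root (note that Android's Bionic libc bundles pthread support, so the NDK ships no separate libpthread to link against):

```shell
# Find remaining -lpthread link flags in BUILD and .bzl files.
# The "--" stops grep from treating -lpthread as an option.
grep -rn --include=BUILD --include="*.bzl" -- "-lpthread" tensorflow/
```

Any hit this prints is a linkopts or copts entry that still reaches the Android link line.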
Any help much appreciated, I'm pretty stuck.
Upvotes: 1
Views: 640
Reputation: 368
Transcribing the GitHub answer from Andrew Harp on the TF team. Thanks!!!
The above changes were all unnecessary. You can get quantization working for benchmark_model (or any target that depends on android_tensorflow_lib) with the following diff:
diff --git a/tensorflow/core/BUILD b/tensorflow/core/BUILD
--- a/tensorflow/core/BUILD
+++ b/tensorflow/core/BUILD
@@ -713,8 +713,11 @@ cc_library(
# binary size (by packaging a reduced operator set) is a concern.
cc_library(
name = "android_tensorflow_lib",
- srcs = if_android([":android_op_registrations_and_gradients"]),
- copts = tf_copts(),
+ srcs = if_android([":android_op_registrations_and_gradients",
+ "//tensorflow/contrib/quantization:android_ops",
+ "//tensorflow/contrib/quantization/kernels:android_ops",
+ "@gemmlowp//:eight_bit_int_gemm_sources"]),
+ copts = tf_copts() + ["-Iexternal/gemmlowp"],
linkopts = ["-lz"],
tags = [
"manual",
Just tested; it works great. Interestingly, quantization produces graphs 1/4 the size, but inference runs 4-5x slower than with unquantized graphs; it seems the quantized ops are still being optimized.
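For anyone reproducing the benchmark, the built binary can then be pushed to a device and run over adb roughly like so (a sketch: the flag names follow the benchmark_model tool of that era, and the graph path and layer names are placeholders for your own model):

```shell
adb push bazel-bin/tensorflow/tools/benchmark/benchmark_model /data/local/tmp
adb push quantized_graph.pb /data/local/tmp
adb shell chmod +x /data/local/tmp/benchmark_model
adb shell /data/local/tmp/benchmark_model \
  --graph=/data/local/tmp/quantized_graph.pb \
  --input_layer="input:0" \
  --input_layer_shape="1,224,224,3" \
  --input_layer_type="float" \
  --output_layer="output:0" \
  --num_runs=50
```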
Upvotes: 2