Reputation: 893
I have a quantized MobileNet downloaded from here. This graph contains fake quantization nodes, inserted during training to simulate quantized inference at test time. I want to collect the output of the last pointwise convolutional layer of this network.
The quantized frozen model contains additional fc, softmax, etc. layers that are of no use for my application.
I have the following code for loading the graph.
import tensorflow as tf

def load_graph(frozen_graph_filename):
    # Load the protobuf file from disk and parse it to retrieve the
    # unserialized graph_def.
    with tf.gfile.GFile(frozen_graph_filename, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    # Import the graph_def into a fresh graph; the name argument
    # prefixes every op in the imported graph.
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name="prefix")
    return graph
graph1 = load_graph("./quantized_fake.pb")
input = graph1.get_tensor_by_name('prefix/input:0')
output = graph1.get_tensor_by_name('prefix/MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Conv2D_Fold:0')
I then run this using sess.run(); however, I observe that the output of the convolution layer is float, not quantized to 8 bits as it would be when running on a mobile device.
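For reference, the run looks roughly like this (a minimal sketch; the 1x224x224x3 input shape is assumed from the standard MobileNet input, and the image is random dummy data):

import numpy as np
import tensorflow as tf

# Hypothetical input: one 224x224 RGB image (standard MobileNet shape).
dummy_image = np.random.rand(1, 224, 224, 3).astype(np.float32)

with tf.Session(graph=graph1) as sess:
    conv_out = sess.run(output, feed_dict={input: dummy_image})

# Prints float32 -- not the uint8 I expected for on-device execution.
print(conv_out.dtype, conv_out.shape)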
How can I produce, while running the code on my PC, the same output that would be produced on a mobile device?
Can the .tflite files be used for inference on a PC?
Upvotes: 1
Views: 1146
Reputation: 1088
The TensorFlow fake-quantized graph isn't actually quantized; it has FakeQuantization operations inserted that emulate quantization. These are only converted to fully quantized operations by TensorFlow Lite. This is why running the TensorFlow fake-quantized graph results in float values, not quantized values.
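You can verify this directly: a fake-quant op snaps values onto the 8-bit grid defined by its [min, max] range, but the result stays float32. A minimal sketch (TF 1.x API, with illustrative values):

import tensorflow as tf

# Fake quantization rounds values to the 8-bit grid over [0, 6],
# but the output tensor is still float32.
x = tf.constant([0.1234, 1.5678, 5.4321])
fq = tf.fake_quant_with_min_max_args(x, min=0.0, max=6.0, num_bits=8)

with tf.Session() as sess:
    print(sess.run(fq))  # values snapped to the quantized grid
print(fq.dtype)          # <dtype: 'float32'>, not uint8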
TensorFlow Lite quantization is currently CPU-only, and can be run on the CPU of a PC. Here is an example of how to invoke the TFLite interpreter to run on your PC.
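A minimal sketch of driving the interpreter on a PC (the model file name is a placeholder for your converted model; on older TensorFlow versions the class lives at tf.contrib.lite.Interpreter instead of tf.lite.Interpreter):

import numpy as np
import tensorflow as tf

# Load the fully quantized model produced by the TFLite converter.
interpreter = tf.lite.Interpreter(model_path="mobilenet_quant.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# For a fully quantized model, input and output dtypes are uint8.
image = np.random.randint(0, 256, size=input_details[0]['shape'],
                          dtype=np.uint8)
interpreter.set_tensor(input_details[0]['index'], image)
interpreter.invoke()

result = interpreter.get_tensor(output_details[0]['index'])
print(result.dtype)  # uint8 -- the quantized values seen on device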
Upvotes: 2