Reputation: 87
I have a hard time to convert a given tensorflow model into a tflite model and then use it. I already posted a question where I described my problem but didn't share the model I was working with, because I am not allowed to. Since I didn't find an answer this way, I tried to convert a public model (ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu).
Here is a colab tutorial from the object detection api. I just run the whole script without changes (its the same model) and downloaded the generated models (with and without metadata). I uploaded them here together with a sample picture from the coco17 train dataset.
I tried to use those models directly in python, but the results feel like garbage.
Here is the code I used, I followed this guide. I changed the indexes for rects, scores and classes because otherwise the results were not in the right format.
#interpreter = tf.lite.Interpreter("original_models/model.tflite")
interpreter = tf.lite.Interpreter("original_models/model_with_metadata.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
size = 640
def draw_rect(image, box):
y_min = int(max(1, (box[0] * size)))
x_min = int(max(1, (box[1] * size)))
y_max = int(min(size, (box[2] * size)))
x_max = int(min(size, (box[3] * size)))
# draw a rectangle on the image
cv2.rectangle(image, (x_min, y_min), (x_max, y_max), (255, 255, 255), 2)
file = "images/000000000034.jpg"
img = cv2.imread(file)
new_img = cv2.resize(img, (size, size))
new_img = cv2.cvtColor(new_img, cv2.COLOR_BGR2RGB)
interpreter.set_tensor(input_details[0]['index'], [new_img.astype("f")])
interpreter.invoke()
rects = interpreter.get_tensor(
output_details[1]['index'])
scores = interpreter.get_tensor(
output_details[0]['index'])
classes = interpreter.get_tensor(
output_details[3]['index'])
for index, score in enumerate(scores[0]):
draw_rect(new_img,rects[0][index])
#print(rects[0][index])
print("scores: ",scores[0][index])
print("class id: ", classes[0][index])
print("______________________________")
cv2.imshow("image", new_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
This leads to the following console output
scores: 0.20041436
class id: 51.0
______________________________
scores: 0.08925027
class id: 34.0
______________________________
scores: 0.079722285
class id: 34.0
______________________________
scores: 0.06676647
class id: 71.0
______________________________
scores: 0.06626186
class id: 15.0
______________________________
scores: 0.059938848
class id: 86.0
______________________________
scores: 0.058229476
class id: 34.0
______________________________
scores: 0.053791136
class id: 37.0
______________________________
scores: 0.053478718
class id: 15.0
______________________________
scores: 0.052847564
class id: 43.0
______________________________
and the resulting image
I tried different images from the orinal training dataset and never got good results. I think the output layer is broken or maybe some postprocessing is missing?
I also tried to use the converting method given from the offical tensorflow documentaion.
import tensorflow as tf
saved_model_dir = 'tf_models/ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8/saved_model/'
# Convert the model
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir) # path to the SavedModel directory
tflite_model = converter.convert()
# Save the model.
with open('model.tflite', 'wb') as f:
f.write(tflite_model)
But when I try to use the model, I get a ValueError: Cannot set tensor: Dimension mismatch. Got 640 but expected 1 for dimension 1 of input 0.
Has anyone an idea what I am doing wrong?
Update: After Farmmakers advice, I tried changing the input dimensions of the model generating by the short script at the end. The shape before was:
[{'name': 'serving_default_input_tensor:0',
'index': 0,
'shape': array([1, 1, 1, 3], dtype=int32),
'shape_signature': array([ 1, -1, -1, 3], dtype=int32),
'dtype': numpy.uint8,
'quantization': (0.0, 0),
'quantization_parameters': {'scales': array([], dtype=float32),
'zero_points': array([], dtype=int32),
'quantized_dimension': 0},
'sparsity_parameters': {}}]
So adding one dimension would not be enough. Therefore I used interpreter.resize_tensor_input(0, [1,640,640,3])
. Now it works to feed an image through the net.
Unfortunately I sill can't make any sense of the output. Here is the print of the output details:
[{'name': 'StatefulPartitionedCall:6',
'index': 473,
'shape': array([ 1, 51150, 4], dtype=int32),
'shape_signature': array([ 1, 51150, 4], dtype=int32),
'dtype': numpy.float32,
'quantization': (0.0, 0),
'quantization_parameters': {'scales': array([], dtype=float32),
'zero_points': array([], dtype=int32),
'quantized_dimension': 0},
'sparsity_parameters': {}},
{'name': 'StatefulPartitionedCall:0',
'index': 2233,
'shape': array([1, 1], dtype=int32),
'shape_signature': array([ 1, -1], dtype=int32),
'dtype': numpy.float32,
'quantization': (0.0, 0),
'quantization_parameters': {'scales': array([], dtype=float32),
'zero_points': array([], dtype=int32),
'quantized_dimension': 0},
'sparsity_parameters': {}},
{'name': 'StatefulPartitionedCall:5',
'index': 2198,
'shape': array([1], dtype=int32),
'shape_signature': array([1], dtype=int32),
'dtype': numpy.float32,
'quantization': (0.0, 0),
'quantization_parameters': {'scales': array([], dtype=float32),
'zero_points': array([], dtype=int32),
'quantized_dimension': 0},
'sparsity_parameters': {}},
{'name': 'StatefulPartitionedCall:7',
'index': 493,
'shape': array([ 1, 51150, 91], dtype=int32),
'shape_signature': array([ 1, 51150, 91], dtype=int32),
'dtype': numpy.float32,
'quantization': (0.0, 0),
'quantization_parameters': {'scales': array([], dtype=float32),
'zero_points': array([], dtype=int32),
'quantized_dimension': 0},
'sparsity_parameters': {}},
{'name': 'StatefulPartitionedCall:1',
'index': 2286,
'shape': array([1, 1, 1], dtype=int32),
'shape_signature': array([ 1, -1, -1], dtype=int32),
'dtype': numpy.float32,
'quantization': (0.0, 0),
'quantization_parameters': {'scales': array([], dtype=float32),
'zero_points': array([], dtype=int32),
'quantized_dimension': 0},
'sparsity_parameters': {}},
{'name': 'StatefulPartitionedCall:2',
'index': 2268,
'shape': array([1, 1], dtype=int32),
'shape_signature': array([ 1, -1], dtype=int32),
'dtype': numpy.float32,
'quantization': (0.0, 0),
'quantization_parameters': {'scales': array([], dtype=float32),
'zero_points': array([], dtype=int32),
'quantized_dimension': 0},
'sparsity_parameters': {}},
{'name': 'StatefulPartitionedCall:4',
'index': 2215,
'shape': array([1, 1], dtype=int32),
'shape_signature': array([ 1, -1], dtype=int32),
'dtype': numpy.float32,
'quantization': (0.0, 0),
'quantization_parameters': {'scales': array([], dtype=float32),
'zero_points': array([], dtype=int32),
'quantized_dimension': 0},
'sparsity_parameters': {}},
{'name': 'StatefulPartitionedCall:3',
'index': 2251,
'shape': array([1, 1, 1], dtype=int32),
'shape_signature': array([ 1, -1, -1], dtype=int32),
'dtype': numpy.float32,
'quantization': (0.0, 0),
'quantization_parameters': {'scales': array([], dtype=float32),
'zero_points': array([], dtype=int32),
'quantized_dimension': 0},
'sparsity_parameters': {}}]
I added the so generated tflite model to the google drive.
Update2: I added a directory to the google drive which contains a notebook that uses the full size model and produces the correct output. If you execute the whole notebook it should produce the following image to your disk.
Upvotes: 0
Views: 1382
Reputation: 31
I followed the exact procedure you show (standard procedure mentioned in tensorflow doc).
First the output returned by the tflite model, unlike described in the official documentation, has a different format (different indexing).
boxes = get_output_tensor(interpreter, 1)
classes = get_output_tensor(interpreter, 3)
scores = get_output_tensor(interpreter, 0)
count = int(get_output_tensor(interpreter, 2))
Second, the number of retuned bounding boxes are always 10, and I can't figure out how to change that to the custom number of objects in my dataset.
Finally, the way I solved it is just by retrieving the bounding boxes using index 1, and filtering them out using the scores. However, the results I get are far from the original model's. Moreover, the tflite model takes way more time than the original model, opposite of what tflite is meant for. Probably, because I run it on my laptop, therefore x86 instruction set (tflite is optimize to run on ARM CPUs instead (mobile, raspberry pi)).
Upvotes: 1
Reputation: 146
For the models from Object Detection APIs to work well with TFLite, you have to convert it to TFLite-friendly graph that has custom op.
You can also try using TensorFlow Lite Model Maker
Upvotes: 1