Reputation: 849
I have created a simple model in TF to identify cars. It identified the image below as a car:
What I would like to get is the area (or a crop of the area) of the identified car, as follows:
Any ideas whether this is possible with TensorFlow? My current Python code looks like this:
import numpy as np
import tensorflow as tf

# Note: `graph` and `read_tensor_from_image_file` are not shown in this snippet.
file_name = 'mustangTest.png'
input_height = 299
input_width = 299
input_mean = 0
input_std = 255
input_layer = "Mul"
output_layer = "final_result"

# Load the image and preprocess it into the tensor shape the network expects.
t = read_tensor_from_image_file(file_name,
                                input_height=input_height,
                                input_width=input_width,
                                input_mean=input_mean,
                                input_std=input_std)

input_name = "import/" + input_layer
output_name = "import/" + output_layer
input_operation = graph.get_operation_by_name(input_name)
output_operation = graph.get_operation_by_name(output_name)

with tf.Session(graph=graph) as sess:
    results = sess.run(output_operation.outputs[0], {input_operation.outputs[0]: t})
results = np.squeeze(results)

# Indices of the five highest-scoring classes, best first.
top_k = results.argsort()[-5:][::-1]
print("car is " + str(top_k[0]))
Upvotes: 1
Views: 122
Reputation: 19123
Initial note: since you say you "created a simple model" and that it identified the image as a car, I'll assume you're not using an object detection model, but one that does simple classification.
The problem you're trying to solve is different from the one you trained your network to solve.
You have a network that was trained to tell you whether an image you feed to it contains a car. This is a classification problem.
What you want now is the area of the image where the car actually is. This is a much harder problem, because your network no longer needs to output just "I see a car" vs. "I don't see a car". Instead, in the simplest formulation, it must output "I see a car in the rectangle (x,y,w,h)". In another formulation, closer to your desired output, each pixel gets its own classification: "it's a car" or "not a car". These two problems are called object detection and segmentation, respectively.
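To make the two output formats concrete, here is a minimal NumPy sketch. Everything in it is made up for illustration: the tiny blank `image`, the rectangle values, the `mask`, and the helper `mask_to_box` are my own placeholders, not something your current model produces.

```python
import numpy as np

# Hypothetical 6x8 RGB image filled with zeros; it stands in for a real photo.
image = np.zeros((6, 8, 3), dtype=np.uint8)

# Detection-style output: one rectangle (x, y, w, h) in pixel coordinates.
x, y, w, h = 2, 1, 4, 3
crop_from_box = image[y:y + h, x:x + w]  # rows are y, columns are x

# Segmentation-style output: a boolean mask, True where a pixel is "car".
mask = np.zeros((6, 8), dtype=bool)
mask[1:4, 2:6] = True


def mask_to_box(mask):
    """Return the tightest (x, y, w, h) rectangle enclosing all True pixels."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None  # no pixel was labeled as a car
    x0, x1 = cols[0], cols[-1]
    y0, y1 = rows[0], rows[-1]
    return int(x0), int(y0), int(x1 - x0 + 1), int(y1 - y0 + 1)


box_from_mask = mask_to_box(mask)
print(crop_from_box.shape)
print(box_from_mask)
```

Either output is enough to get the crop you are after: a rectangle can be sliced out of the image directly, and a mask can be reduced to a rectangle first.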
There are studies that tackle these problems (one example and another), but my suggestion is to have a look at TensorFlow's Object Detection API, which has pretrained models you might exploit for your use case.
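Once you have a detector, getting the crop is mostly bookkeeping. As far as I know, the Object Detection API's models return boxes normalized to [0, 1] in `[ymin, xmin, ymax, xmax]` order, together with a score per box. The sketch below assumes that convention. The function `crop_best_detection`, the `min_score` threshold, and the `boxes` and `scores` values are hypothetical, so check them against the outputs of the model you actually pick.

```python
import numpy as np


def crop_best_detection(image, boxes, scores, min_score=0.5):
    """Crop the highest-scoring detection, or return None if none is confident.

    boxes:  array of shape (N, 4), assumed normalized [ymin, xmin, ymax, xmax]
    scores: array of shape (N,), one confidence value per box
    """
    if len(scores) == 0:
        return None
    best = int(np.argmax(scores))
    if scores[best] < min_score:
        return None
    height, width = image.shape[:2]
    ymin, xmin, ymax, xmax = boxes[best]
    # Scale the normalized coordinates to pixel indices.
    top, bottom = int(ymin * height), int(ymax * height)
    left, right = int(xmin * width), int(xmax * width)
    return image[top:bottom, left:right]


# Made-up values standing in for a real image and real detector outputs.
image = np.zeros((100, 200, 3), dtype=np.uint8)
boxes = np.array([[0.25, 0.5, 0.75, 1.0],
                  [0.0, 0.0, 0.5, 0.5]])
scores = np.array([0.9, 0.3])

car_crop = crop_best_detection(image, boxes, scores)
print(car_crop.shape)
```

In practice you would also filter by the detected class so that only boxes labeled as cars are considered; I left that out to keep the sketch short.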
Upvotes: 1