How to get the identified area from TensorFlow?

Question

I have created a simple model in TF to identify cars. It identified the image below as a car:

What I would like to have is the area (or a crop of the area) of the actual identified car, as follows:

Any ideas whether this is possible with TensorFlow? My current Python code looks like this:

import numpy as np
import tensorflow as tf

# read_tensor_from_image_file and graph are defined elsewhere (not shown here)
file_name = 'mustangTest.png'
input_height = 299
input_width = 299
input_mean = 0
input_std = 255
input_layer = "Mul"
output_layer = "final_result"

# Load the image and resize/normalize it into the tensor the network expects
t = read_tensor_from_image_file(file_name,
                                input_height=input_height,
                                input_width=input_width,
                                input_mean=input_mean,
                                input_std=input_std)

input_name = "import/" + input_layer
output_name = "import/" + output_layer
input_operation = graph.get_operation_by_name(input_name)
output_operation = graph.get_operation_by_name(output_name)

with tf.Session(graph=graph) as sess:
    results = sess.run(output_operation.outputs[0], {input_operation.outputs[0]: t})
    results = np.squeeze(results)

# Indices of the five highest-scoring classes, best first
top_k = results.argsort()[-5:][::-1]
print("car is " + str(top_k[0]))

GPhilo · Accepted Answer

Initial note: since you say you "created a simple model" and that it identified the image as a car, I'll assume you're not actually using an object detection model, but one that does simple classification.

The problem you're trying to solve is different from the one you trained your network to solve.

You have a network that was trained to tell you whether an image you feed to it contains a car. This is a classification problem.

What you want now is the area where the car actually is in the image. This is a much harder problem, because your network no longer has to output just "I see a car" vs. "I don't see a car" but, in the simplest formulation, "I see a car in the rectangle (x,y,w,h)". In another formulation, closer to your desired output, you would have a classification for each pixel, like "it's a car" or "not a car". These problems are object detection and segmentation, respectively.
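To make the link between the two formulations concrete, here is a minimal NumPy sketch. It assumes you already have a per-pixel "car / not a car" mask from some segmentation model (the mask below is made up by hand) and shows how a rectangle (x,y,w,h) and a crop follow from it. The helper names `mask_to_box` and `crop_box` are hypothetical and not part of TensorFlow.

```python
import numpy as np


def mask_to_box(mask):
    """Return the tightest (x, y, w, h) rectangle around the True pixels
    of a 2-D boolean mask, or None if the mask has no True pixel."""
    rows = np.any(mask, axis=1)  # rows containing at least one "car" pixel
    cols = np.any(mask, axis=0)  # columns containing at least one "car" pixel
    if not rows.any():
        return None
    y_idx = np.where(rows)[0]
    x_idx = np.where(cols)[0]
    y_min, y_max = y_idx[0], y_idx[-1]
    x_min, x_max = x_idx[0], x_idx[-1]
    return (int(x_min), int(y_min),
            int(x_max - x_min + 1), int(y_max - y_min + 1))


def crop_box(image, box):
    """Cut the (x, y, w, h) rectangle out of an H x W x C image array."""
    x, y, w, h = box
    return image[y:y + h, x:x + w]


# Hand-made stand-ins for a real image and a real segmentation output
image = np.arange(6 * 8 * 3).reshape(6, 8, 3)
mask = np.zeros((6, 8), dtype=bool)
mask[2:5, 3:7] = True  # pretend these pixels were classified as "car"

box = mask_to_box(mask)
crop = crop_box(image, box)
print(box, crop.shape)
```

A classification network like the one in the question produces neither the mask nor the box, only a score per class, which is why it cannot be used for this directly.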

There are studies out there that tackle these problems (one example and another), but my suggestion is to have a look at TensorFlow's object detection API, which has pretrained models you might exploit for your use case.
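As a rough sketch of what you would do with such a model's output: the pretrained detection models return, among other things, `detection_boxes` (normalized `[ymin, xmin, ymax, xmax]` coordinates) and `detection_scores`. The snippet below assumes you have already run a model and converted those two outputs to NumPy arrays; the arrays here are made-up values, and both the `crop_best_detection` helper and its `min_score` threshold are hypothetical, not something the API provides.

```python
import numpy as np


def crop_best_detection(image, boxes, scores, min_score=0.5):
    """Crop the highest-scoring detection out of an H x W x C image.

    boxes  : (N, 4) array of normalized [ymin, xmin, ymax, xmax] in [0, 1]
    scores : (N,) array of confidence scores
    Returns None when no detection reaches min_score (an assumed threshold).
    """
    if len(scores) == 0:
        return None
    best = int(np.argmax(scores))
    if scores[best] < min_score:
        return None
    height, width = image.shape[:2]
    ymin, xmin, ymax, xmax = boxes[best]
    # Convert normalized coordinates to pixel indices
    top, bottom = int(ymin * height), int(ymax * height)
    left, right = int(xmin * width), int(xmax * width)
    return image[top:bottom, left:right]


# Made-up values standing in for real model output
image = np.zeros((100, 200, 3), dtype=np.uint8)
boxes = np.array([[0.25, 0.5, 0.75, 1.0],
                  [0.0, 0.0, 0.5, 0.5]])
scores = np.array([0.9, 0.3])

crop = crop_best_detection(image, boxes, scores)
print(crop.shape)
```

In practice you would probably also filter on `detection_classes` so that only boxes labelled as cars are considered before picking the best score.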
