Reputation: 117
I have used the TensorFlow Object Detection API
(https://github.com/tensorflow/models/tree/master/object_detection)
and ran the tutorial IPython notebook from GitHub in real time.
How can I get the length of a detected bounding box?
I don't know which command to use to calculate the length of the boxes.
Upvotes: 9
Views: 14515
Reputation: 1
The following code runs the detection and returns the box locations, confidence scores, and classes:
(boxes, scores, classes, num_detections) = sess.run(
    [boxes, scores, classes, num_detections],
    feed_dict={image_tensor: image_np_expanded})
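To iterate through the boxes:
for i, b in enumerate(boxes[0]):
Inside the loop, each box is [ymin, xmin, ymax, xmax] in normalized coordinates, so the width and height (as fractions of the image size) are differences, not sums:
    width = boxes[0][i][3] - boxes[0][i][1]
    height = boxes[0][i][2] - boxes[0][i][0]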
You can find more details at https://pythonprogramming.net/detecting-distances-self-driving-car/
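As a minimal sketch of the width and height calculation, assuming each box is [ymin, xmin, ymax, xmax] with values normalized to the range 0 to 1 and that you know the image size in pixels. The function name box_size_in_pixels and the example values are made up for illustration; example_boxes stands in for boxes[0] from sess.run.

```python
def box_size_in_pixels(box, image_width, image_height):
    """Return (width, height) in pixels for one normalized box."""
    ymin, xmin, ymax, xmax = box
    width = (xmax - xmin) * image_width
    height = (ymax - ymin) * image_height
    return width, height


# Hypothetical stand-in for boxes[0]: two normalized boxes.
example_boxes = [
    [0.25, 0.5, 0.75, 1.0],
    [0.0, 0.0, 0.5, 0.25],
]

# Assumed image size of 640x480 pixels.
sizes = [box_size_in_pixels(b, image_width=640, image_height=480)
         for b in example_boxes]
for i, (w, h) in enumerate(sizes):
    print(i, w, h)
```

Multiplying by the image size only at the end keeps the box arithmetic independent of the resolution you feed in.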
Upvotes: 0
Reputation: 1081
Just to extend Beta's answer:
You can get the predicted bounding boxes from the detection graph. An example of this is given in the tutorial IPython notebook on GitHub, which is where Beta's code snippet comes from. Access the detection_graph and extract the coordinates of the predicted bounding boxes from the detection_boxes tensor.
Calling np.squeeze(boxes) reshapes them to (m, 4), where m denotes the number of predicted boxes. You can now access the boxes and compute the length, area, or whatever you want.
But remember that the predicted box coordinates are normalized! They are in the following order:
[ymin, xmin, ymax, xmax]
So computing the length in pixels would be something like this (IMG_WIDTH is the width of your input image in pixels):
def length_of_bounding_box(bbox):
    return bbox[3]*IMG_WIDTH - bbox[1]*IMG_WIDTH
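A rough sketch of doing this for all boxes at once with NumPy, assuming the detector output has shape (1, m, 4). The array values and the IMG_WIDTH / IMG_HEIGHT constants are made up for illustration. Note that axis=0 is passed to np.squeeze here as a precaution: without it, a result with a single box (m = 1) would collapse to shape (4,).

```python
import numpy as np

IMG_WIDTH, IMG_HEIGHT = 640, 480  # assumed image size in pixels

# Hypothetical detector output with shape (1, m, 4), here m = 2.
boxes = np.array([[[0.25, 0.5, 0.75, 1.0],
                   [0.0, 0.0, 0.5, 0.25]]])

squeezed = np.squeeze(boxes, axis=0)   # shape (m, 4)
ymin, xmin, ymax, xmax = squeezed.T    # each has shape (m,)

widths = (xmax - xmin) * IMG_WIDTH     # box lengths along x, in pixels
heights = (ymax - ymin) * IMG_HEIGHT   # box lengths along y, in pixels
areas = widths * heights               # box areas in square pixels
print(squeezed.shape, widths, heights, areas)
```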
Upvotes: 12
Reputation: 468
I wrote a full answer on how to find the bounding box coordinates here and thought it might be useful to someone on this thread too.
Google Object Detection API returns bounding boxes in the format [ymin, xmin, ymax, xmax] and in normalised form (full explanation here). To find the (x,y) pixel coordinates we need to multiply the results by width and height of the image. First get the width and height of your image:
width, height = image.size
Then, extract ymin, xmin, ymax, xmax from the boxes object and multiply to get the (x,y) coordinates:
ymin = boxes[0][i][0]*height
xmin = boxes[0][i][1]*width
ymax = boxes[0][i][2]*height
xmax = boxes[0][i][3]*width
Finally print the coordinates of the box corners:
print('Top left')
print((xmin, ymin))
print('Bottom right')
print((xmax, ymax))
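If you need integer pixel corners (for drawing or cropping, say), the steps above could be wrapped up roughly like this. It is only a sketch: the helper name to_pixel_corners is made up, and rounding to the nearest pixel and clamping to the image bounds are my own assumptions, not something the API does for you.

```python
def to_pixel_corners(box, image_width, image_height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to integer pixel corners."""
    ymin, xmin, ymax, xmax = box

    def scale(value, size):
        # Round to the nearest pixel and clamp to the image bounds.
        return max(0, min(size, int(round(value * size))))

    top_left = (scale(xmin, image_width), scale(ymin, image_height))
    bottom_right = (scale(xmax, image_width), scale(ymax, image_height))
    return top_left, bottom_right


# Assumed 640x480 image and a made-up normalized box.
top_left, bottom_right = to_pixel_corners(
    [0.25, 0.5, 0.75, 1.0], image_width=640, image_height=480)
print('Top left', top_left)
print('Bottom right', bottom_right)
```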
Upvotes: 3
Reputation: 9
Basically, you can get all of those from the graph:
image_tensor = graph.get_tensor_by_name('image_tensor:0')
boxes = graph.get_tensor_by_name('detection_boxes:0')
scores = graph.get_tensor_by_name('detection_scores:0')
classes = graph.get_tensor_by_name('detection_classes:0')
num_detections = graph.get_tensor_by_name('num_detections:0')
Once these are evaluated with sess.run, boxes[0] contains all predicted bounding box coordinates in the format [ymin, xmin, ymax, xmax], normalized to the image size, which is what you are looking for.
Check out this repo and you may find more details: https://github.com/KleinYuan/tf-object-detection
Upvotes: 0
Reputation: 1756
You can get the boxes tensor like this:
boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
Do the same for scores and classes.
Then evaluate them in a session run:
(boxes, scores, classes) = sess.run(
    [boxes, scores, classes],
    feed_dict={image_tensor: imageFile})
Upvotes: 1