Reputation: 1250
How do you calculate anchor box offsets for object detection in SSD? As far as I understand, anchor boxes are the boxes in the 8x8 feature map, the 4x4 feature map, or any other feature map in the output layer.
So what are the offsets?
Is it the distance between the centre of the bounding box and the centre of a particular box in, say, a 4x4 feature map?
If I am using a 4x4 feature map as my output, then my output should be of the dimension:
(4x4, n_classes + 4)
where 4 is for my anchor box coordinates. These 4 coordinates can be something like:
(xmin, xmax, ymin, ymax)
These correspond to the top-left and bottom-right corners of the bounding box. So why do we need offsets, and if we do, how do we calculate them?
Any help would be really appreciated!
Upvotes: 1
Views: 2462
Reputation: 10139
We need offsets because that is what the network predicts relative to the default (anchor) boxes. In SSD, every feature map cell has a predefined number of anchor boxes with different scales and aspect ratios; I think in the paper this number is 6.
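To make the idea of predefined boxes per cell concrete, here is a minimal sketch of how default boxes could be laid out on a 4x4 feature map. The function name `default_boxes`, the scale of 0.5, and the three aspect ratios are assumptions chosen for illustration; the width and height formulas follow my reading of the SSD paper, which also adds an extra box for aspect ratio 1 that is left out here.

```python
import math

def default_boxes(fmap_size, scale, aspect_ratios):
    """Return default (anchor) boxes as (cx, cy, w, h), normalized to [0, 1].

    Hypothetical helper for illustration, not taken from any library.
    """
    boxes = []
    for row in range(fmap_size):
        for col in range(fmap_size):
            # Centre of this feature map cell in normalized image coordinates.
            cx = (col + 0.5) / fmap_size
            cy = (row + 0.5) / fmap_size
            for ar in aspect_ratios:
                # Same area for every aspect ratio, different shape.
                w = scale * math.sqrt(ar)
                h = scale / math.sqrt(ar)
                boxes.append((cx, cy, w, h))
    return boxes

# Assumed settings: 4x4 feature map, one scale, three aspect ratios.
anchors = default_boxes(fmap_size=4, scale=0.5, aspect_ratios=[1.0, 2.0, 0.5])
print(len(anchors))  # 4 * 4 cells * 3 boxes per cell = 48
```

With k boxes per cell, the 4x4 output would then have shape (4x4xk, n_classes + 4) rather than (4x4, n_classes + 4), because each anchor box gets its own class scores and its own 4 offsets.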
Because this is a detection problem, we also have ground-truth (GT) bounding boxes. Roughly, we compare the IoU of each anchor box with a GT box, and if it is greater than a threshold, say 0.5, that anchor is matched to the GT box and we predict the box offsets relative to that anchor box.
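As a rough sketch of that matching step and of what the offsets actually are: the centre offsets are the distance between the GT centre and the anchor centre, divided by the anchor size, and the size offsets are the log of the width and height ratios. This follows the parameterization in the SSD paper as I understand it. The helper names and the example boxes are made up for illustration, and many implementations additionally divide the offsets by "variance" constants, which is omitted here.

```python
import math

def to_corners(box):
    """Convert (cx, cy, w, h) to (xmin, ymin, xmax, ymax)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

def iou(a, b):
    """Intersection over union of two (cx, cy, w, h) boxes."""
    ax0, ay0, ax1, ay1 = to_corners(a)
    bx0, by0, bx1, by1 = to_corners(b)
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def encode_offsets(gt, anchor):
    """Regression targets for a GT box relative to its matched anchor."""
    gcx, gcy, gw, gh = gt
    acx, acy, aw, ah = anchor
    return (
        (gcx - acx) / aw,    # centre shift in x, in units of anchor width
        (gcy - acy) / ah,    # centre shift in y, in units of anchor height
        math.log(gw / aw),   # log width ratio
        math.log(gh / ah),   # log height ratio
    )

# Hypothetical example boxes in normalized (cx, cy, w, h) form.
anchor = (0.375, 0.375, 0.5, 0.5)
gt = (0.40, 0.35, 0.6, 0.5)

if iou(gt, anchor) > 0.5:  # threshold of 0.5, as mentioned above
    print(encode_offsets(gt, anchor))
```

At inference time the same formulas are inverted: the predicted offsets are applied to the anchor box to recover the final bounding box, which can then be converted to corner coordinates.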
Upvotes: 1