Anchor boxes and offsets in SSD object detection

Question

How do you calculate anchor box offsets for object detection in SSD? As far as I understand, anchor boxes are the boxes in an 8x8 feature map, a 4x4 feature map, or any other feature map in the output layer.

So what are the offsets?

Is it the distance between the centre of the bounding box and the centre of a particular box in, say, a 4x4 feature map?

If I am using a 4x4 feature map as my output, then my output should be of the dimension:

(4x4, n_classes + 4)

where the 4 is for my anchor box coordinates. These 4 coordinates can be something like:

(xmin, xmax, ymin, ymax)

These correspond to the top-left and bottom-right corners of the bounding box. So why do we need offsets, and how do we calculate them?

Any help would be really appreciated!
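
For reference, here is a minimal sketch of the offset encoding as it is described in the SSD paper, where each box is expressed relative to an anchor (default box) in centre-size form: the centre shift is divided by the anchor's width and height, and the size is the log of the width and height ratios. The function names (`grid_anchors`, `corners_to_center`, `encode_offsets`, `decode_offsets`) are hypothetical, and the sketch assumes one square anchor per cell and coordinates normalised to [0, 1]. Real SSD models place several anchors of different scales and aspect ratios on each cell, and some implementations also scale the offsets by fixed variance constants, which is left out here.

```python
import math

def grid_anchors(grid_size, scale):
    """One square anchor (cx, cy, w, h) centred on each cell of a grid_size x grid_size map.
    Assumption: a single anchor per cell, coordinates normalised to [0, 1]."""
    step = 1.0 / grid_size
    return [
        ((col + 0.5) * step, (row + 0.5) * step, scale, scale)
        for row in range(grid_size)
        for col in range(grid_size)
    ]

def corners_to_center(box):
    """Convert (xmin, xmax, ymin, ymax), the order used in the question, to (cx, cy, w, h)."""
    xmin, xmax, ymin, ymax = box
    return ((xmin + xmax) / 2, (ymin + ymax) / 2, xmax - xmin, ymax - ymin)

def encode_offsets(gt_box, anchor):
    """Regression target for one anchor: the ground-truth box expressed relative to the anchor.
    Both boxes are (cx, cy, w, h)."""
    g_cx, g_cy, g_w, g_h = gt_box
    a_cx, a_cy, a_w, a_h = anchor
    return (
        (g_cx - a_cx) / a_w,   # centre shift in units of anchor width
        (g_cy - a_cy) / a_h,   # centre shift in units of anchor height
        math.log(g_w / a_w),   # log of the width ratio
        math.log(g_h / a_h),   # log of the height ratio
    )

def decode_offsets(offsets, anchor):
    """Invert encode_offsets: turn predicted offsets back into a (cx, cy, w, h) box."""
    t_cx, t_cy, t_w, t_h = offsets
    a_cx, a_cy, a_w, a_h = anchor
    return (
        a_cx + t_cx * a_w,
        a_cy + t_cy * a_h,
        a_w * math.exp(t_w),
        a_h * math.exp(t_h),
    )

anchors = grid_anchors(4, 0.3)                    # 16 anchors for a 4x4 feature map
gt = corners_to_center((0.1, 0.5, 0.2, 0.6))      # ground-truth box in centre-size form
targets = [encode_offsets(gt, a) for a in anchors]
```

Under these assumptions the 4 values per cell are not absolute corner coordinates. They are corrections to a fixed anchor, so an anchor that already matches the object exactly gets the target (0, 0, 0, 0), and the predicted box is recovered by applying `decode_offsets` to the network output.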
