Luv

Reputation: 273

Anchor boxes vs. bounding boxes in YOLO or Faster R-CNN

I don't understand the difference between an anchor box, a bounding box, and a proposal region; I am confused by these definitions. I also don't understand what role these boxes play in a detection model, since the default sizes never change. Finally, I am confused about what the R-CNN series and the YOLO series actually output: is it the predicted box location (x, y, w, h) directly, or a delta such as (ground_truth_x - predicted_x) / predicted_w?

Upvotes: 11

Views: 6652

Answers (2)

linker

Reputation: 891

Anchor Boxes: predefined reference rectangles. A bounding box is tied to one of them, and the network predicts offsets from that anchor to give the location of a detected object.

Bounding Box: the predicted rectangle for a detected object, expressed relative to an anchor box.

Basically, the idea is comparable to the landmarks used in models like the one in Snapchat's camera. A set of nodes is decided in advance for specific regions of the image, based on how selfie portraits are typically framed. The network learns how to offset those nodes for each face it is given, and a filter or mask is then applied on top.
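To make "offsets relative to an anchor" concrete, here is a minimal sketch. It assumes the parameterization commonly described for the R-CNN family (center shifts scaled by the anchor size, log-scale width and height); the function name `decode_box` and the numbers are only illustrative, and a given model may use a different encoding.

```python
import math

def decode_box(anchor, offsets):
    """Combine an anchor (cx, cy, w, h) with predicted offsets (tx, ty, tw, th).

    Assumed encoding (hypothetical here, check your model's own definition):
      center shift is expressed in units of the anchor's width/height,
      size change is expressed as a log-scale factor.
    """
    ax, ay, aw, ah = anchor
    tx, ty, tw, th = offsets
    cx = ax + tx * aw          # shift center x by a fraction of anchor width
    cy = ay + ty * ah          # shift center y by a fraction of anchor height
    w = aw * math.exp(tw)      # scale anchor width
    h = ah * math.exp(th)      # scale anchor height
    return (cx, cy, w, h)

anchor = (50.0, 50.0, 20.0, 40.0)   # made-up anchor: a tall rectangle

# All-zero offsets reproduce the anchor itself.
unchanged = decode_box(anchor, (0.0, 0.0, 0.0, 0.0))

# Non-zero center offsets move the box; the size stays the same here.
shifted = decode_box(anchor, (0.5, -0.25, 0.0, 0.0))

print(unchanged)
print(shifted)
```

The anchor itself never changes; what the network learns is the small correction that turns it into the final bounding box.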

Upvotes: 4

spl

Reputation: 411

Bounding Boxes: Bounding boxes are the boxes predicted by the network. These predicted boxes are overlaid on the input image so that you can see the position and shape of each rectangle the prediction detected. That is, they are the rectangles you can see in this YouTube video.

Anchor Boxes: We can make some assumptions about the shapes of bounding boxes. For example, if we want to detect humans, we should search for them with tall, vertical rectangles. These are anchor boxes. The anchor boxes are given to the network, before training and prediction, as a flat list of numbers that is read as a series of (width, height) pairs:

anchors = [1.08, 1.19, 3.42, 4.41, 6.63, 11.38, 9.42, 5.11, 16.62, 10.52]

The list above defines 5 anchor boxes. We can feed an arbitrary number of anchor boxes to the network.
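As a quick check of how that flat list is read, the following sketch groups it into (width, height) pairs. The variable name `pairs` is just for illustration.

```python
anchors = [1.08, 1.19, 3.42, 4.41, 6.63, 11.38, 9.42, 5.11, 16.62, 10.52]

# Even positions are widths, odd positions are heights.
pairs = list(zip(anchors[0::2], anchors[1::2]))

for width, height in pairs:
    print(f"anchor: width={width}, height={height}")

print(len(pairs), "anchor boxes")
```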

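These values are determined from the training data with a statistical procedure. YOLOv2, for example, runs k-means clustering on the widths and heights of the ground-truth boxes in the training set.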
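One such procedure, described for YOLOv2, is k-means clustering of the training boxes' (width, height) pairs, using 1 - IoU as the distance. The sketch below is a simplified pure-Python version under that assumption, not the original implementation; the function names, the toy boxes, and the fixed seed are all made up for illustration.

```python
import random

def iou_wh(box, centroid):
    """IoU of two boxes given only as (w, h), assuming they share a corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iterations=50, seed=0):
    """Cluster (w, h) pairs; highest IoU means closest centroid."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)      # start from k distinct training boxes
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_centroids.append((mean_w, mean_h))
            else:
                new_centroids.append(centroids[i])   # keep an empty cluster's centroid
        if new_centroids == centroids:    # assignments stopped changing
            break
        centroids = new_centroids
    return sorted(centroids)

# Toy data: three tall-ish boxes and three wide boxes (hypothetical values).
toy_boxes = [(1.0, 2.0), (1.2, 2.2), (0.8, 1.8),
             (5.0, 3.0), (5.2, 3.2), (4.8, 2.8)]
found = kmeans_anchors(toy_boxes, k=2)
print(found)
```

On this toy set the two centroids settle near the two obvious groups, which is the sense in which anchors summarize the box shapes that actually occur in the training data.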

Upvotes: 11
