Tanvir

Reputation: 885

Coordinate system for Faster/Fast RCNN

I have been training Faster R-CNN on a custom dataset, but I am getting anomalous results. The network's performance deteriorates on both the validation and the training data as the number of training iterations increases, even though the loss keeps decreasing, which is surprising. The objective is to detect leaves.

Below are the outputs at 200 and 165000 iterations, respectively.

[Image: output at 200 iterations]

[Image: output at 165000 iterations]

The thing to note here is that after 165000 iterations the network starts drawing boxes on the background too.

Since the loss keeps decreasing during training, I suspect a fault in the annotations of the training data.

The annotation file I made uses a coordinate system similar to MATLAB's, i.e. (0,0) is the top-left corner of the image, so for each bounding box the top-left corner is (x_min, y_min) and the bottom-right corner is (x_max, y_max). Is this how it is supposed to be? If so, what else could the problem be?
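To make the convention concrete, here is a minimal sketch of the kind of sanity check that could be run over such an annotation file. It assumes each box is a tuple (x_min, y_min, x_max, y_max) in pixels with the origin at the top-left corner; the function name `check_box` and its arguments are hypothetical and not part of any Faster R-CNN code base.

```python
def check_box(box, img_width, img_height):
    """Return a list of problems found in one corner-format box.

    Assumes box = (x_min, y_min, x_max, y_max) with (0, 0) at the
    top-left corner of the image (hypothetical helper for illustration).
    """
    x_min, y_min, x_max, y_max = box
    problems = []
    if x_min >= x_max:
        problems.append("x_min >= x_max")
    if y_min >= y_max:
        problems.append("y_min >= y_max")
    if x_min < 0 or y_min < 0:
        problems.append("negative coordinate")
    if x_max > img_width or y_max > img_height:
        problems.append("box exceeds image bounds")
    return problems
```

An empty list means the box is at least consistent with this convention; it does not prove that the training code expects the same convention.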

Upvotes: 1

Views: 878

Answers (1)

Mike

Reputation: 603

The Faster R-CNN paper encodes the rectangles and the anchors as x_center, y_center, width, and height. I think this also depends on how you chose to encode the anchors. If you used the code from the original publication, though, I think you should re-encode the boxes as described in the paper:

For bounding box regression, we adopt the parameterizations of the 4 coordinates following [5]:

[...]

where x, y, w, and h denote the box's center coordinates and its width and height. Variables x, x_a, and x* are for the predicted box, anchor box, and ground-truth box respectively (likewise for y, w, h).


Source: page 5 of https://arxiv.org/pdf/1506.01497v3
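As a rough illustration of the center-based encoding described in that passage, here is a minimal sketch that converts corner-format boxes to (x_center, y_center, width, height) and computes regression targets of a box relative to an anchor, using offsets normalized by the anchor size and log-scale width and height ratios. The function names are hypothetical, and the sketch assumes continuous coordinates where width = x_max - x_min; some implementations add 1 to the width and height for inclusive pixel coordinates, so check the convention of the code you are actually using.

```python
import math


def corners_to_center(x_min, y_min, x_max, y_max):
    """Convert (x_min, y_min, x_max, y_max) to (x_center, y_center, w, h).

    Assumes width = x_max - x_min (no +1 for inclusive pixel indices).
    """
    w = x_max - x_min
    h = y_max - y_min
    return x_min + 0.5 * w, y_min + 0.5 * h, w, h


def encode_box(box, anchor):
    """Regression targets (t_x, t_y, t_w, t_h) of a box relative to an anchor.

    Both arguments are corner-format tuples with positive width and height.
    """
    x, y, w, h = corners_to_center(*box)
    xa, ya, wa, ha = corners_to_center(*anchor)
    t_x = (x - xa) / wa       # center offset, normalized by anchor width
    t_y = (y - ya) / ha       # center offset, normalized by anchor height
    t_w = math.log(w / wa)    # log-scale width ratio
    t_h = math.log(h / ha)    # log-scale height ratio
    return t_x, t_y, t_w, t_h
```

A box identical to its anchor encodes to all zeros, which is a quick way to check that annotations and anchors use the same corner convention.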

Upvotes: 1
