How do I perform data augmentation in object localization?

Question

Performing data augmentation for a classification task is easy, as most transforms do not change the ground-truth label of the image.

However, in the case of object localization:

The position of the bounding box is relative to the crop that has been taken.
The bounding box may lie only partially inside the crop window. Do we perform some sort of clipping in this case?
The object's bounding box may not be included in the crop at all. Do we discard these examples during training?

I am unable to understand how such cases are handled in object localization. Most papers suggest the use of multi-scale training but don't address these issues.
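
To make the three cases concrete, here is a minimal sketch of one way they could be handled for a crop. It is not taken from any particular paper. It assumes boxes are given as (x_min, y_min, x_max, y_max) in pixels of the original image; the function name `crop_boxes` and the `min_visibility` threshold are hypothetical names chosen for illustration.

```python
def crop_boxes(boxes, crop, min_visibility=0.3):
    """Map bounding boxes from the original image into a crop window.

    boxes: list of (x_min, y_min, x_max, y_max) in original-image pixels.
    crop: (left, top, width, height) of the crop in original-image pixels.
    min_visibility: hypothetical threshold, the fraction of a box's original
        area that must remain inside the crop for the box to be kept.
    """
    left, top, width, height = crop
    kept = []
    for x0, y0, x1, y1 in boxes:
        area = (x1 - x0) * (y1 - y0)
        if area <= 0:
            continue  # skip degenerate input boxes

        # Case 1: express the box relative to the crop origin.
        sx0, sy0, sx1, sy1 = x0 - left, y0 - top, x1 - left, y1 - top

        # Case 2: clip the shifted box to the crop window.
        cx0 = min(max(sx0, 0), width)
        cy0 = min(max(sy0, 0), height)
        cx1 = min(max(sx1, 0), width)
        cy1 = min(max(sy1, 0), height)
        clipped_area = (cx1 - cx0) * (cy1 - cy0)

        # Case 3: drop boxes that are outside the crop or barely visible.
        if clipped_area / area < min_visibility:
            continue
        kept.append((cx0, cy0, cx1, cy1))
    return kept
```

If the returned list is empty, the crop contains no usable object. Whether to skip that sample, resample the crop, or keep it as a background-only example is a design choice that this sketch leaves open.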


Answers (1)
