Uzay Macar

Reputation: 264

Can object detection models adapt to rotation?

Even though rotated versions of the images were not included in the training set, the TensorFlow object-detection model still outputs correct bounding-box predictions.

Is this regular behavior for object-detection models? If so, what (part of code or concept) ensures this adaptation?

Upvotes: 0

Views: 2629

Answers (1)

DomJack

Reputation: 4183

The simplest way to simulate rotation invariance is dataset augmentation: each input image is artificially rotated by a different random amount before being passed into the network.

The rotation is generally constrained to some small value - e.g. -30 to 30 degrees - rather than completely random, since images usually have some standard orientation, and space in the scenes is generally not isotropic (i.e. up is different to sideways due to gravity).
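As a minimal sketch of this kind of augmentation (pure NumPy with nearest-neighbour resampling; the helper names `rotate_nearest` and `random_rotation` are my own, not part of TensorFlow's API), rotating each image by a random angle drawn from a constrained range might look like:

```python
import numpy as np

def rotate_nearest(image, deg):
    """Rotate an HxW or HxWxC image about its centre by `deg` degrees,
    using nearest-neighbour sampling and keeping the original shape."""
    h, w = image.shape[:2]
    theta = np.deg2rad(deg)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse-map each output pixel back to its source location.
    x_src = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    y_src = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    xi = np.clip(np.round(x_src).astype(int), 0, w - 1)
    yi = np.clip(np.round(y_src).astype(int), 0, h - 1)
    return image[yi, xi]

def random_rotation(image, max_deg=30.0, rng=None):
    """Apply a random rotation in [-max_deg, max_deg], as in the
    constrained-range augmentation described above."""
    rng = rng or np.random.default_rng()
    return rotate_nearest(image, rng.uniform(-max_deg, max_deg))
```

Note that for object detection the ground-truth bounding boxes must be transformed by the same rotation (or recomputed), not just the pixels; in practice frameworks provide this as part of their augmentation pipelines.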

Note these augmentations don't make the network inherently rotationally invariant. However, if the network learns well it should learn approximate rotational invariance.

Other forms of augmentation include flipping left-right (but not up-down, for the same reason as above), resizing, and photometric variations such as hue/saturation/contrast manipulation. In some cases, some (or all) of these are inappropriate. For example, in handwriting recognition, symbols are inherently asymmetric, so flipping left-right would not be appropriate.
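A left-right flip for detection is a good illustration of why the labels must be augmented along with the pixels: mirroring the image also mirrors the box x-coordinates. A minimal sketch (the `flip_lr` helper name is my own):

```python
import numpy as np

def flip_lr(image, boxes):
    """Horizontally flip an image and mirror its bounding boxes.

    boxes: array of rows [x_min, y_min, x_max, y_max] in pixel units.
    """
    w = image.shape[1]
    flipped = image[:, ::-1]
    new_boxes = boxes.astype(float).copy()
    new_boxes[:, 0] = w - boxes[:, 2]  # new x_min = W - old x_max
    new_boxes[:, 2] = w - boxes[:, 0]  # new x_max = W - old x_min
    return flipped, new_boxes
```

The y-coordinates are untouched, which is exactly why an up-down flip would need a different (and, per the gravity argument above, usually unwanted) transformation.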

Upvotes: 1
