Jazz Scout

Reputation: 49

Data Augmentation for Object Detection using Deep Learning

I have a question about data augmentation when training a deep neural network for object detection.

I have a quite limited data set (nearly 300 images). I augmented the data by rotating each image through 0-360 degrees in steps of 15 degrees, which gives 24 rotated images from each original and around 7200 images in total. Then I drew a bounding box around the object of interest in each augmented image.

Does this seem like a reasonable approach to augmenting the data?

Best Regards

Upvotes: 1

Views: 4212

Answers (4)

Rodrigo Loza

Reputation: 1248

Even though rotation increases the representational complexity of your images, it might not be enough on its own. You probably need to add other types of augmentation as well.

Color augmentations are useful as long as the results still represent the real distribution of your data.

Spatial augmentations work very well. Keep in mind that most modern systems use a lot of cropping, so that might help.

I have a few scripts that I am trying to turn into a library, which might work for you. Have a look at https://github.com/lozuwa/impy if you would like to.
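To make the cropping idea concrete, here is a minimal sketch of how a bounding box could be carried over into a crop without redrawing it by hand. It assumes boxes are stored as (x_min, y_min, x_max, y_max) in pixels; the function name `crop_box` and the `min_visible` threshold are made up for illustration and do not come from any particular library.

```python
def crop_box(box, crop, min_visible=0.5):
    """Map a box (x_min, y_min, x_max, y_max) into a crop window.

    `crop` uses the same (x_min, y_min, x_max, y_max) convention.
    Returns the box in crop coordinates, or None when less than
    `min_visible` of the original box area remains inside the crop.
    """
    x0, y0, x1, y1 = box
    cx0, cy0, cx1, cy1 = crop

    # Intersect the box with the crop window.
    ix0, iy0 = max(x0, cx0), max(y0, cy0)
    ix1, iy1 = min(x1, cx1), min(y1, cy1)
    if ix1 <= ix0 or iy1 <= iy0:
        return None  # the object is entirely outside the crop

    kept = (ix1 - ix0) * (iy1 - iy0)
    area = (x1 - x0) * (y1 - y0)
    if area <= 0 or kept / area < min_visible:
        return None  # too little of the object is left to be a useful label

    # Shift the surviving part into the crop's coordinate frame.
    return (ix0 - cx0, iy0 - cy0, ix1 - cx0, iy1 - cy0)
```

Whether to drop boxes that are mostly cut off, and at what threshold, depends on whether you want the detector to find partially visible objects.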

Upvotes: 0

kmario23

Reputation: 61355

This is a good approach as long as you don't implicitly change the labels when you rotate. For example, an image containing the digit 6 becomes the digit 9 when rotated by 180 degrees, so you have to pay attention in such scenarios.

You could also apply other geometric transformations, such as scaling and translation.

Another thing you can consider is using a model pre-trained on a dataset such as ImageNet, if your problem domain has some resemblance to the ImageNet data. This will allow you to train deeper models even in your data-scarce situation.
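For detection, the box labels have to move together with the pixels. As a rough sketch, rotation, scaling and translation can all be written as a 2x3 affine matrix and applied to the four corners of a box. The helper names below are hypothetical, boxes are assumed to be (x_min, y_min, x_max, y_max) in pixels, and the sign of the angle has to match whatever convention your image-rotation routine uses.

```python
import math


def rotation_about(cx, cy, degrees):
    """2x3 affine matrix rotating points by `degrees` around (cx, cy).

    With image coordinates (y pointing down) a positive angle appears
    clockwise on screen.
    """
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, cx - c * cx + s * cy],
            [s, c, cy - s * cx - c * cy]]


def scaling(sx, sy):
    """2x3 affine matrix scaling x by `sx` and y by `sy` about the origin."""
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0]]


def translation(dx, dy):
    """2x3 affine matrix shifting points by (dx, dy)."""
    return [[1.0, 0.0, dx], [0.0, 1.0, dy]]


def transform_box(box, m):
    """Apply affine matrix `m` to a box and return the enclosing box.

    The four corners are transformed and the axis-aligned box around
    them is returned.
    """
    x0, y0, x1, y1 = box
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    xs = [m[0][0] * x + m[0][1] * y + m[0][2] for x, y in corners]
    ys = [m[1][0] * x + m[1][1] * y + m[1][2] for x, y in corners]
    return (min(xs), min(ys), max(xs), max(ys))
```

Note that for angles that are not multiples of 90 degrees, the enclosing box of the rotated corners is generally looser than a box drawn tightly around the rotated object, so it is an approximation of what manual labelling would give.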

Upvotes: 0

Sergii Gryshkevych

Reputation: 4159

It seems like you are on the right track. Rotation is usually a very useful transformation for augmenting the training data. I would also suggest trying other transformations such as shift (you most probably want to detect partially present objects), zoom (which makes your model invariant to scale), shear, flip, etc. By combining different transformations you can introduce additional diversity into your training data. A training set of 300 images is very small, so you will definitely need more than one transformation to augment such a tiny training set.
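As an illustration of combining transformations, the sketch below draws a random horizontal flip and a random shift, applies both to a bounding box, and clips the result to the image so that partially present objects keep a valid box. All function names are made up for this example, boxes are assumed to be (x_min, y_min, x_max, y_max) in pixels, and the same parameters would still have to be applied to the image pixels with whatever image library you use.

```python
import random


def hflip_box(box, width):
    """Mirror a box horizontally inside an image of the given width."""
    x0, y0, x1, y1 = box
    return (width - x1, y0, width - x0, y1)


def shift_box(box, dx, dy):
    """Translate a box by (dx, dy) pixels."""
    x0, y0, x1, y1 = box
    return (x0 + dx, y0 + dy, x1 + dx, y1 + dy)


def clip_box(box, width, height):
    """Clip a box to the image; return None if nothing is left."""
    x0, y0, x1, y1 = box
    x0, x1 = max(0, x0), min(width, x1)
    y0, y1 = max(0, y0), min(height, y1)
    if x1 <= x0 or y1 <= y0:
        return None
    return (x0, y0, x1, y1)


def random_flip_shift(box, width, height, max_shift, rng):
    """Pick a random flip and shift; return (params, new_box_or_None).

    `params` records what was drawn so the identical transformation
    can be applied to the image itself.
    """
    flip = rng.random() < 0.5
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    if flip:
        box = hflip_box(box, width)
    box = shift_box(box, dx, dy)
    params = {"flip": flip, "dx": dx, "dy": dy}
    return params, clip_box(box, width, height)
```

Passing a seeded `random.Random` instance as `rng` keeps the augmentation reproducible, which helps when you need to regenerate exactly the same training set.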

Upvotes: 1

lejlot

Reputation: 66805

In order to train a good model you need lots of representative data. Your augmentation is representative only of rotations, so yes, it is a good method if you are concerned about not having enough object rotations. However, it will not help at all with generalization to other objects or transformations.

Upvotes: 2
