Augment Artificial Images to look like real Images for TF Object Detection

Question

I'm using the Faster R-CNN Inception ResNet v2 model from the Tensorflow Object Detection API to train a CNN for safety sign detection. Since I do not have a real image dataset, I have written a code which creates an artificial dataset consisting of computer drawed sign images and real-world background images from publicly available datasets (GTSDB, KITTI, etc.). While the trained model works great on unseen artificial images, it doesn't work well on real-world test images which I have taken with my iPhone 5SE.

I have already various data augmentation techniques from imgaug (https://imgaug.readthedocs.io/en/latest/index.html) and searched the internet for a solution, but without any success for real-world images.

Also resizing the test images to a size close to the size of the training images and converting them to .png has not any effect.

One of my artificial images, which I use for training (size: 640x480, format: .png):

And one of the real-world test images, which should be used to test the model (size: 3024x4032, format: .JPG):

My idea is that my images are not close enough to the real-world images and, therefore, the classification of the latter does not work. Another idea is, that my dataset doesn't include enough "normal" (not heavily augmented images, which haven't been augmented with a lot of gaussianNoise, rotation, scaling, etc) training images. Any ideas how I could make my training images more real? Any other input is also welcome.

Dmytro Prylipko · Accepted Answer

We did something similar in our company. Generally speaking, this is a bad idea and should be used only when no other way of getting such a data is possible. Indeed, spending a week on annotating real world data will give you much better result.

However, if you wish to follow this approach, here are the hints we found useful:

Try to put your artificial objects smartly, so use background close to its brightness, sharpness etc. so it looks a bit more natural
Put the signs on proper locations, so on walls, not on the road
Freeze your feature extractor network
Use a small learning rate (something like 5e-07).

These are simply recommendations against overfitting.

Also, I can see the resolutions on train and test sets are dramatically different. Resize the test set manually before feeding the images to your model.

Augment Artificial Images to look like real Images for TF Object Detection

Answers (1)

Related Questions