enter_thevoid

Reputation: 163

clever image augmentation - random zoom out

I'm building a CNN to identify facial keypoints. I want to make the net more robust, so I thought about applying some zoom-out transforms: most pictures have the keypoints in about the same locations, so the net doesn't learn much from them.

My approach:

I want the augmented images to keep the original image size, so I would apply MaxPool2d and then random (unequal) padding on each side until the original size is reached.
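A minimal sketch of the approach described above, assuming a single (C, H, W) float tensor and zero padding. The function name `random_zoom_out`, the pooling factor of 2, and the 96x96 example size are illustrative assumptions, not part of the original setup.

```python
import random

import torch
import torch.nn.functional as F


def random_zoom_out(img, factor=2, fill=0.0):
    """Shrink a (C, H, W) image by max pooling, then pad back to (H, W).

    The shrunken image is placed at a random position inside the canvas.
    Returns the augmented image and the (left, top) offset that was used,
    so the targets can be shifted by the same amount.
    """
    _, h, w = img.shape
    # max_pool2d is given a batch dimension here; it is removed again afterwards.
    small = F.max_pool2d(img.unsqueeze(0), kernel_size=factor).squeeze(0)
    sh, sw = small.shape[-2:]
    pad_h, pad_w = h - sh, w - sw
    # Split the total padding unequally between the two sides.
    left = random.randint(0, pad_w)
    top = random.randint(0, pad_h)
    # F.pad order for the last two dims: (left, right, top, bottom).
    out = F.pad(small, (left, pad_w - left, top, pad_h - top),
                mode="constant", value=fill)
    return out, (left, top)


img = torch.rand(1, 96, 96)  # hypothetical grayscale face image
augmented, offset = random_zoom_out(img, factor=2)
```

Returning the offset matters because the keypoints have to move with the image.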

First question

Is it going to work with simple average padding or zero padding? I'm sure it would be even better if the padding looked more like a background, but is there a simple way to do that?
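For comparison, the padding options mentioned here could be sketched as follows. The helper `pad_to_size` and its `mode` names are hypothetical; `"replicate"` is one simple built-in mode of `torch.nn.functional.pad` that extends the border pixels outward, which may or may not look enough like a background for a given dataset.

```python
import torch
import torch.nn.functional as F


def pad_to_size(small, out_h, out_w, left, top, mode="mean"):
    """Pad a (C, h, w) tensor to (C, out_h, out_w), placing it at (left, top)."""
    h, w = small.shape[-2:]
    # F.pad order for the last two dims: (left, right, top, bottom).
    pad = (left, out_w - w - left, top, out_h - h - top)
    if mode == "zero":
        return F.pad(small, pad, mode="constant", value=0.0)
    if mode == "mean":
        # "Average" padding: fill with the mean intensity of the image.
        return F.pad(small, pad, mode="constant", value=small.mean().item())
    # "replicate": repeat the edge pixels outward (applied on a batched tensor).
    return F.pad(small.unsqueeze(0), pad, mode="replicate").squeeze(0)


small = torch.rand(1, 48, 48)  # hypothetical shrunken image
padded = pad_to_size(small, 96, 96, left=10, top=20, mode="replicate")
```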

Second question

The keypoints are the target vector; they come as a row vector of 30 values. I'm getting confused by the logic needed to transform them to the smaller space. Generally, if an original point was at (x=5, y=7), it transforms to (x=2, y=3). I'm not sure about this, but the cases I checked manually were correct. But what should I do if two keypoints end up in the same new pixel? I can't feed the network fewer target values.

That's it. I'd be happy to hear your thoughts.
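One way to sketch the coordinate mapping, assuming the 30 values are laid out as (x1, y1, x2, y2, ...) and that the image was shrunk by an integer factor and then shifted by a (left, top) padding offset. The function name and the layout are assumptions. If the targets are kept as real-valued coordinates rather than rounded to pixel indices, two keypoints that land in the same pixel still keep distinct values, and the vector always has 30 entries.

```python
def zoom_out_keypoints(keypoints, factor, offset=(0, 0)):
    """Map flat (x1, y1, x2, y2, ...) keypoints into the zoomed-out image.

    Each coordinate is divided by the shrink factor and then shifted by the
    padding added on the left/top. Values stay floats on purpose.
    """
    left, top = offset
    out = []
    for i in range(0, len(keypoints), 2):
        x, y = keypoints[i], keypoints[i + 1]
        out.append(x / factor + left)
        out.append(y / factor + top)
    return out


# The example from the question: (5, 7) with factor 2 and no padding.
print(zoom_out_keypoints([5, 7], factor=2))  # [2.5, 3.5]
```

Flooring 2.5 and 3.5 gives the pixel (2, 3) from the manual check above.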

Upvotes: 0

Views: 3919

Answers (1)

Poe Dator

Reputation: 4903

I suggest using torchvision.transforms.RandomResizedCrop as part of your Compose statement. It will give you random zooms AND resize the resulting images to some standard size, which avoids the issues in both of your questions.

Upvotes: 2
