Is it the right way to process a PIL image in pytorch?

Question

I want to train a deep model on .png images using PyTorch. I'm using a model pre-trained on ImageNet, so I need to normalize the images before feeding them to the network. But when I look at the result of the transform, I see some values greater than 1 and some less than -1. Shouldn't they all be in the [-1, 1] range? Am I doing this the right way? Here is my code:

from PIL import Image
from torchvision import transforms

normalize = transforms.Normalize(
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225])
preprocess = transforms.Compose([
    transforms.ToTensor(),
    normalize])

x = Image.open("path/to/the/file").convert('RGB')
x = preprocess(x)  # x is what I feed to the network

Szymon Maszke · Accepted Answer

What you are doing is correct, but note that the mean and std are not calculated from your data; you've taken those values from the ImageNet dataset.

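Values outside [-1, 1] are expected. Normalize subtracts the per-channel mean and divides by the per-channel standard deviation, so over the data those statistics came from, each channel ends up with a mean of 0 and a standard deviation of 1. That is not the same as squeezing everything into [-1, 1]: any pixel more than one standard deviation away from the mean lands outside that range. Since ToTensor() produces values in [0, 1], the ImageNet statistics map them to roughly -2.12 to 2.64. On top of that, your images weren't part of the mean and std calculations in the first place, so their statistics after normalization will not be exactly 0 and 1.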
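To make the expected range concrete, here is a small sketch in plain Python. It only assumes that ToTensor() has already scaled the pixels into [0, 1]; the variable names `lows` and `highs` are just for illustration.

```python
# ImageNet per-channel statistics, copied from the question.
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

# ToTensor() scales 8-bit pixel values into [0, 1], so the extremes
# after Normalize are (0 - mean) / std and (1 - mean) / std.
lows = [(0.0 - m) / s for m, s in zip(mean, std)]
highs = [(1.0 - m) / s for m, s in zip(mean, std)]

for name, lo, hi in zip("RGB", lows, highs):
    print(f"{name}: [{lo:.3f}, {hi:.3f}]")
```

This prints R: [-2.118, 2.249], G: [-2.036, 2.429] and B: [-1.804, 2.640], so a pure black or pure white pixel is already well outside [-1, 1].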

If you wish to fine-tune your neural network, you can calculate the per-channel mean and std of your own dataset and use those values instead (though it might not make a large difference, depending on the dataset and how many images you have).
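A minimal sketch of that per-channel calculation, assuming the images are already stacked into one tensor of shape (N, 3, H, W) with values in [0, 1]. The helper name `channel_stats` and the random stand-in data are hypothetical; with a real dataset you would more likely accumulate the statistics over batches from a DataLoader.

```python
import torch

def channel_stats(images):
    """Per-channel mean and std of a batch shaped (N, 3, H, W)."""
    # Reduce over batch, height and width; keep the channel dimension.
    mean = images.mean(dim=(0, 2, 3))
    std = images.std(dim=(0, 2, 3))
    return mean, std

# Synthetic stand-in for a stack of ToTensor() outputs (not real data).
torch.manual_seed(0)
images = torch.rand(8, 3, 32, 32)

mean, std = channel_stats(images)

# Same arithmetic as transforms.Normalize, applied to the whole batch.
normalized = (images - mean[None, :, None, None]) / std[None, :, None, None]

print(normalized.mean(dim=(0, 2, 3)))  # close to 0 for each channel
print(normalized.std(dim=(0, 2, 3)))   # close to 1 for each channel
print(normalized.min().item(), normalized.max().item())  # still beyond [-1, 1]
```

The resulting values can be passed on as `transforms.Normalize(mean=mean.tolist(), std=std.tolist())`. Even with statistics computed on the data itself, individual values still fall outside [-1, 1].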
