Reputation: 38155
How do I find the values to pass to the transforms.Normalize function in PyTorch? Also, where exactly in my code should I apply transforms.Normalize?
Since normalizing the dataset is a pretty well-known task, I was hoping there would be some sort of script for doing it automatically, but I couldn't find one on the PyTorch forum.
transformed_dataset = MothLandmarksDataset(csv_file='moth_gt.csv',
                                           root_dir='.',
                                           transform=transforms.Compose([
                                               Rescale(256),
                                               RandomCrop(224),
                                               transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                                                    std=[0.229, 0.224, 0.225]),
                                               ToTensor()
                                           ]))
for i in range(len(transformed_dataset)):
    sample = transformed_dataset[i]
    print(i, sample['image'].size(), sample['landmarks'].size())

    if i == 3:
        break
I know these particular values pertain to ImageNet rather than to my dataset, but even using them I get an error:
TypeError Traceback (most recent call last)
<ipython-input-81-eb8dc46e0284> in <module>
10
11 for i in range(len(transformed_dataset)):
---> 12 sample = transformed_dataset[i]
13
14 print(i, sample['image'].size(), sample['landmarks'].size())
<ipython-input-48-9d04158922fb> in __getitem__(self, idx)
30
31 if self.transform:
---> 32 sample = self.transform(sample)
33
34 return sample
~/anaconda3/lib/python3.7/site-packages/torchvision/transforms/transforms.py in __call__(self, img)
59 def __call__(self, img):
60 for t in self.transforms:
---> 61 img = t(img)
62 return img
63
~/anaconda3/lib/python3.7/site-packages/torchvision/transforms/transforms.py in __call__(self, tensor)
210 Tensor: Normalized Tensor image.
211 """
--> 212 return F.normalize(tensor, self.mean, self.std, self.inplace)
213
214 def __repr__(self):
~/anaconda3/lib/python3.7/site-packages/torchvision/transforms/functional.py in normalize(tensor, mean, std, inplace)
278 """
279 if not torch.is_tensor(tensor):
--> 280 raise TypeError('tensor should be a torch tensor. Got {}.'.format(type(tensor)))
281
282 if tensor.ndimension() != 3:
TypeError: tensor should be a torch tensor. Got <class 'dict'>.
So, basically, three questions: how to find these values for my own dataset, how and where to pass them, and whether Normalize should be applied to the entire dataset or only the training set.
Trying the solution provided here didn't work for me: https://discuss.pytorch.org/t/about-normalization-using-pre-trained-vgg16-networks/23560/6?u=mona_jalal
mean = 0.
std = 0.
nb_samples = 0.

for data in dataloader:
    print(type(data))
    batch_samples = data.size(0)

    data.shape(0)
    data = data.view(batch_samples, data.size(1), -1)
    mean += data.mean(2).sum(0)
    std += data.std(2).sum(0)
    nb_samples += batch_samples

mean /= nb_samples
std /= nb_samples
error is:
<class 'dict'>
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-51-e8ba3c8718bb> in <module>
5 for data in dataloader:
6 print(type(data))
----> 7 batch_samples = data.size(0)
8
9 data.shape(0)
AttributeError: 'dict' object has no attribute 'size'
This is the result of print(data):
{'image': tensor([[[[0.2961, 0.2941, 0.2941, ..., 0.2460, 0.2456, 0.2431],
[0.2953, 0.2977, 0.2980, ..., 0.2442, 0.2431, 0.2431],
[0.2941, 0.2941, 0.2980, ..., 0.2471, 0.2471, 0.2448],
...,
[0.3216, 0.3216, 0.3216, ..., 0.2482, 0.2471, 0.2471],
[0.3216, 0.3241, 0.3253, ..., 0.2471, 0.2471, 0.2450],
[0.3216, 0.3216, 0.3216, ..., 0.2471, 0.2452, 0.2431]],
[[0.2961, 0.2941, 0.2941, ..., 0.2460, 0.2456, 0.2431],
[0.2953, 0.2977, 0.2980, ..., 0.2442, 0.2431, 0.2431],
[0.2941, 0.2941, 0.2980, ..., 0.2471, 0.2471, 0.2448],
...,
[0.3216, 0.3216, 0.3216, ..., 0.2482, 0.2471, 0.2471],
[0.3216, 0.3241, 0.3253, ..., 0.2471, 0.2471, 0.2450],
[0.3216, 0.3216, 0.3216, ..., 0.2471, 0.2452, 0.2431]],
[[0.2961, 0.2941, 0.2941, ..., 0.2460, 0.2456, 0.2431],
[0.2953, 0.2977, 0.2980, ..., 0.2442, 0.2431, 0.2431],
[0.2941, 0.2941, 0.2980, ..., 0.2471, 0.2471, 0.2448],
...,
[0.3216, 0.3216, 0.3216, ..., 0.2482, 0.2471, 0.2471],
[0.3216, 0.3241, 0.3253, ..., 0.2471, 0.2471, 0.2450],
[0.3216, 0.3216, 0.3216, ..., 0.2471, 0.2452, 0.2431]]],
[[[0.3059, 0.3093, 0.3140, ..., 0.3373, 0.3363, 0.3345],
[0.3059, 0.3093, 0.3165, ..., 0.3412, 0.3389, 0.3373],
[0.3098, 0.3131, 0.3176, ..., 0.3450, 0.3412, 0.3412],
...,
[0.2931, 0.2966, 0.2931, ..., 0.2549, 0.2539, 0.2510],
[0.2902, 0.2902, 0.2902, ..., 0.2510, 0.2510, 0.2502],
[0.2864, 0.2900, 0.2863, ..., 0.2510, 0.2510, 0.2510]],
[[0.3059, 0.3093, 0.3140, ..., 0.3373, 0.3363, 0.3345],
[0.3059, 0.3093, 0.3165, ..., 0.3412, 0.3389, 0.3373],
[0.3098, 0.3131, 0.3176, ..., 0.3450, 0.3412, 0.3412],
...,
[0.2931, 0.2966, 0.2931, ..., 0.2549, 0.2539, 0.2510],
[0.2902, 0.2902, 0.2902, ..., 0.2510, 0.2510, 0.2502],
[0.2864, 0.2900, 0.2863, ..., 0.2510, 0.2510, 0.2510]],
[[0.3059, 0.3093, 0.3140, ..., 0.3373, 0.3363, 0.3345],
[0.3059, 0.3093, 0.3165, ..., 0.3412, 0.3389, 0.3373],
[0.3098, 0.3131, 0.3176, ..., 0.3450, 0.3412, 0.3412],
...,
[0.2931, 0.2966, 0.2931, ..., 0.2549, 0.2539, 0.2510],
[0.2902, 0.2902, 0.2902, ..., 0.2510, 0.2510, 0.2502],
[0.2864, 0.2900, 0.2863, ..., 0.2510, 0.2510, 0.2510]]],
[[[0.2979, 0.2980, 0.3015, ..., 0.2825, 0.2784, 0.2784],
[0.2980, 0.2980, 0.2980, ..., 0.2830, 0.2764, 0.2795],
[0.2980, 0.2980, 0.3012, ..., 0.2827, 0.2814, 0.2797],
...,
[0.3282, 0.3293, 0.3294, ..., 0.2238, 0.2235, 0.2235],
[0.3255, 0.3255, 0.3255, ..., 0.2240, 0.2235, 0.2229],
[0.3225, 0.3255, 0.3255, ..., 0.2216, 0.2235, 0.2223]],
[[0.2979, 0.2980, 0.3015, ..., 0.2825, 0.2784, 0.2784],
[0.2980, 0.2980, 0.2980, ..., 0.2830, 0.2764, 0.2795],
[0.2980, 0.2980, 0.3012, ..., 0.2827, 0.2814, 0.2797],
...,
[0.3282, 0.3293, 0.3294, ..., 0.2238, 0.2235, 0.2235],
[0.3255, 0.3255, 0.3255, ..., 0.2240, 0.2235, 0.2229],
[0.3225, 0.3255, 0.3255, ..., 0.2216, 0.2235, 0.2223]],
[[0.2979, 0.2980, 0.3015, ..., 0.2825, 0.2784, 0.2784],
[0.2980, 0.2980, 0.2980, ..., 0.2830, 0.2764, 0.2795],
[0.2980, 0.2980, 0.3012, ..., 0.2827, 0.2814, 0.2797],
...,
[0.3282, 0.3293, 0.3294, ..., 0.2238, 0.2235, 0.2235],
[0.3255, 0.3255, 0.3255, ..., 0.2240, 0.2235, 0.2229],
[0.3225, 0.3255, 0.3255, ..., 0.2216, 0.2235, 0.2223]]]],
dtype=torch.float64), 'landmarks': tensor([[[160.2964, 98.7339],
[223.0788, 72.5067],
[ 82.4163, 70.3733],
[152.3213, 137.7867]],
[[198.3194, 74.4341],
[273.7188, 118.7733],
[117.7113, 80.8000],
[182.0750, 107.2533]],
[[137.4789, 92.8523],
[174.9463, 40.3467],
[ 57.3013, 59.1200],
[129.3375, 131.6533]]], dtype=torch.float64)}
dataloader = DataLoader(transformed_dataset, batch_size=3,
                        shuffle=True, num_workers=4)
and
transformed_dataset = MothLandmarksDataset(csv_file='moth_gt.csv',
                                           root_dir='.',
                                           transform=transforms.Compose([
                                               Rescale(256),
                                               RandomCrop(224),
                                               ToTensor()
                                               # transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                               #                      std=[0.229, 0.224, 0.225])
                                           ]))
and
import os

import numpy as np
import pandas as pd
import torch
from skimage import io
from torch.utils.data import Dataset


class MothLandmarksDataset(Dataset):
    """Face Landmarks dataset."""

    def __init__(self, csv_file, root_dir, transform=None):
        """
        Args:
            csv_file (string): Path to the csv file with annotations.
            root_dir (string): Directory with all the images.
            transform (callable, optional): Optional transform to be applied
                on a sample.
        """
        self.landmarks_frame = pd.read_csv(csv_file)
        self.root_dir = root_dir
        self.transform = transform

    def __len__(self):
        return len(self.landmarks_frame)

    def __getitem__(self, idx):
        if torch.is_tensor(idx):
            idx = idx.tolist()

        img_name = os.path.join(self.root_dir, self.landmarks_frame.iloc[idx, 0])
        image = io.imread(img_name)
        landmarks = self.landmarks_frame.iloc[idx, 1:]
        landmarks = np.array([landmarks])
        landmarks = landmarks.astype('float').reshape(-1, 2)
        sample = {'image': image, 'landmarks': landmarks}

        if self.transform:
            sample = self.transform(sample)

        return sample
Upvotes: 0
Views: 2650
Reputation: 24691
How to pass these values and where? I assume I should do it in transforms.Compose method but I might be wrong.
In MothLandmarksDataset it is no wonder it is not working, as you are trying to pass a dict (sample) to torchvision.transforms, which require either a torch.Tensor or a PIL.Image as input. Here, to be exact:

sample = {'image': image, 'landmarks': landmarks}

if self.transform:
    sample = self.transform(sample)
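For contrast, the tutorial-style transforms in your Compose pipeline (Rescale, RandomCrop, ToTensor) each receive and return the whole sample dict, which is why they work where a bare torchvision.transforms.Normalize does not. A minimal sketch of that pattern, based on the PyTorch data-loading tutorial rather than your exact code:

import torch

class ToTensor:
    """Convert the ndarrays in a sample dict to torch tensors (tutorial-style)."""

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']
        # swap color axis: numpy image is H x W x C, torch image is C x H x W
        image = image.transpose((2, 0, 1))
        return {'image': torch.from_numpy(image),
                'landmarks': torch.from_numpy(landmarks)}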
You could pass sample["image"] into the torchvision transforms directly, although you shouldn't: applying those operations only to sample["image"] would break its relation to landmarks. What you should be after is something like the albumentations library (see here), which can transform image and landmarks in the same way and so preserve their relation.
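For reference, a minimal albumentations sketch that resizes and crops the image and the landmarks together; the transform names and the 'xy' keypoint format are taken from albumentations' documented API, and the sample values below are dummy stand-ins, not your data:

import numpy as np
import albumentations as A

# resize then crop, updating the (x, y) pixel keypoints accordingly
transform = A.Compose(
    [A.Resize(256, 256), A.RandomCrop(224, 224)],
    keypoint_params=A.KeypointParams(format='xy'),
)

# dummy stand-ins for sample['image'] (H x W x C array) and sample['landmarks']
image = np.zeros((300, 400, 3), dtype=np.uint8)
landmarks = [(160.3, 98.7), (223.1, 72.5)]

out = transform(image=image, keypoints=landmarks)
image, landmarks = out['image'], out['keypoints']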
Also, there is no Rescale transform in torchvision; maybe you meant Resize?
The provided code is fine, but you have to unpack your data into torch.Tensor first, like this:
mean = 0.0
std = 0.0
nb_samples = 0.0

for data in dataloader:
    # unpack the sample dict returned by MothLandmarksDataset
    images, landmarks = data["image"], data["landmarks"]
    batch_samples = images.size(0)
    # flatten H and W so the statistics are taken per channel
    images_data = images.view(batch_samples, images.size(1), -1)
    mean += images_data.mean(2).sum(0)
    std += images_data.std(2).sum(0)
    nb_samples += batch_samples

mean /= nb_samples
std /= nb_samples
How to pass these values and where? I assume I should do it in transforms.Compose method but I might be wrong.
Those values should be passed to torchvision.transforms.Normalize, applied only to sample["image"], not to sample["landmarks"].
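A minimal sketch of how that could be wired up: the NormalizeImageOnly wrapper below is a hypothetical helper (not part of torchvision), and the mean/std arguments are placeholders for the values computed in the loop above:

import torch
from torchvision import transforms

class NormalizeImageOnly:
    """Apply torchvision's Normalize to sample['image'] only, leaving landmarks untouched."""

    def __init__(self, mean, std):
        self.normalize = transforms.Normalize(mean=mean, std=std)

    def __call__(self, sample):
        # Normalize expects a float tensor of shape (C, H, W),
        # so this must run after ToTensor() in the pipeline
        image = self.normalize(sample['image'].float())
        return {'image': image, 'landmarks': sample['landmarks']}

# hypothetical usage, with mean/std coming from the computation above:
# transform = transforms.Compose([Rescale(256), RandomCrop(224), ToTensor(),
#                                 NormalizeImageOnly(mean.tolist(), std.tolist())])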
I assume I should apply Normalize to my entire dataset not just the training set, am I right?
You should calculate the normalization values across the training dataset and apply those calculated values to the validation and test sets as well.
Upvotes: 2