Reputation: 29
I am a beginner in image recognition and need some help with preprocessing images.
I use resnet18 to do the recognition work, and I get:
In [3]: pretrainedmodels.pretrained_settings['resnet18']
Out[3]:
{'imagenet': {'url': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
  'input_space': 'RGB',
  'input_size': [3, 224, 224],
  'input_range': [0, 1],
  'mean': [0.485, 0.456, 0.406],
  'std': [0.229, 0.224, 0.225],
  'num_classes': 1000}}
I find that the mean and std are quite different from those of my image dataset.
1. How should I normalize my train set? Should I use the mean and std above, or the mean and std I calculate myself?
2. My dataset is split into train_set, valid_set and test_set. I have two methods:
A. Calculate the mean and std of each split and normalize each split individually.
B. Calculate the mean and std of the whole dataset and then normalize everything with those values.
Which one is right?
3. When should I do normalization? Before data augmentation or after data augmentation?
Upvotes: 2
Views: 357
Reputation: 2177
If you are training a new model on your own dataset, starting from the pre-trained weights, you will need to compute a new mean and std for your new dataset.
Basically, you will need to repeat what was done for ImageNet: write a script that calculates the overall per-channel [mean, std] of your entire dataset.
But remember to keep an eye on your dataset's distribution, as it will definitely affect the model's performance.
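The mean/std script mentioned above could look roughly like the sketch below. It is only an illustration under some assumptions: images arrive in batches of shape (N, C, H, W), already scaled to [0, 1], and NumPy arrays stand in for whatever your DataLoader yields. The function name channel_mean_std and the random stand-in data are made up for this example.

```python
import numpy as np

def channel_mean_std(batches):
    """Per-channel mean and std over an iterable of image batches.

    Assumes each batch is a float array of shape (N, C, H, W)
    with values already scaled to [0, 1].
    """
    count = 0        # number of pixels seen per channel
    total = None     # running per-channel sum
    total_sq = None  # running per-channel sum of squares
    for batch in batches:
        batch = np.asarray(batch, dtype=np.float64)
        if total is None:
            total = np.zeros(batch.shape[1])
            total_sq = np.zeros(batch.shape[1])
        total += batch.sum(axis=(0, 2, 3))
        total_sq += (batch ** 2).sum(axis=(0, 2, 3))
        count += batch.shape[0] * batch.shape[2] * batch.shape[3]
    mean = total / count
    # Population variance; clipped at 0 to guard against rounding error.
    var = np.maximum(total_sq / count - mean ** 2, 0.0)
    return mean, np.sqrt(var)

# Stand-in data: two random batches instead of a real DataLoader.
rng = np.random.default_rng(0)
fake_batches = [rng.random((4, 3, 8, 8)), rng.random((2, 3, 8, 8))]
new_mean, new_std = channel_mean_std(fake_batches)
```

Accumulating sums batch by batch means the whole dataset never has to fit in memory at once. The resulting new_mean and new_std are per-channel arrays that can be passed to a normalization step.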
Then define a transformer separately for your train and validation sets. Usually we do not normalize the test set, since in a real-world scenario your model will take in data of all sorts. You should perform the normalization when building the dataset, together with the other augmentation techniques.
For instance, consider this toy example:
from torchvision import transforms

# new_mean / new_std: the per-channel statistics computed for your dataset
transformer = {
    "train": transforms.Compose([
        transforms.Resize(size=299),
        transforms.RandomHorizontalFlip(p=0.2),
        transforms.ToTensor(),
        transforms.Normalize(new_mean, new_std),
    ]),
    "valid": transforms.Compose([
        transforms.Resize(size=299),
        transforms.ToTensor(),
    ]),
}

# CustomDataset stands for your own Dataset class
train_ds = CustomDataset(type="train", transformer=transformer["train"])
valid_ds = CustomDataset(type="valid", transformer=transformer["valid"])
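To make it concrete what the normalization step computes, here is a small NumPy sketch of the per-channel arithmetic, output[c] = (input[c] - mean[c]) / std[c], on a (C, H, W) image in [0, 1]. The helper name normalize and the constant test image are made up for illustration; in a real pipeline transforms.Normalize does this for you on tensors.

```python
import numpy as np

def normalize(image, mean, std):
    """Per-channel (x - mean) / std for a (C, H, W) array in [0, 1]."""
    # Reshape to (C, 1, 1) so each channel gets its own mean and std.
    mean = np.asarray(mean, dtype=np.float64).reshape(-1, 1, 1)
    std = np.asarray(std, dtype=np.float64).reshape(-1, 1, 1)
    return (np.asarray(image, dtype=np.float64) - mean) / std

# A constant mid-grey image, normalized with the ImageNet statistics
# quoted in the question.
image = np.full((3, 2, 2), 0.5)
out = normalize(image, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
```

Because the operation works on float tensors in [0, 1], it has to come after ToTensor(), which is why it sits last in the train pipeline.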
Let me know if anything is still unclear.
Upvotes: 1