no746

Reputation: 548

The reason behind RGB image normalization parameters in PyTorch

I've seen transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)) in lots of tutorials and in the PyTorch docs. I know the first parameter is the mean and the second is the std, but I can't understand why the values differ between channels.

Upvotes: 0

Views: 953

Answers (1)

jhso

Reputation: 3283

That is just the distribution of the colours in the dataset the values were computed on (these particular numbers are the ImageNet per-channel statistics). It would only be through chance that all colours are equally represented. However, if you used a single mean and std for all three channels, I doubt it would make much of a difference here, because the per-channel means and stds are so similar to one another.

For example, in my custom dataset taken from a GoPro camera, the means and standard deviations are as follows:

mean: [0.2841186 , 0.32399923, 0.27048702],
std: [0.21937862, 0.26193094, 0.23754872]

Here the means are hardly equal. This does not mean, however, that the channels are treated differently in the ML model. All this transform does is make sure that each channel is standardised (through z-score standardisation): the channel's mean is subtracted from every value and the result is divided by the channel's std.
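To make the standardisation concrete, here is a minimal sketch in plain Python of the per-channel arithmetic, (value - mean) / std, using the values from the question. The function name `normalize_pixel` and the sample pixel are made up for illustration; this is not torchvision's code, only the same formula written out.

```python
# Per-channel z-score standardisation, written out in plain Python.
# The mean/std values are the ones quoted in the question.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Return (value - mean) / std for each of the R, G, B channels.

    Hypothetical helper for illustration only.
    """
    return tuple((v - m) / s for v, m, s in zip(rgb, mean, std))


# A mid-grey pixel has the same raw value in every channel, but ends up
# with different standardised values because each channel has its own mean/std.
grey = (0.5, 0.5, 0.5)
normalized_grey = normalize_pixel(grey)
print(normalized_grey)

# A pixel sitting exactly at the dataset mean maps to zero in every channel.
at_mean = normalize_pixel(IMAGENET_MEAN)
print(at_mean)
```

The model still sees three channels on the same footing; the different numbers only reflect how far each raw value is from that channel's typical value.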

Think about it this way: if one colour is represented with a generally higher intensity in your dataset (e.g. if you have lots of vibrant "blue" pictures of the beach, the sky, lagoons, etc.), then you will have to subtract a larger number from that channel to ensure that your data is standardised.
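As a rough sketch of that idea, the snippet below computes per-channel statistics for a tiny made-up "blue-heavy" set of pixels. The data, the helper name `channel_stats`, and the use of the population standard deviation are all assumptions for illustration; on a real dataset you would accumulate the same statistics over every pixel of every image.

```python
from statistics import fmean, pstdev


def channel_stats(pixels):
    """Return ([mean_r, mean_g, mean_b], [std_r, std_g, std_b]).

    `pixels` is a list of (r, g, b) tuples scaled to [0, 1].
    Hypothetical helper; uses the population standard deviation.
    """
    channels = list(zip(*pixels))  # regroup values by channel
    means = [fmean(c) for c in channels]
    stds = [pstdev(c) for c in channels]
    return means, stds


# Made-up pixels from "beach and sky" photos: blue dominates.
beach_pixels = [
    (0.1, 0.4, 0.9),
    (0.2, 0.5, 0.8),
    (0.3, 0.6, 0.7),
    (0.2, 0.5, 0.8),
]

means, stds = channel_stats(beach_pixels)
print("mean:", means)
print("std: ", stds)

# Standardise every pixel with the statistics just computed.
standardised = [
    tuple((v - m) / s for v, m, s in zip(px, means, stds))
    for px in beach_pixels
]
```

The blue channel ends up with the largest mean, so it has the largest value subtracted, and after standardisation every channel is centred on zero.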

Upvotes: 2
