Reputation: 131
I want to train some models to work with grayscale images, which is useful e.g. for microscope applications. Therefore I want to train my model on a grayscale ImageNet, using the PyTorch grayscale conversion (torchvision.transforms.Grayscale) to convert the RGB ImageNet images to grayscale. Conceptually this is a rotation of the color space from RGB to YPbPr.
Y' is then the grayscale channel, so Pb and Pr can be discarded after the transformation. In fact, PyTorch only calculates the Y' channel:
grayscale = (0.2989 * r + 0.587 * g + 0.114 * b)
To normalize the image data, I need to know grayscale-imagenet's mean pixel value, as well as the standard deviation. Is it possible to calculate those?
I had success in calculating the mean pixel intensity using
meanGrayscale = 0.2989 * r.mean() + 0.587 * g.mean() + 0.114 * b.mean()
Transforming an image and then calculating the grayscale mean gives the same result as first calculating the RGB means and then transforming those to a grayscale mean.
However, I am clueless when it comes to calculating the variance or standard deviation. Does anybody have an idea or know of good literature on the topic? Is this even possible?
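The two routes agree because the grayscale conversion is a linear function of r, g and b, so it commutes with taking the mean. A minimal sketch with NumPy, where a random array stands in for real image data (the array shape and all variable names are illustrative assumptions):

```python
import numpy as np

# Weights from the grayscale formula above.
WEIGHTS = np.array([0.2989, 0.587, 0.114])

rng = np.random.default_rng(0)
# Stand-in for an RGB image scaled to [0, 1], shape (height, width, 3).
image = rng.random((64, 48, 3))

# Route 1: convert every pixel to grayscale, then average.
gray = image @ WEIGHTS
mean_of_gray = gray.mean()

# Route 2: average each channel, then apply the weights to the three means.
channel_means = image.reshape(-1, 3).mean(axis=0)
gray_of_means = channel_means @ WEIGHTS

print(mean_of_gray, gray_of_means)
```

For a whole dataset the same holds as long as the channel means are taken over all pixels; averaging per-image means gives a different number when the images differ in size.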
I found a publication "Jianxin Gong - Clarifying the Standard Deviational Ellipse" ... There he does it in 2 dimensions (as far as I understand). I just could not figure out yet how to do it in 3D.
Upvotes: 0
Views: 3075
Reputation: 1
To compute the standard deviation for grayscale ImageNet pixel values, you can indeed use the conversion formula you mentioned. The conversion from RGB to grayscale is performed using the formula:
grayscale = 0.2989 * R + 0.587 * G + 0.114 * B
You successfully calculated the mean grayscale pixel value using:
meanGrayscale = 0.2989 * r.mean() + 0.587 * g.mean() + 0.114 * b.mean()
Now, for calculating the standard deviation, we can use the properties of variance. The standard deviation is the square root of the variance, which can be calculated as follows:
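Calculate the variance: The variance of the grayscale values can be derived from the variances of the RGB channels by applying the squared conversion coefficients. If the covariances between the channels are neglected, the variance is given by:
Var(grayscale) = 0.2989^2 * Var(R) + 0.587^2 * Var(G) + 0.114^2 * Var(B)
where Var(R), Var(G), and Var(B) are the variances of the respective RGB channels. This is exact only when the channels are uncorrelated; in general the terms 2 * 0.2989 * 0.587 * Cov(R, G), 2 * 0.2989 * 0.114 * Cov(R, B), and 2 * 0.587 * 0.114 * Cov(G, B) have to be added.
Calculate the standard deviation: Finally, the standard deviation of the grayscale image is the square root of the variance:
Std(grayscale) = sqrt(Var(grayscale))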
Here’s how you can implement this in Python using PyTorch:
import torch
# Assuming r, g, b are your RGB channel tensors
r = torch.randn(1000) # Example tensor for red channel
g = torch.randn(1000) # Example tensor for green channel
b = torch.randn(1000) # Example tensor for blue channel
# Calculate means
mean_r = r.mean()
mean_g = g.mean()
mean_b = b.mean()
# Calculate variances
var_r = r.var(unbiased=False) # Population variance
var_g = g.var(unbiased=False)
var_b = b.var(unbiased=False)
# Calculate grayscale mean
meanGrayscale = 0.2989 * mean_r + 0.587 * mean_g + 0.114 * mean_b
# Calculate grayscale variance (covariances between channels are ignored)
var_gray = (0.2989**2 * var_r) + (0.587**2 * var_g) + (0.114**2 * var_b)
# Calculate grayscale standard deviation
std_gray = var_gray.sqrt()
print(f"Mean Grayscale Value: {meanGrayscale.item()}")
print(f"Standard Deviation of Grayscale Values: {std_gray.item()}")
With the random example tensors replaced by real channel data, this yields the grayscale mean and an estimate of the standard deviation that you can use for normalization in your model.
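For real images the three channels are usually correlated, so the covariance terms of the variance of a weighted sum matter. The following sketch uses the full formula Var(w . x) = w^T C w, where C is the 3x3 covariance matrix of the channels. The synthetic correlated data and all names below are assumptions for illustration, not ImageNet statistics:

```python
import numpy as np

w = np.array([0.2989, 0.587, 0.114])  # grayscale weights from the formula above

rng = np.random.default_rng(0)
# Synthetic, deliberately correlated "pixels": a shared brightness term plus
# a small independent term per channel. Shape (num_pixels, 3).
brightness = rng.random((10_000, 1))
pixels = 0.8 * brightness + 0.2 * rng.random((10_000, 3))

# 3x3 population covariance matrix of the channels (columns are variables).
cov = np.cov(pixels, rowvar=False, bias=True)

# Variance of the weighted sum, including the covariance terms.
var_full = w @ cov @ w
# Variance from the per-channel variances only (covariances dropped).
var_diag_only = np.sum(w**2 * np.diag(cov))
# Reference: convert each pixel to grayscale and measure the variance directly.
var_direct = (pixels @ w).var()

print(var_full**0.5, var_diag_only**0.5, var_direct**0.5)
```

So the grayscale standard deviation can be obtained from RGB statistics, but it needs the full channel covariance matrix of the dataset, not only the three per-channel standard deviations.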
Upvotes: -1
Reputation: 131
Okay, I wasn't able to calculate the standard deviation as planned, but did it using the code below. The mean and standard deviation of the grayscale ImageNet training set are (round them as much as you like):
Mean: 0.44531356896770125
Standard Deviation: 0.2692461874154524
import multiprocessing
import os

import numpy as np
# Assumption: the original post does not show where imread comes from; any
# reader that returns an array-like image works (imageio's imread is used here).
from imageio import imread

TRAIN_DIR = "/home/imagenet/ILSVRC/Data/CLS-LOC/train"

def calcSTD(d):
    """Return (sum of squared errors, pixel count) for one class folder."""
    meanValue = 0.44531356896770125
    squaredError = 0
    numberOfPixels = 0
    for f in os.listdir(os.path.join(TRAIN_DIR, str(d))):
        if f.endswith(".JPEG"):
            image = imread(os.path.join(TRAIN_DIR, str(d), str(f)))
            # Transform to gray if not already gray
            matrix = np.array(image)
            if matrix.ndim == 3:
                # Channel order as in the original post (index 0 taken as blue).
                # If the reader returns RGB, swap indices 0 and 2.
                blue = matrix[:, :, 0] / 255
                green = matrix[:, :, 1] / 255
                red = matrix[:, :, 2] / 255
                gray = 0.2989 * red + 0.587 * green + 0.114 * blue
            else:
                gray = matrix / 255
            # Accumulate the squared deviation from the mean over all pixels
            squaredError += ((gray - meanValue) ** 2).sum()
            numberOfPixels += gray.size
    return (squaredError, numberOfPixels)

if __name__ == "__main__":
    a_pool = multiprocessing.Pool()
    folders = [f.name for f in os.scandir(TRAIN_DIR) if f.is_dir()]
    resultStD = a_pool.map(calcSTD, folders)
    StD = (sum(r[0] for r in resultStD) / sum(r[1] for r in resultStD)) ** 0.5
    print(StD)
During the process, some warnings like this one occurred:
/opt/conda/lib/python3.7/site-packages/PIL/TiffImagePlugin.py:771: UserWarning: Possibly corrupt EXIF data. Expecting to read 8 bytes but only got 4. Skipping tag 41486 "Possibly corrupt EXIF data. "
The respective images from the 2019 version of ImageNet were skipped.
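The mean does not have to be known in advance: both statistics can be accumulated in a single pass by keeping running sums of the pixel values and of their squares, then using Var = E[x^2] - (E[x])^2. A minimal sketch of that idea follows; the helper names are hypothetical, and the small random arrays stand in for grayscale images that would normally be loaded from disk:

```python
import numpy as np

def accumulate(gray_images):
    """Return (sum, sum of squares, pixel count) over an iterable of arrays."""
    total, total_sq, count = 0.0, 0.0, 0
    for gray in gray_images:
        gray = np.asarray(gray, dtype=np.float64)
        total += gray.sum()
        total_sq += np.square(gray).sum()
        count += gray.size
    return total, total_sq, count

def mean_and_std(partials):
    """Combine per-worker (sum, sum of squares, count) tuples."""
    total = sum(p[0] for p in partials)
    total_sq = sum(p[1] for p in partials)
    count = sum(p[2] for p in partials)
    mean = total / count
    variance = total_sq / count - mean**2
    # Guard against a tiny negative value caused by floating-point rounding.
    return mean, max(variance, 0.0) ** 0.5

rng = np.random.default_rng(0)
# Two "folders" of differently sized stand-in images with values in [0, 1].
folder_a = [rng.random((32, 40)), rng.random((50, 20))]
folder_b = [rng.random((10, 10))]

mean, std = mean_and_std([accumulate(folder_a), accumulate(folder_b)])
print(mean, std)
```

Each (sum, sum of squares, count) tuple could come from one worker of a multiprocessing.Pool, as in the code above. In float64 this formula is accurate enough for values in [0, 1]; for data whose mean is large compared to its spread, a two-pass or Welford-style update is numerically safer.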
Upvotes: 4