Jean-Paul Azzopardi

Reputation: 379

Calculating probability distribution of an image?

I want to find the probability distributions of two images so I can calculate the KL divergence between them.

I'm trying to figure out what a probability distribution means in this sense. I've converted my images to grayscale, flattened them to 1-D arrays, and plotted them as histograms with bins=256:

import matplotlib.pyplot as plt

# Flatten each 2-D grayscale image into a 1-D array of pixel intensities
imageone = imgGray.flatten()   # array([0.64991451, 0.65775765, 0.66560078, ...,
imagetwo = imgGray2.flatten()

plt.hist(imageone, bins=256, label='image one')
plt.hist(imagetwo, bins=256, alpha=0.5, label='image two')
plt.legend(loc='upper left')

My next step is to call the ks_2samp function from scipy.stats to compare the distributions, but I'm unclear what arguments to use.

A previous answer explained that we should "take the histogram of the image (in grayscale) and then divide the histogram values by the total number of pixels in the image. This will result in the probability to find a gray value in the image."

Ref: Can Kullback-Leibler be applied to compare two images?
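If I understand correctly, that would be something like the following (a sketch using numpy's histogram; the 0-1 intensity range is my assumption for float grayscale data):

import numpy as np

# Count how many pixels fall into each of 256 intensity bins
counts, bin_edges = np.histogram(imgGray.flatten(), bins=256, range=(0.0, 1.0))

# Divide by the total number of pixels so the values sum to 1,
# giving the probability of finding each gray level in the image
prob = counts / counts.sum()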

But is that what is meant by 'taking' the histogram values, or is there more to it?

I might be overcomplicating things, but I'm confused by this.

Upvotes: 0

Views: 1701

Answers (1)

Matt Pitkin

Reputation: 6532

Matplotlib's hist function returns three values, the first of which is the count in each histogram bin. If you pass the density=True argument to hist, those values become the probability density in each bin instead. Note that both images should be binned with the same bin edges so the two probability arrays line up bin-for-bin; you can reuse the edges returned by the first call:

prob1, bin_edges, _ = plt.hist(imageone, bins=256, density=True, label='image one')
prob2, _, _ = plt.hist(imagetwo, bins=bin_edges, density=True, alpha=0.5, label='image two')

You can then calculate the KL divergence using scipy's entropy function: when given a second distribution as its qk argument, entropy computes the Kullback-Leibler divergence rather than the Shannon entropy:

from scipy.stats import entropy

# entropy(pk, qk) normalizes both arrays and computes sum(pk * log(pk / qk))
entropy(prob1, prob2)
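Putting the pieces together, a minimal end-to-end sketch might look like this (the random arrays stand in for your real grayscale images, and the 0-1 intensity range is an assumption):

import numpy as np
from scipy.stats import entropy

rng = np.random.default_rng(0)
imgGray = rng.random((64, 64))    # stand-in for the first grayscale image
imgGray2 = rng.random((64, 64))   # stand-in for the second grayscale image

# Histogram both images over the same bin edges so the bins line up
bin_edges = np.linspace(0.0, 1.0, 257)   # 257 edges -> 256 bins
prob1, _ = np.histogram(imgGray.flatten(), bins=bin_edges, density=True)
prob2, _ = np.histogram(imgGray2.flatten(), bins=bin_edges, density=True)

print(entropy(prob1, prob2))   # KL divergence D(prob1 || prob2)

One caveat: if any bin of prob2 is zero where prob1 is non-zero, the KL divergence is infinite, so with real images you may want to add a small constant to the counts (or use wider bins) before normalizing.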

Upvotes: 2
