Brian

Reputation: 14836

How to normalize a histogram

I have this histogram which counts the array "d" in equally log-spaced bins.

 max_val=np.log10(max(d))
 min_val=np.log10(min(d))
 logspace = np.logspace(min_val, max_val, 50) 


 hist(d,bins=logspace,label='z='+str(redshift),histtype='step')
 show()

The problem is that I want it normalized so that the area is one. Using the option normed=True did not give me that result, which might be because I'm using logarithmic bins. I therefore tried normalizing the histogram this way:

 H=hist(d,bins=logspace,label='z='+str(redshift),histtype='step')
 H_norm=H[0]/my_norm_constant

But then I don't know how to plot H_norm against the bins.

Upvotes: 1

Views: 15467

Answers (2)

user6614752

Reputation: 11

This uses the common normalization in which the bin heights add up to 1 regardless of bin width. Note that this is not the same as making the area equal to 1.

import matplotlib.pyplot as plt
import numpy as np

x = [0.1,0.2,0.04,0.05,0.05,0.06,0.07,0.11,0.12,0.12,0.1414,\
     0.1415,0.15,0.12,0.123,0,0.14,0.145,0.15,0.156,0.12,0.15,\
     0.156,0.166,0.151,0.124, 0.12,0.124,0.12,0.045,0.124]

weights = np.ones_like(x)/float(len(x))
p=plt.hist(x,
    bins=4,
    # the weights do the normalizing, so normed/density stays at its default (False)
    weights=weights,
    #histtype='stepfilled',
    color=[0.1,0.4,0.3]
)

plt.ylim(0,1)
plt.show()

resulting histogram plot:
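
To make the difference between "heights sum to 1" and "area is 1" concrete, here is a minimal sketch that uses np.histogram instead of plotting. The sample values are made up for illustration and are not the data from the answer above.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
x = np.array([0.1, 0.2, 0.04, 0.05, 0.12, 0.15, 0.16, 0.12])

# Each point contributes 1/N, so every bin height is a fraction of the sample.
weights = np.ones_like(x) / len(x)
heights, edges = np.histogram(x, bins=4, weights=weights)

total_height = heights.sum()                 # the heights add up to 1
area = np.sum(heights * np.diff(edges))      # the area depends on the bin width
print(total_height, area)
```

With equal-width bins the area is simply the bin width, so this normalization only gives unit area when the bins happen to have width 1.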

Upvotes: 1

HYRY

Reputation: 97291

I tried normed=True, and the area is 1:

from pylab import *
d = np.random.normal(loc=20, size=10000)
max_val=np.log10(max(d))
min_val=np.log10(min(d))
logspace = np.logspace(min_val, max_val, 50) 


r = hist(d,bins=logspace,histtype='step', normed=True)
print("area:", sum(np.diff(r[1])*r[0]))

Can you run the code and check the output? If it is not 1, check your NumPy version. I got this warning message when I ran the code:

C:\Python26\lib\site-packages\matplotlib\axes.py:7680: UserWarning: This release fixes a normalization bug in the NumPy histogram function prior to version 1.5, occuring with non-uniform bin widths. The returned and plotted value is now a density: n / (N * bin width), where n is the bin count and N the total number of points.
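
The density formula in that warning can be checked by hand. The following is only a sketch: the log-normal test data and the seed are assumptions made for illustration, and N is taken as the number of points that fall inside the bins, which is what NumPy uses.

```python
import numpy as np

rng = np.random.default_rng(0)             # arbitrary seed
d = rng.lognormal(size=10000)              # hypothetical positive-valued data

# Log-spaced bin edges, built the same way as in the question.
edges = np.logspace(np.log10(d.min()), np.log10(d.max()), 50)
counts, _ = np.histogram(d, bins=edges)
widths = np.diff(edges)

# density = n / (N * bin width), with N = number of points inside the bins
density = counts / (counts.sum() * widths)

# np.histogram with density=True should return the same values.
density_np, _ = np.histogram(d, bins=edges, density=True)

area = np.sum(density * widths)
print(area)
```

Because each height is divided by its own bin width, the area comes out as 1 even though the bins are not uniform.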

To plot the graph yourself:

step(r[1][1:], r[0]/my_norm_constant)
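
A slightly fuller sketch of the same idea, assuming the normalization constant you want is the one that makes the area 1. The helper name plot_density_step and the test data are hypothetical. It passes all bin edges to step with where="post" so that the first and last bins are both drawn.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_density_step(d, edges, ax):
    """Draw an area-normalized step histogram of d on ax (hypothetical helper)."""
    counts, edges = np.histogram(d, bins=edges)
    density = counts / (counts.sum() * np.diff(edges))
    # Repeat the last height so where="post" also draws the final bin.
    ax.step(edges, np.append(density, density[-1]), where="post")
    ax.set_xscale("log")
    return density, edges


rng = np.random.default_rng(0)             # arbitrary seed
d = rng.lognormal(size=10000)              # hypothetical positive-valued data
edges = np.logspace(np.log10(d.min()), np.log10(d.max()), 50)

fig, ax = plt.subplots()
density, edges = plot_density_step(d, edges, ax)
```

Call plt.show() or fig.savefig(...) afterwards to see the result. To use a different constant, replace the density line with counts / my_norm_constant.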

Upvotes: 3
