Reputation: 5286
Take a look at this example:
import matplotlib.pyplot as plt
l = [3,3,3,2,1,4,4,5,5,5,5,5,5,5,5,5]
plt.hist(l,normed=True)
plt.show()
The output is posted as a picture. I have two questions:
a) Why are only the 4 and 5 bins centered on their values? Shouldn't the others be centered as well? Is there a trick to center them?
b) Why are the bins not normalised to proportions? I want the y values of all the bins to sum to one.
Note that my real example contains many more values in the list, but they are all discrete.
Upvotes: 11
Views: 16916
Reputation: 18521
You should adjust the keyword arguments of the plt.hist function. There are many of them, and the documentation can help you answer many of these questions.
a.) You can pass the keywords bins=range(1,7) and align='left'. Setting the bins keyword to a sequence gives the borders of each bin, here [1,2], [2,3], [3,4], ..., [5,6]. With align='left', each bar is centered on the left edge of its bin, which is the integer value itself.
b.) Check your bin widths. From the matplotlib.pyplot.hist documentation:
If True, the first element of the return tuple will be the counts normalized to form a probability density, i.e., n/(len(x)*dbin). In a probability density, the integral of the histogram should be 1; you can verify that with a trapezoidal integration of the probability density function:
This means that the area under your bins sums to one, not the heights. By default the data range [1, 5] is split into 10 equal bins, each 0.4 wide, so the heights are scaled up and do not add up to 1. With unit-width bins such as bins=range(1,7), each height equals the proportion of values in that bin. Note that rwidth only controls how wide each bar is drawn relative to its bin and does not affect the normalization. With rwidth=1 the bars fill their bins, and you get a good looking plot:
plt.hist(l, bins=range(1,7), align='left', rwidth=1, normed=True)
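To check the normalization numerically, here is a minimal sketch using numpy.histogram, which follows the same binning rules as plt.hist. It compares the default 10 bins with unit-width bins. The variable names are only illustrative, and density=True is assumed here as the counterpart of normed=True.

```python
import numpy as np

l = [3, 3, 3, 2, 1, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5]

# Default-style binning: 10 equal-width bins spanning min(l)..max(l),
# so each bin is 0.4 wide.
default_density, default_edges = np.histogram(l, bins=10, density=True)
default_widths = np.diff(default_edges)

# Unit-width bins with edges 1, 2, ..., 6.
unit_density, unit_edges = np.histogram(l, bins=range(1, 7), density=True)
unit_widths = np.diff(unit_edges)

# The area (height * width) is 1 in both cases ...
default_area = float(np.sum(default_density * default_widths))
unit_area = float(np.sum(unit_density * unit_widths))

# ... but the heights themselves only sum to 1 when every bin has width 1.
default_height_sum = float(np.sum(default_density))
unit_height_sum = float(np.sum(unit_density))

print("default bins: area =", default_area, "sum of heights =", default_height_sum)
print("unit bins:    area =", unit_area, "sum of heights =", unit_height_sum)
print("proportions per value 1..5:", unit_density)
```

On newer Matplotlib releases, where normed has been replaced by density, the equivalent plotting call would be plt.hist(l, bins=range(1,7), align='left', rwidth=1, density=True).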
Upvotes: 16