Reputation: 3511
I want to draw a histogram and a line plot on the same graph. To do that, I need the histogram to be a probability mass function, so the y-axis should show probability values. I don't know how to do that; using the normed option didn't help. Below is my source code and a sample of the data. I would be very grateful for any suggestions.
data = [12565, 1342, 5913, 303, 3464, 4504, 5000, 840, 1247, 831, 2771, 4005, 1000, 1580, 7163, 866, 1732, 3361, 2599, 4006, 3583, 1222, 2676, 1401, 2598, 697, 4078, 5016, 1250, 7083, 3378, 600, 1221, 2511, 9244, 1732, 2295, 469, 4583, 1733, 1364, 2430, 540, 2599, 12254, 2500, 6056, 833, 1600, 5317, 8333, 2598, 950, 6086, 4000, 2840, 4851, 6150, 8917, 1108, 2234, 1383, 2174, 2376, 1729, 714, 3800, 1020, 3457, 1246, 7200, 4001, 1211, 1076, 1320, 2078, 4504, 600, 1905, 2765, 2635, 1426, 1430, 1387, 540, 800, 6500, 931, 3792, 2598, 5033, 1040, 1300, 1648, 2200, 2025, 2201, 2074, 8737, 324]
import matplotlib.pyplot as plt
plt.style.use('ggplot')
plt.rc('xtick',labelsize=12)
plt.rc('ytick',labelsize=12)
plt.xlabel("Incomes")
plt.hist(data, bins=50, color="blue", alpha=0.5, normed=True)
plt.show()
Upvotes: 7
Views: 10307
Reputation: 75
This is old, but since I found it and was about to use it before I noticed some mistakes, I figured I'd add a comment with a couple of fixes. In the example, @mmdanziger passes the bin edges to plt.bar, but you actually need to use the bin centers. The example also assumes that the bins are of equal width, which is fine most of the time, but you can pass an array of widths instead, which keeps you from inadvertently making a mistake. So here's a more complete example:
import numpy as np
heights, bins = np.histogram(data, bins=50)
heights = heights/sum(heights)
bin_centers = 0.5*(bins[1:] + bins[:-1])
bin_widths = np.diff(bins)
plt.bar(bin_centers, heights, width=bin_widths, color="blue", alpha=0.5)
@mmdanziger's other option, passing weights=np.ones_like(data)/len(data) to plt.hist(), does the same thing, and for many it is the easier approach.
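For context on why the normed option (called density in current matplotlib and NumPy) does not give probabilities: it scales the bars so that their total area is 1, whereas the weights trick scales them so that their heights sum to 1. The following is only a small sketch using np.histogram on the first few values from the question; the variable names and the choice of 5 bins are my own assumptions for illustration.

```python
import numpy as np

# First ten values from the question's data, used only for illustration.
sample = np.array([12565, 1342, 5913, 303, 3464, 4504, 5000, 840, 1247, 831])

# density=True scales the bar heights so that the total AREA equals 1.
density, edges = np.histogram(sample, bins=5, density=True)
widths = np.diff(edges)

# weights = 1/N per observation makes the bar HEIGHTS sum to 1 (a PMF).
weights = np.ones_like(sample, dtype=float) / len(sample)
pmf, _ = np.histogram(sample, bins=5, weights=weights)

area = float(np.sum(density * widths))  # area under the density bars
total = float(pmf.sum())                # sum of the PMF bar heights
print(area, total)
```

Multiplying each density value by its bin width gives the same numbers as the weights approach, so either route works as long as you know which one you are plotting.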
Upvotes: 1
Reputation: 4658
As far as I know, matplotlib does not have this function built in. However, it is easy enough to replicate:
import numpy as np
heights,bins = np.histogram(data,bins=50)
heights = heights/sum(heights)
plt.bar(bins[:-1], heights, width=(max(bins) - min(bins))/len(heights), color="blue", alpha=0.5)
Edit: Here is another approach from a similar question:
weights = np.ones_like(data)/len(data)
plt.hist(data, bins=50, weights=weights, color="blue", alpha=0.5)
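Since the original goal was a histogram and a line plot on the same axes, here is a rough sketch of how the two could be combined. The line drawn here simply connects the bar heights at the bin centers and is a placeholder assumption; substitute whatever curve you actually want to compare against. The 10 bins, the data subset, and the output file name are also arbitrary choices of mine.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to get a window with plt.show()
import matplotlib.pyplot as plt
import numpy as np

# First twenty values from the question's data, used only for illustration.
data = np.array([12565, 1342, 5913, 303, 3464, 4504, 5000, 840, 1247, 831,
                 2771, 4005, 1000, 1580, 7163, 866, 1732, 3361, 2599, 4006])

# Normalize the counts so the bar heights sum to 1.
counts, edges = np.histogram(data, bins=10)
pmf = counts / counts.sum()
centers = 0.5 * (edges[1:] + edges[:-1])

fig, ax = plt.subplots()
# align="edge" places each bar's left side on the corresponding bin edge.
ax.bar(edges[:-1], pmf, width=np.diff(edges), align="edge", color="blue", alpha=0.5)
# Placeholder line: replace pmf with your own y-values on the same scale.
ax.plot(centers, pmf, color="red", marker="o")
ax.set_xlabel("Incomes")
ax.set_ylabel("Probability")
fig.savefig("pmf_hist.png")
```

Because both the bars and the line share one y-axis in probability units, any curve you overlay needs to be expressed as probability per bin rather than as a density.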
Upvotes: 11