Reputation: 1240
import numpy as np
import matplotlib.pyplot as plt
bins=np.arange(0,1,0.1)
discrete_pdf=np.power(bins,1.5)
discrete_pdf=discrete_pdf/np.sum(discrete_pdf)
print(bins)
print(discrete_pdf)
print(np.sum(discrete_pdf))
plt.plot(bins,discrete_pdf)
plt.show()
values=np.random.choice(bins, 100000, p=discrete_pdf)
plt.hist(values,10)
plt.show()
Am I using hist incorrectly, or is this a "feature"/bug?
If you force the hist function to make 10*n bins (e.g. 20 or 100), the plot looks reasonable, but it has empty spaces due to the finer binning.
Upvotes: 0
Views: 629
Reputation: 25362
This is because you are letting matplotlib determine the bins automatically by using plt.hist(values,10): the second argument is the number of bins. With an integer, the bin edges run from the smallest sampled value (0.1, since 0 has probability 0) to the largest (0.9), so each bin is 0.08 wide and does not line up with your 0.1 spacing. The automatic bin edges generated by matplotlib for 10 bins are:
[0.1 0.18 0.26 0.34 0.42 0.5 0.58 0.66 0.74 0.82 0.9 ]
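To see why these edges produce a gap, here is a minimal sketch that counts the samples per automatic bin with np.histogram. The seeded generator and the names support, pmf and rng are my own choices for illustration, not part of the question's code.

```python
import numpy as np

# Seeded generator so the run is reproducible (an assumption, not in the question).
rng = np.random.default_rng(0)

# Same discrete distribution as in the question.
support = np.arange(0, 1, 0.1)
pmf = np.power(support, 1.5)
pmf = pmf / pmf.sum()
values = rng.choice(support, 100000, p=pmf)

# With an integer, the edges span values.min() to values.max() in equal steps.
counts, edges = np.histogram(values, bins=10)
print(edges)
print(counts)

# Only nine distinct values are ever drawn (0 has probability 0),
# so at least one of the ten bins must stay empty.
print("empty bins:", np.flatnonzero(counts == 0))
```

Nine distinct values spread over ten bins of width 0.08 cannot fill every bin, which is the hole visible in the plot.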
You can pass in custom bins rather than letting matplotlib decide them automatically. The solution is therefore to pass in a list (or array) of bin edges, plotting_bins = np.arange(0,1.1,0.1), noting that an extra edge has been added on the end.
import numpy as np
import matplotlib.pyplot as plt
bins=np.arange(0,1,0.1)
discrete_pdf=np.power(bins,1.5)
discrete_pdf=discrete_pdf/np.sum(discrete_pdf)
values=np.random.choice(bins, 100000, p=discrete_pdf)
plotting_bins=np.arange(0,1.1,0.1) # need to add an extra bin when plotting
fig, (ax1,ax2) = plt.subplots(1,2,figsize=(6,4))
ax1.hist(values, 10)
ax1.set_title("Automatic bins")
ax2.hist(values, bins=plotting_bins)
ax2.set_title("Manual bins")
ax1.set_xlim(0,1)
ax2.set_xlim(0,1)
plt.tight_layout()
plt.show()
If you want to know how the automatic bins are created when you provide an integer for the bins argument, look at the documentation of numpy.histogram() (which is what plt.hist() uses behind the scenes):
bins : int or sequence of scalars or str, optional
If bins is an int, it defines the number of equal-width bins in the given range
and
range : (float, float), optional
The lower and upper range of the bins. If not provided, range is simply (a.min(), a.max()).
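The quoted range parameter suggests a variation: keep bins as an integer but fix the range yourself. The sketch below is only an illustration under my own assumptions (the seeded generator, the variable names, and the choice of range=(-0.05, 0.95) to centre each bin on one of the discrete values are not from the original post).

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility (my assumption)
support = np.arange(0, 1, 0.1)
pmf = np.power(support, 1.5)
pmf = pmf / pmf.sum()
values = rng.choice(support, 100000, p=pmf)

# Default: the range is (values.min(), values.max()).
default_edges = np.histogram_bin_edges(values, bins=10)

# Explicit range: ten bins of width 0.1 between 0 and 1.
ranged_edges = np.histogram_bin_edges(values, bins=10, range=(0.0, 1.0))

# Shifting the range by half a bin width puts each discrete value in the
# middle of its own bin, away from the edges.
centred_counts, centred_edges = np.histogram(values, bins=10, range=(-0.05, 0.95))

print(default_edges)
print(ranged_edges)
print(centred_counts)
```

Centring the bins avoids having sampled values sit exactly on a bin edge, where floating-point rounding could decide which bin they fall into. The same bins and range arguments can be passed to plt.hist().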
Upvotes: 1