Reputation: 129
I want to create a Poisson distribution with mean is 2 number of elements is 10000 has min value 1 and tail value 140 so far I can only specify the min
stats.poisson.rvs( 2, loc = 1,size=10000)
and the following distribution is generated
How can I make it end at 140 instead of 11
Upvotes: 1
Views: 1690
Reputation: 17159
The chance that a sample out of a Poisson distribution with mean equal to 2 is greater than 140 is so small that you would not get one out of just 10000 samples.
Indeed, the Poisson distribution has only one parameter λ and a probability mass function defined so that
P(x=k) = λ^k * exp(-λ) / k!
The mean value is also equal to λ. If λ = 2 then
P(x=140) = 7.7e-199
so if there are 10000 samples the chance that there would be at least one sample at 140 out of 10000 would be less than 7.7e-195. This is a number so small that you cannot expect this to occur in a lifetime.
It is a bit harder to compute the probability that a sample out of Poisson distribution with λ=2 lies above 140. You can use scipy.stats.poisson.cdf
to see that
P(x>=22) = 1 - scipy.stats.poisson.cdf(21,2) = 5.5e-16
Therefore even the chance that you would have one sample out of 10000 above 21 is less than 5.5e-12. Computing P(x>=140)
in the same way would return 0 because of floating point rounding in intermediate results.
Conclusion
If you want distribution mean equal 2.0, and a heavy tail reaching up to 140 on 10000 samples you need a distribution different from Poisson. You could consider Pareto distribution, scipy.stats.pareto
with parameter b = 2.
Here is a comparison of 10000 random samples from
scipy.stats.poisson.rvs(2,size=10000)
and
numpy.rint(scipy.stats.pareto.rvs(2,size=10000))
It is clearly visible that Pareto distribution with the same mean has a much heavier tail.
For reference the code for the plot is below
import matplotlib.pyplot as plt
import scipy.stats
import numpy as np
pareto_x = np.rint(scipy.stats.pareto.rvs(2,size=10000))
poisson_x = scipy.stats.poisson.rvs(2,size=10000)
plt.figure(figsize=(8,4))
plt.subplot(121)
plt.title("Poisson distribution, a = 2")
plt.xlabel("sample number")
plt.ylabel("sample value")
plt.axis([0,10000,0,180])
plt.plot(range(0,10000),poisson_x,"o")
plt.subplot(122)
plt.axis([0,10000,0,180])
plt.title("Pareto distribution, b = 2")
plt.xlabel("sample number")
plt.plot(range(0,10000),pareto_x,"o")
plt.subplots_adjust(hspace=0.4,bottom=0.2)
plt.savefig("poisson_pareto.png")
Upvotes: 4