Denis

Reputation: 3707

Why does stats.rv_continuous return the same value every time?

I have the following code snippet:

from numpy import log
from scipy import stats

class my_distribution(stats.rv_continuous):
    def __init__(self):
        super().__init__(a=0, b=1)

    def _cdf(self, x):
        return 0.2 * log(x)


def main():
    distribution = my_distribution()

    val = [distribution.rvs() for i in range(10000)]

    sum(val) == 10000 # why !?

Interestingly, with other functions (a uniform distribution, for example), I get different random values.

Upvotes: 1

Views: 352

Answers (1)

ev-br

Reputation: 26030

In [24]: class distr_gen(stats.rv_continuous):
   ....:     def _pdf(self, x):
   ....:         return 1./(1.2*x)**0.8
   ....:     

In [25]: d = distr_gen(a=0., b=1., name='xxx')
In [26]: d.rvs(size=10)
Out[26]: 
array([  2.41056898e-05,   6.05777448e-04,   7.62206590e-06,
         1.46271162e-07,   1.49455630e-05,   6.84527767e-05,
         1.62679847e-04,   1.28736701e-05,   4.59315246e-05,
         4.15976052e-05])

There are several problems with the code in your OP:

  1. The cdf does not correspond to the pdf
  2. cdf(lower bound) should be 0 and cdf(upper bound) should be 1, which is not the case for your formula: 0.2 * log(x) is -inf at x = 0 and 0 at x = 1.
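
The boundary conditions are easy to check numerically. A minimal sketch, assuming the log in the question is meant to be numpy's natural log (asker_cdf is just an illustrative name, not part of scipy):

```python
import numpy as np


def asker_cdf(x):
    # The formula from the question, taken as written.
    return 0.2 * np.log(x)


# A valid cdf on [0, 1] must give 0 at the lower bound and 1 at the upper bound.
with np.errstate(divide="ignore"):  # log(0) would otherwise emit a warning
    lower = asker_cdf(0.0)
upper = asker_cdf(1.0)

print(lower, upper)
```

Neither value matches what a cdf requires, so the numerical inversion behind rvs() has nothing sensible to work with.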

With a pdf this simple, you're probably best off correcting the error in the integration for the cdf and then inverting the cdf on paper. Add the result to your class as a _ppf method. Or, if all you need is random sampling, just generate a bunch of uniform random numbers and transform them with the ppf you've calculated.
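
As a rough sketch of both routes, assume the intended density on [0, 1] is 0.2 * x**(-0.8). That choice is an assumption made for illustration, not something stated in the question; it integrates to the cdf x**0.2, whose inverse is q**5. The class name power_gen and the name "power_example" are likewise made up:

```python
import numpy as np
from scipy import stats


class power_gen(stats.rv_continuous):
    """Illustrative distribution on [0, 1] with pdf 0.2 * x**(-0.8) (assumed)."""

    def _pdf(self, x):
        return 0.2 * x ** (-0.8)

    def _cdf(self, x):
        # Integral of the pdf from 0 to x, so cdf(0) = 0 and cdf(1) = 1.
        return x ** 0.2

    def _ppf(self, q):
        # Inverse of the cdf, solved by hand: q = x**0.2  =>  x = q**5.
        return q ** 5


dist = power_gen(a=0.0, b=1.0, name="power_example")

# Route 1: sample through the class; the default rvs() feeds uniform draws to _ppf.
samples = dist.rvs(size=10000, random_state=0)

# Route 2: plain inverse-transform sampling, no scipy class needed.
rng = np.random.default_rng(0)
manual = rng.uniform(size=10000) ** 5

print(samples[:5])
print(samples.mean(), manual.mean())
```

Under this assumed density the exact mean is 1/6, so both sample means should land close to 0.167, and the samples vary from draw to draw instead of collapsing to a single value.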

Upvotes: 4
