sato
sato

Reputation: 862

Generating random numbers from custom continuous probability density function

I am trying to generate random numbers from a custom continuous probability density function, which is:

0.001257 *x^4 * e^(-0.285714 *x)

to do so, I use (on python 3) scipy.stats.rv_continuous and then rvs() to generate them

from decimal import Decimal
from scipy import stats
import numpy as np

class my_distribution(stats.rv_continuous):
    def _pdf(self, x):
        return (Decimal(0.001257) *Decimal(x)**(4)*Decimal(np.exp(-0.285714 *x)))

distribution = my_distribution()
distribution.rvs()

note that I used Decimal to get rid of an OverflowError: (34, 'Result too large').

Still, I get an error RuntimeError: Failed to converge after 100 iterations.

What's going on there? What's the proper way to achieve what I need to do?

Upvotes: 1

Views: 1811

Answers (1)

Peter O.
Peter O.

Reputation: 32908

I've found out the reason for your issue.

rvs by default uses numerical integration, which is a slow process and can fail in some cases. Your PDF is presumably one of those cases, where the left side grows without bound.

For this reason, you should specify the distribution's support as follows (the following example shows that the support is in the interval [-4, 4]):

distribution = my_distribution(a = -4, b = 4)

With this interval, the PDF will be bounded from above, allowing the integration (and thus the random variate generation) to work as normal. Note that by default, rv_continuous assumes the distribution is supported on the entire real line.

However, this will only work for the particular PDF you give here, not necessarily for arbitrary PDFs.


Usually, when you only give a PDF to your rv_continuous subclass, the subclass's rvs, mean, etc. Will then be very slow, because the method needs to integrate the PDF every time it needs to generate a random variate or calculate a statistic. For example, random variate generation requires using numerical integration to integrate the PDF, and this process can fail to converge depending on the PDF.

In future cases when you're dealing with arbitrary distributions, and particularly when speed is at a premium, you will thus need to add an _rvs method that uses its own sampler. One example is a much simpler rejection sampler given in the answer to a related question.

See also my section "Sampling from an Arbitrary Distribution".

Upvotes: 3

Related Questions