HJA24
HJA24

Reputation: 363

Get shape parameters of scipy.stats.beta from shape of PDF

I have the following script from the docs

import numpy as np
import scipy.stats as ss
import matplotlib.pyplot as plt

fig, ax = plt.subplots(1)


a = 1.25
b = 1.5

x = np.linspace(ss.beta.ppf(0.00, a, b),
                ss.beta.ppf(1, a, b), 100)

ax.plot(x, ss.beta.pdf(x, a, b),
       'r-', lw=5, alpha=0.6, label='beta pdf')

plt.show()

I would like to reverse the process and get a and b from the shape of the pdf. I will only have the output of

ss.beta.pdf(x, a, b)

where only x is known (specified earlier). So how do I get a and b if I only have the following data:

[0.         0.63154416 0.74719517 0.82263393 0.87936158 0.92490497
 0.96287514 0.9953117  1.02349065 1.04826858 1.07025112 1.08988379
 1.10750482 1.12337758 1.13771153 1.15067624 1.16241105 1.17303196
 1.18263662 1.19130806 1.19911747 1.20612637 1.21238825 1.21794995
 1.22285265 1.22713276 1.23082258 1.23395089 1.23654342 1.23862319
 1.24021091 1.24132519 1.2419828  1.2421989  1.24198713 1.24135982
 1.24032811 1.23890203 1.23709059 1.23490192 1.23234325 1.22942106
 1.22614106 1.2225083  1.21852712 1.21420131 1.209534   1.20452779
 1.19918472 1.19350628 1.18749347 1.18114672 1.17446601 1.16745076
 1.16009989 1.1524118  1.14438438 1.13601495 1.12730028 1.11823656
 1.10881937 1.09904366 1.0889037  1.07839304 1.06750447 1.05622997
 1.0445606  1.03248649 1.01999669 1.00707911 0.99372038 0.97990571
 0.96561876 0.95084139 0.93555349 0.91973269 0.90335407 0.88638972
 0.86880835 0.85057469 0.83164884 0.81198535 0.79153224 0.77022959
 0.74800782 0.7247854  0.70046587 0.67493372 0.64804878 0.61963821
 0.58948474 0.55730896 0.52274112 0.48527403 0.44417862 0.39833786
 0.34587496 0.2831383  0.20072303 0.        ]

After finding this answer and this blogpost I have an idea of what needs to be done:

def find_parameters(x1, p1, x2, p2):
    """Find parameters for a beta random variable X
    so that P(X > x1) = p1 and P(X > x2) = p2.
    """

    def objective(v):
        (a, b) = v
        temp = (ss.beta.cdf(x1, a, b) - p1) ** 2.0
        temp += (ss.beta.cdf(x2, a, b) - p2) ** 2.0
        return temp

    # arbitrary initial guess of (1, 1) for parameters
    xopt = ss.optimize.fmin(objective, (1, 1))
    return [xopt[0], xopt[1]]

However, this function only uses two points. I modified the solution as follows:

def find_parameters(x, p):

    def objective(v):
        (a, b) = v
        return np.sum((ss.beta.cdf(x, a, b) - p) ** 2.0)

    # arbitrary initial guess of (1, 1) for parameters
    xopt = optimize.fmin(objective, (1, 1))
    return [xopt[0], xopt[1]]

fitted_a, fitted_b = find_parameters(x=x, p=ss.beta.pdf(x, a, b))

This results in the following incorrect values:

a: 4.2467303147231366e-10

and

b: 1.9183210434237443

Please advice

Upvotes: 1

Views: 131

Answers (1)

foglerit
foglerit

Reputation: 8279

Your objective function is calculating the error to a Beta CDF (ss.beta.cdf(x, a, b)) but you are trying to fit a Beta PDF. Make this simple change and your code will work:

def find_parameters(x, p):

    def objective(v):
        (a, b) = v
        return np.sum((ss.beta.pdf(x, a, b) - p) ** 2.0)

    # arbitrary initial guess of (1, 1) for parameters
    xopt = optimize.fmin(objective, (1, 1))
    return [xopt[0], xopt[1]]

Upvotes: 1

Related Questions