Reputation: 43
What is the Python equivalent of the Beta distribution in Excel? In Excel, the formula is:
=BETA.DIST(A2,A3,A4,FALSE,A5,A6).
This returns the Beta probability density function for the given parameters as a decimal value.
However, the SciPy reference does not define the function and its parameters in the same form as Excel does.
I don't understand how to do this in SciPy and pass the parameters correctly.
Upvotes: 4
Views: 2432
Reputation: 114811
For the Excel BETA.DIST function with signature BETA.DIST(x,alpha,beta,cumulative,[A],[B]) with cumulative = FALSE, use the function scipy.stats.beta.pdf as follows:
from scipy import stats
p = stats.beta.pdf(x, alpha, beta, loc=A, scale=B-A)
In other words, set loc to be the lower bound of the support interval [A, B], and set scale to the length of the interval.
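To see why loc and scale play this role, here is a small sketch that compares the direct call with a by-hand calculation: shift and rescale x onto [0, 1], evaluate the standard Beta density there, and divide by the interval length. The variable names (beta_param, direct, manual) are only illustrative, and the numbers are the ones from the Excel documentation example.

```python
from scipy import stats

x, alpha, beta_param = 2.0, 8.0, 10.0
A, B = 1.0, 3.0  # support interval [A, B]

# Density on [A, B], obtained by passing loc and scale.
direct = stats.beta.pdf(x, alpha, beta_param, loc=A, scale=B - A)

# The same density by hand: map x onto [0, 1], evaluate the
# standard Beta density, then divide by the interval length.
z = (x - A) / (B - A)
manual = stats.beta.pdf(z, alpha, beta_param) / (B - A)

print(direct, manual)  # the two values should agree
```

The division by B - A is what keeps the density integrating to 1 over the wider interval.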
For example, the documentation for BETA.DIST includes the example
=BETA.DIST(A2,A3,A4,FALSE,A5,A6)
where A2=2, A3=8, A4=10, A5=1 and A6=3. The value of the function is reported to be 1.4837646. The corresponding expression using scipy is:
In [59]: from scipy import stats
In [60]: x = 2
In [61]: alpha = 8
In [62]: beta = 10
In [63]: a = 1
In [64]: b = 3
In [65]: stats.beta.pdf(x, alpha, beta, loc=a, scale=b-a)
Out[65]: 1.4837646484375009
For the case cumulative=TRUE, use the function scipy.stats.beta.cdf. The same example given above reports the value of the CDF to be 0.6854706. Here's the calculation using scipy:
In [66]: stats.beta.cdf(x, alpha, beta, loc=a, scale=b-a)
Out[66]: 0.6854705810546875
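If you want a single call that reads like the Excel formula, the two cases can be wrapped in one function. This is only a sketch: excel_beta_dist is a hypothetical helper name, not part of SciPy, and it assumes that A and B default to 0 and 1 as they do in Excel when omitted. It does not try to reproduce Excel's error values for invalid arguments.

```python
from scipy import stats


def excel_beta_dist(x, alpha, beta, cumulative, A=0.0, B=1.0):
    """Hypothetical helper mirroring the argument order of Excel's BETA.DIST."""
    # Freeze the distribution on [A, B]: loc is the lower bound,
    # scale is the length of the interval.
    dist = stats.beta(alpha, beta, loc=A, scale=B - A)
    # cumulative=True -> CDF, cumulative=False -> PDF
    return dist.cdf(x) if cumulative else dist.pdf(x)


# The Excel documentation example: x=2, alpha=8, beta=10, A=1, B=3
print(excel_beta_dist(2, 8, 10, False, 1, 3))  # PDF, about 1.4837646
print(excel_beta_dist(2, 8, 10, True, 1, 3))   # CDF, about 0.6854706
```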
Upvotes: 0
Reputation: 710
As you can see here, the probability density function of the beta distribution in scipy has the same three parameters as Excel's BETA.DIST (Excel docs), as long as Excel's optional bounds A and B are left out.
ALPHA is equivalent to a and represents a parameter of the distribution.
BETA is equivalent to b and represents a parameter of the distribution.
X is equivalent to x and is the value at which the distribution should be evaluated.
Whether the Cumulative parameter in Excel is TRUE or FALSE is handled in scipy by calling different functions. If you want the cumulative distribution (Cumulative = TRUE), call myBeta.cdf(<myParams>); if you want the probability density function (Cumulative = FALSE), call myBeta.pdf(<myParams>).
That means:
BETA.DIST(X,Alpha,Beta,TRUE) <=> scipy.stats.beta.cdf(x,a,b)
and
BETA.DIST(X,Alpha,Beta,FALSE) <=> scipy.stats.beta.pdf(x,a,b)
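A short sketch of this mapping, using a frozen distribution object in the spirit of myBeta above. The values a=8, b=10 and x=0.5 are just illustrative choices, and the equivalence shown applies when Excel's optional A and B are omitted, so the support is [0, 1].

```python
from scipy import stats

a, b = 8, 10  # Alpha and Beta in Excel
x = 0.5       # X in Excel, inside the default support [0, 1]

# Freeze the shape parameters once, then call pdf/cdf on the object.
my_beta = stats.beta(a, b)

# BETA.DIST(X,Alpha,Beta,FALSE) corresponds to the PDF
print(my_beta.pdf(x), stats.beta.pdf(x, a, b))

# BETA.DIST(X,Alpha,Beta,TRUE) corresponds to the CDF
print(my_beta.cdf(x), stats.beta.cdf(x, a, b))
```

Both calls on each line should print the same number; the frozen form just avoids repeating a and b.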
Upvotes: 4