abhiieor

Reputation: 3554

Getting multiple values from Python scipy.stats ppf function

For my dataset I am able to fit the best distribution using scipy.stats functions. For one instance, the best distribution is:

In[94]: best_dist
Out[94]: <scipy.stats._continuous_distns.chi_gen at 0x119649cd0>

In[95]: best_fit_params
Out[95]: 
(0.40982879700171049,
 0.10387428783818109,
 -4.5566762564110859e-19,
 0.89837054605455657)

Now I am trying to get the value at which the CDF reaches 0.95 (the 95th percentile) using the ppf function, which gives:

In[96]: best_dist.ppf(0.95,best_fit_params)
Out[96]: array([ 1.44854045,  0.74815691, nan,  1.89330302])

I do not understand why an array of length 4 is returned when I am expecting only one value. If one of these is my answer, which one is it?

Upvotes: 1

Views: 746

Answers (1)

ev-br

Reputation: 26090

The correct usage is to unpack your best_fit_params:

In [1]: param = (0.40982879700171049,
   ...:  0.10387428783818109,
   ...:  -4.5566762564110859e-19,
   ...:  0.89837054605455657)

In [2]: from scipy.stats import beta

In [3]: beta.ppf(0.95, *param)     # notice the asterisk
Out[3]: 0.89837054605311872

Explanation: beta.shapes is "a, b", so the signature of beta.ppf is actually ppf(self, q, a, b, loc=0, scale=1). Your best_fit_params is a tuple of four values: a, b, loc and scale, in that order.
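To make the ordering explicit, here is a minimal sketch that splits the fitted tuple into shape parameters, loc and scale before calling ppf. It assumes, as above, that the fitted distribution is beta; the names shape_args and q95 are only illustrative.

```python
import numpy as np
from scipy.stats import beta

# Fitted parameters from the question, in the order (a, b, loc, scale).
params = (0.40982879700171049,
          0.10387428783818109,
          -4.5566762564110859e-19,
          0.89837054605455657)

# The last two entries are always loc and scale; everything before them
# is a shape parameter (beta.numargs says how many shapes to expect).
*shape_args, loc, scale = params

# Keyword form: makes it obvious which value plays which role.
q95 = beta.ppf(0.95, *shape_args, loc=loc, scale=scale)

# Positional form with the asterisk: equivalent to the call above.
q95_unpacked = beta.ppf(0.95, *params)

print(q95, q95_unpacked)
```

Both calls return a single scalar, because every parameter is passed as its own argument rather than as one tuple.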

I'm not sure about your In[96], however. In any reasonably recent scipy install, calling beta.ppf with two arguments should raise an error, I'd think (it needs at least three: one for q and two more for a and b).
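One possible reading of In[96], assuming best_dist really is the chi object shown in Out[94]: chi has a single shape parameter, df, so passing the whole tuple as one argument makes scipy treat it as an array of four df values and broadcast over it. The sketch below reproduces that behaviour; it is an illustration under that assumption, not a statement about how the fit was produced.

```python
import numpy as np
from scipy.stats import chi

# The same tuple as in the question, passed WITHOUT the asterisk.
params = (0.40982879700171049,
          0.10387428783818109,
          -4.5566762564110859e-19,
          0.89837054605455657)

# The tuple is interpreted as four separate values of df, so ppf is
# evaluated once per entry and an array of length 4 comes back.
broadcast_result = chi.ppf(0.95, params)

# df must be positive; the third entry is (very slightly) negative,
# which is why that slot is nan.
print(broadcast_result)
```

Under this reading none of the four numbers is the answer being sought, since each treats one fitted value as df with the default loc=0 and scale=1.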

Upvotes: 2
