Reputation: 3554
For my dataset I am able to fit the best distribution using scipy.stats functions. For one instance, the best distribution is:
In[94]: best_dist
Out[94]: <scipy.stats._continuous_distns.chi_gen at 0x119649cd0>
In[95]: best_fit_params
Out[95]:
(0.40982879700171049,
0.10387428783818109,
-4.5566762564110859e-19,
0.89837054605455657)
Now I am trying to get the value at which the CDF reaches 0.95 (the 95th percentile) using the ppf function, which gives:
In[96]: best_dist.ppf(0.95,best_fit_params)
Out[96]: array([ 1.44854045, 0.74815691, nan, 1.89330302])
I don't understand why an array of length 4 is returned when I am expecting only one value. If one of these is my answer, which one is it?
Upvotes: 1
Views: 746
Reputation: 26090
The correct usage is to unpack your best_fit_params:
In [1]: param = (0.40982879700171049,
...: 0.10387428783818109,
...: -4.5566762564110859e-19,
...: 0.89837054605455657)
In [2]: from scipy.stats import beta
In [3]: beta.ppf(0.95, *param) # notice the asterisk
Out[3]: 0.89837054605311872
Explanation: beta.shapes is "a, b", so the signature of beta.ppf is actually ppf(self, q, a, b, loc=0, scale=1). Your best_fit_params is a tuple of four values: a, b, loc and scale, respectively.
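To make the mapping from the fitted tuple to the ppf arguments explicit, here is a minimal sketch. It assumes the four values really are (a, b, loc, scale) for a beta distribution; percentile_from_fit is a hypothetical helper name used only for illustration, not part of scipy.

```python
from scipy.stats import beta

# Fitted parameters copied from the question, read as (a, b, loc, scale).
params = (0.40982879700171049,
          0.10387428783818109,
          -4.5566762564110859e-19,
          0.89837054605455657)


def percentile_from_fit(dist, fit_params, q):
    """Hypothetical helper: split a fit tuple into shapes, loc and scale.

    Assumes the last two entries are loc and scale and everything
    before them is a shape parameter.
    """
    *shapes, loc, scale = fit_params
    return dist.ppf(q, *shapes, loc=loc, scale=scale)


# Positional unpacking, as in the answer above.
p95_unpacked = beta.ppf(0.95, *params)

# The same call with loc and scale passed by name.
p95_named = percentile_from_fit(beta, params, 0.95)

print(p95_unpacked, p95_named)
```

Both calls pass the same arguments to ppf, so they return the same single number. Naming loc and scale just makes it harder to misplace a value.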
I'm not sure about your In[96], however. In any reasonably recent scipy install, calling beta.ppf with only two arguments should raise an error, I'd think, because it needs at least three: one for q and two more for a and b.
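One possible explanation for the four-element array, assuming best_dist really is scipy.stats.chi as Out[94] in the question shows: chi has a single shape parameter, df, so a tuple passed in that position is treated as four separate df values and broadcast, giving one result per element. The sketch below checks this reading. It does not settle which distribution the four fitted values actually belong to.

```python
import numpy as np
from scipy.stats import chi

# The tuple from the question, passed as ONE argument (not unpacked).
params = (0.40982879700171049,
          0.10387428783818109,
          -4.5566762564110859e-19,
          0.89837054605455657)

print(chi.shapes)  # chi takes a single shape parameter: 'df'

# The tuple lands in the df slot and is broadcast element-wise.
as_tuple = chi.ppf(0.95, params)

# Equivalent: evaluate ppf once per element, each used as its own df.
one_by_one = np.array([chi.ppf(0.95, df) for df in params])

print(as_tuple)
print(np.allclose(as_tuple, one_by_one, equal_nan=True))
```

Under that reading, the nan in the third position comes from the negative value being used as df, which is outside the valid range df > 0. None of the four numbers is the percentile of the fitted distribution.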
Upvotes: 2