Reputation: 387
I'm trying to perform a Kolmogorov-Smirnov test to compare an empirical distribution with the F distribution (I know these can't be compared directly, but I will employ bootstrapping). I'm having a problem with the scipy KS test:
from scipy import stats
readLengths = [list,of,int,values,...]
x = stats.f.fit(readLengths)
dfn=x[0]
dfd=x[1]
stats.kstest(readLengths,stats.f.rvs(dfn,dfd,size=100))
I am getting the error
TypeError: 'numpy.ndarray' object is not callable
and it points to the stats.kstest line. I assume this is a problem with the readLengths array, but the docs say it can take a 1D array, so not sure why I'm having this problem. Also, interestingly in this function you can name the normal distribution with 'norm', but 'f' doesn't appear to be valid, despite that being the scipy name for the F distribution.
Upvotes: 5
Views: 2303
Reputation: 74242
From the docs:
cdf : str or callable
If a string, it should be the name of a distribution in scipy.stats. If rvs is a string then cdf can be False or the same as rvs. If a callable, that callable is used to calculate the cdf.
The second argument to kstest should be either a string or a callable object that accepts quantiles as inputs and returns the CDF. Instead you are passing it
stats.f.rvs(dfn,dfd,size=100)
which evaluates to an np.ndarray.
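To make the distinction concrete, here is a small sketch. The data is synthetic and the shape parameters 5 and 20 are arbitrary values I picked for illustration, not anything from your data. It shows that rvs gives back an array of draws, whereas the cdf method of a distribution is a callable. If comparing two samples is what you actually want, scipy.stats.ks_2samp is the two-sample version of the test.

```python
import numpy as np
from scipy import stats

# Arbitrary shape parameters, chosen only for this illustration.
dfn, dfd = 5, 20

# rvs draws random variates: the result is an ndarray, not a function.
sample = stats.f.rvs(dfn, dfd, size=100, random_state=0)
print(type(sample).__name__, callable(sample))

# The cdf method of a frozen distribution is callable, which is what kstest expects.
print(callable(stats.f(dfn, dfd).cdf))

# Comparing two samples directly is a job for the two-sample test.
other = stats.f.rvs(dfn, dfd, size=100, random_state=1)
d, p = stats.ks_2samp(sample, other)
print(d, p)
```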
One option would be to construct a frozen distribution using your desired parameters, then pass its .cdf method as the second argument to kstest:
fdist = stats.f(dfn, dfd)
d, p = stats.kstest(readLengths, fdist.cdf)
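A slightly fuller sketch, in case it helps. It assumes synthetic data standing in for readLengths (the name read_lengths and the generating parameters are made up for the example). Note that stats.f.fit returns four values (dfn, dfd, loc, scale), so if you want the fitted location and scale to be used you would pass all of them on. It also shows that the name 'f' does work as a string, provided the parameters are supplied through args.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for readLengths (hypothetical data for illustration).
read_lengths = stats.f.rvs(5, 20, size=500, random_state=0)

# fit returns (dfn, dfd, loc, scale) for the F distribution.
params = stats.f.fit(read_lengths)

# Option 1: freeze the distribution and pass its cdf as a callable.
fdist = stats.f(*params)
d, p = stats.kstest(read_lengths, fdist.cdf)

# Option 2: name the distribution and pass its parameters via args.
d2, p2 = stats.kstest(read_lengths, 'f', args=params)

print(d, p)
print(d2, p2)
```

Both calls should give the same statistic, since each evaluates the same CDF with the same parameters. Bear in mind that the p-value is not strictly valid when the parameters were estimated from the same data, which is presumably why you plan to bootstrap.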
Upvotes: 3