Reputation: 159
This is something I've been confused about for a while and I was hoping for some assistance.
I'm trying to use scipy.stats.kstest to test my distribution against another distribution, which is simply x=y, so that I can get a p-value. The examples online give something like:
>>> x = np.linspace(-15, 15, 9)
>>> scipy.stats.kstest(x, 'norm')
(0.44435602715924361, 0.038850142705171065)
but I'm not sure how to change the expected distribution from norm to x=y. Also, my 'real' distribution has both x and y values (it's the CDF of a uniform distribution). How would I plug it into this?
Upvotes: 1
Views: 200
Reputation: 40878
It seems like you're looking for scipy.stats.ks_2samp:
This is a two-sided test for the null hypothesis that 2 independent samples are drawn from the same continuous distribution.
import numpy as np
from scipy import stats
np.random.seed(123)
# Draw random samples from two normal distributions
# with different means/stdevs. The resulting p-value
# should be low (strong evidence against the null, so reject it).
rvs1 = stats.norm.rvs(size=400, loc=0., scale=1)
rvs2 = stats.norm.rvs(size=400, loc=0.5, scale=1.5)
p_lo = stats.ks_2samp(rvs1, rvs2)[1]
print(p_lo)
# 1.29793098188e-10
# Same test for two random samples drawn from nearly the same
# distribution (loc differs by only 0.01); should yield a high p-value.
rvs3 = stats.norm.rvs(size=400, loc=0.01, scale=1)
p_hi = stats.ks_2samp(rvs1, rvs3)[1]
print(p_hi)
# 0.855599637503
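If instead you have a single sample and "x=y" means the reference CDF is F(x) = x on [0, 1] (that is, the standard uniform distribution, which is my reading of the question and may not match your data), a one-sample kstest could look roughly like the sketch below. The sample here is made up for illustration, and identity_cdf is a hypothetical helper name, not part of scipy. Note that kstest expects the raw observations, not precomputed (x, y) points of a CDF.

```python
import numpy as np
from scipy import stats

# Illustrative data only: 400 draws that really are Uniform(0, 1).
rng = np.random.default_rng(123)
sample = rng.uniform(0.0, 1.0, size=400)

# Option 1: name the reference distribution. 'uniform' defaults to
# loc=0, scale=1, whose CDF is F(x) = x on [0, 1].
stat_named, p_named = stats.kstest(sample, 'uniform')

# Option 2: pass the reference CDF as a callable (hypothetical helper).
# Clipping keeps it a valid CDF outside [0, 1].
def identity_cdf(x):
    return np.clip(x, 0.0, 1.0)

stat_callable, p_callable = stats.kstest(sample, identity_cdf)

# A sample that is clearly not Uniform(0, 1), for contrast.
stat_skewed, p_skewed = stats.kstest(sample ** 2, 'uniform')

print(stat_named, p_named)
print(stat_callable, p_callable)
print(stat_skewed, p_skewed)
```

Both options describe the same reference distribution, so they should give the same statistic. Use ks_2samp as above when you have two samples, and kstest when you have one sample and a known reference CDF.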
Upvotes: 1