Reputation: 109
I am attempting to run a Kolmogorov-Smirnov test using the ks_2samp function from scipy to determine whether histograms of data come from the same distribution. The returned p-value doesn't seem quite right sometimes, though.
For example with this histogram:
aa, bb, cc = ax1.hist(list1, numpy.arange(a-1, b+3, c), alpha = .5, align = 'mid', rwidth=1, linestyle = 'dashed', linewidth = 1.5)
dd, ee, ff = ax1.hist(list2, numpy.arange(a-1, b+3, c), alpha = .5, align = 'mid',rwidth=1)
print(ks_2samp(aa, dd)[1])
I get a p-value of about .96, which really doesn't seem right. Am I doing something wrong? Shouldn't these histograms be different enough that the p-value would be lower?
Upvotes: 1
Views: 881
Reputation: 31339
ks_2samp applies the Kolmogorov-Smirnov test to two samples and tests the null hypothesis that both come from the same distribution.
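Therefore ks_2samp takes the two samples themselves (here list1 and list2) as input, not the bin counts returned by hist (aa and dd in your code). Passing the counts compares the distribution of bar heights, not the distribution of the underlying data: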
ks_2samp(list1, list2)
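To illustrate the difference, here is a minimal sketch that runs the test both ways. The data are synthetic (two normal samples with different means) and the bin edges are arbitrary; sample1, sample2, and bins are stand-ins I made up, not the asker's list1, list2, or a, b, c.

```python
import numpy as np
from scipy.stats import ks_2samp

# Hypothetical data: two normal samples whose means differ by one sigma.
rng = np.random.default_rng(0)
sample1 = rng.normal(loc=0.0, scale=1.0, size=500)
sample2 = rng.normal(loc=1.0, scale=1.0, size=500)

# Bin both samples on the same (arbitrary) edges, as a histogram plot would.
bins = np.arange(-5, 7, 0.5)
counts1, _ = np.histogram(sample1, bins=bins)
counts2, _ = np.histogram(sample2, bins=bins)

# Testing the bin counts compares the distributions of bar heights.
p_counts = ks_2samp(counts1, counts2).pvalue

# Testing the raw samples compares the distributions of the data.
p_raw = ks_2samp(sample1, sample2).pvalue

print("p-value from bin counts: ", p_counts)
print("p-value from raw samples:", p_raw)
```

With data like this, the p-value from the raw samples should come out far smaller than the one from the bin counts, because two shifted histograms can still contain a very similar set of bar heights.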
Upvotes: 3