Reputation: 1785
I'm trying to run the Wilcoxon test both in R and in Python's scipy.stats package. However, I get different results; can anyone explain why?
My code in R:
> des2
[1] 6.2151308 4.7956451 4.7473738 5.4695828 6.3181463 2.8617239
[7] -0.8105824 3.9456856 4.6735000 4.1067193 5.7656002 2.2237666
[13] 1.0354143 4.9547707 5.3156348 4.8163154 3.4024776 4.2876854
[19] 6.1227500
> wilcox.test(des2, mu=0, conf.int = T)
Wilcoxon signed rank test
data: des2
V = 189, p-value = 7.629e-06
alternative hypothesis: true location is not equal to 0
95 percent confidence interval:
3.485570 5.160925
sample estimates:
(pseudo)median
4.504883
My code in Python:
import numpy as np
from scipy.stats import wilcoxon
test = [6.2151308, 4.7956451, 4.7473738, 5.4695828, 6.3181463, 2.8617239, -0.8105824, 3.9456856, 4.6735000, 4.1067193, 5.7656002, 2.2237666, 1.0354143, 4.9547707, 5.3156348, 4.8163154, 3.4024776, 4.2876854, 6.1227500]
z_statistic, p_value = wilcoxon(np.array(test) - np.log(1.0))  # np.log(1.0) == 0, i.e. a test of mu = 0
print("one-sample wilcoxon-test", p_value)
one-sample wilcoxon-test 0.000155095772796
Even though both p-values are low enough to reject the null hypothesis, they differ by three orders of magnitude, and I can't understand why.
Upvotes: 2
Views: 2796
Reputation: 31399
scipy's implementation always uses a normal approximation when calculating the p-value. While this certainly works for large sample sizes n, the p-value can deviate from the true p-value for small sample sizes.
In the notes of scipy's docs you find:
Because the normal approximation is used for the calculations, the samples used should be large. A typical rule is to require that n > 20.
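To see where scipy's number comes from, here is a minimal sketch of that normal approximation done by hand (assuming the default of no continuity correction); it reproduces the p ≈ 0.000155 from the question:

import numpy as np
from scipy.stats import norm

test = [6.2151308, 4.7956451, 4.7473738, 5.4695828, 6.3181463, 2.8617239,
        -0.8105824, 3.9456856, 4.6735000, 4.1067193, 5.7656002, 2.2237666,
        1.0354143, 4.9547707, 5.3156348, 4.8163154, 3.4024776, 4.2876854,
        6.1227500]

d = np.array(test)
n = len(d)
ranks = np.argsort(np.argsort(np.abs(d))) + 1    # ranks of |d| (no ties here)
T_plus = ranks[d > 0].sum()                      # 189, this is R's V
T_minus = ranks[d < 0].sum()                     # 1
W = min(T_plus, T_minus)                         # scipy's test statistic

mu = n * (n + 1) / 4                             # mean of W under H0
sigma = np.sqrt(n * (n + 1) * (2 * n + 1) / 24)  # sd of W under H0
z = (W - mu) / sigma
p_approx = 2 * norm.sf(abs(z))                   # two-sided normal approximation
print("W =", W, "p =", p_approx)                 # p ~ 0.000155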
R's implementation calculates an exact p-value for small sample sizes and uses the normal approximation only for sufficiently large n.
R's docs tell you:
By default (if exact is not specified), an exact p-value is computed if the samples contain less than 50 finite values and there are no ties. Otherwise, a normal approximation is used.
So in short: when the two p-values differ, R's p-value should be preferred.
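(Newer scipy releases have since added an exact option to wilcoxon, a mode="exact" or method="exact" argument depending on the version, so check the docs of your installed version.) For illustration, here is a minimal sketch that computes the exact null distribution of the signed-rank sum directly, which is essentially what R does for n = 19 with no ties; it reproduces R's p-value of 7.629e-06:

import numpy as np

test = [6.2151308, 4.7956451, 4.7473738, 5.4695828, 6.3181463, 2.8617239,
        -0.8105824, 3.9456856, 4.6735000, 4.1067193, 5.7656002, 2.2237666,
        1.0354143, 4.9547707, 5.3156348, 4.8163154, 3.4024776, 4.2876854,
        6.1227500]

d = np.array(test)
n = len(d)
ranks = np.argsort(np.argsort(np.abs(d))) + 1     # ranks of |d| (no ties here)
V = ranks[d > 0].sum()                            # 189, as reported by R

# Exact null distribution of the positive-rank sum: under H0 each rank
# enters the sum with probability 1/2, so count, for every possible sum,
# how many subsets of {1, ..., n} produce it.
max_sum = n * (n + 1) // 2
counts = np.zeros(max_sum + 1, dtype=np.int64)
counts[0] = 1
for r in range(1, n + 1):
    counts[r:] = counts[r:] + counts[:-r]

p_greater = counts[V:].sum() / 2.0 ** n           # P(V >= 189) = 2 / 2**19
p_exact = min(1.0, 2 * p_greater)                 # two-sided, like wilcox.test
print("exact two-sided p =", p_exact)             # ~7.629e-06, matching R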
Upvotes: 2