Reputation: 1785
I'm trying to run the Wilcoxon test both in R and in Python's scipy.stats package. However, I get different results; can anyone explain why?
My code in R:
> des2
[1] 6.2151308 4.7956451 4.7473738 5.4695828 6.3181463 2.8617239
[7] -0.8105824 3.9456856 4.6735000 4.1067193 5.7656002 2.2237666
[13] 1.0354143 4.9547707 5.3156348 4.8163154 3.4024776 4.2876854
[19] 6.1227500
> wilcox.test(des2, mu=0, conf.int = T)
Wilcoxon signed rank test
data: des2
V = 189, p-value = 7.629e-06
alternative hypothesis: true location is not equal to 0
95 percent confidence interval:
3.485570 5.160925
sample estimates:
(pseudo)median
4.504883
My code in Python:
import numpy as np
from scipy.stats import wilcoxon
test = [6.2151308, 4.7956451, 4.7473738, 5.4695828, 6.3181463, 2.8617239, -0.8105824, 3.9456856, 4.6735000, 4.1067193, 5.7656002, 2.2237666, 1.0354143, 4.9547707, 5.3156348, 4.8163154, 3.4024776, 4.2876854, 6.1227500]
z_statistic, p_value = wilcoxon(np.array(test) - np.log(1.0))  # np.log(1.0) == 0, i.e. a test of mu = 0
print("one-sample wilcoxon-test", p_value)
one-sample wilcoxon-test 0.000155095772796
Even though both p-values are low enough to reject the null hypothesis, they differ by three orders of magnitude, and I can't understand why.
Upvotes: 2
Views: 2796
Reputation: 31399
scipy's implementation always uses a normal approximation when calculating the p-value. While this certainly works for large sample sizes n, the p-value can deviate from the true p-value for small sample sizes.
In the notes of scipy's docs you find:
Because the normal approximation is used for the calculations, the samples used should be large. A typical rule is to require that n > 20.
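To see where scipy's number comes from, here is a minimal sketch of that normal approximation done by hand (assuming the default of no continuity correction); it reproduces the p ≈ 0.000155 from the question:

import numpy as np
from scipy.stats import norm

test = [6.2151308, 4.7956451, 4.7473738, 5.4695828, 6.3181463, 2.8617239,
        -0.8105824, 3.9456856, 4.6735000, 4.1067193, 5.7656002, 2.2237666,
        1.0354143, 4.9547707, 5.3156348, 4.8163154, 3.4024776, 4.2876854,
        6.1227500]

d = np.array(test)
n = len(d)
ranks = np.argsort(np.argsort(np.abs(d))) + 1    # ranks of |d| (no ties here)
T_plus = ranks[d > 0].sum()                      # 189, this is R's V
T_minus = ranks[d < 0].sum()                     # 1
W = min(T_plus, T_minus)                         # scipy's test statistic

mu = n * (n + 1) / 4                             # mean of W under H0
sigma = np.sqrt(n * (n + 1) * (2 * n + 1) / 24)  # sd of W under H0
z = (W - mu) / sigma
p_approx = 2 * norm.sf(abs(z))                   # two-sided normal approximation
print("W =", W, "p =", p_approx)                 # p ~ 0.000155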
R's implementation calculates an exact p-value for small sample sizes and uses the normal approximation only for sufficiently large n.
R's docs tell you:
By default (if exact is not specified), an exact p-value is computed if the samples contain less than 50 finite values and there are no ties. Otherwise, a normal approximation is used.
So in short: when the two p-values differ, R's p-value should be preferred.
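(Newer scipy releases have since added an exact option to wilcoxon, a mode="exact" or method="exact" argument depending on the version, so check the docs of your installed version.) For illustration, here is a minimal sketch that computes the exact null distribution of the signed-rank sum directly, which is essentially what R does for n = 19 with no ties; it reproduces R's p-value of 7.629e-06:

import numpy as np

test = [6.2151308, 4.7956451, 4.7473738, 5.4695828, 6.3181463, 2.8617239,
        -0.8105824, 3.9456856, 4.6735000, 4.1067193, 5.7656002, 2.2237666,
        1.0354143, 4.9547707, 5.3156348, 4.8163154, 3.4024776, 4.2876854,
        6.1227500]

d = np.array(test)
n = len(d)
ranks = np.argsort(np.argsort(np.abs(d))) + 1     # ranks of |d| (no ties here)
V = ranks[d > 0].sum()                            # 189, as reported by R

# Exact null distribution of the positive-rank sum: under H0 each rank
# enters the sum with probability 1/2, so count, for every possible sum,
# how many subsets of {1, ..., n} produce it.
max_sum = n * (n + 1) // 2
counts = np.zeros(max_sum + 1, dtype=np.int64)
counts[0] = 1
for r in range(1, n + 1):
    counts[r:] = counts[r:] + counts[:-r]

p_greater = counts[V:].sum() / 2.0 ** n           # P(V >= 189) = 2 / 2**19
p_exact = min(1.0, 2 * p_greater)                 # two-sided, like wilcox.test
print("exact two-sided p =", p_exact)             # ~7.629e-06, matching R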
Upvotes: 2