Reputation: 555
I am wondering why I get such different results from a Mann-Whitney U test in Python and in R. In Python:
from scipy.stats import mannwhitneyu
t = [1,2,3]
g = [4,5,6,7,8,9]
mannwhitneyu(t,g)
(0.0, 0.014092901073953692)
In R:
t = c(1,2,3)
g = c(4,5,6,7,8,9)
wilcox.test(t,g, paired = FALSE)
Wilcoxon rank sum test
data: t and g
W = 0, p-value = 0.02381
alternative hypothesis: true location shift is not equal to 0
I'm wondering why the Python result looks more like that of a one-sided test.
Upvotes: 4
Views: 2863
Reputation: 11
To align the results of the Mann-Whitney U test between R and Python, specify a two-sided test and exact calculation in Python's mannwhitneyu() function:
mannwhitneyu(t,g, alternative='two-sided', method='exact')
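As a minimal sketch of the call above applied to the data from the question: this assumes SciPy 1.7 or later, since the `method` keyword is not available in older releases. The variable name `res` is only for illustration.

```python
from scipy.stats import mannwhitneyu

# Data from the question
t = [1, 2, 3]
g = [4, 5, 6, 7, 8, 9]

# Assumption: SciPy >= 1.7, where `method` is accepted.
# Two-sided alternative with the exact null distribution of U.
res = mannwhitneyu(t, g, alternative='two-sided', method='exact')

print(res.statistic, res.pvalue)
```

With 3 and 6 observations there are 84 equally likely ways to assign the ranks under the null hypothesis, and only 2 of them are as extreme as the observed one, so the exact two-sided p-value is 2/84, about 0.02381, which agrees with the R output.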
Upvotes: 1
Reputation: 23
According to the note in its documentation, the Mann-Whitney test in SciPy should not be used unless each sample has more than 20 observations, so your Python results are not accurate.
From the link below:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mannwhitneyu.html
"Notes
Use only when the number of observation in each sample is > 20 and you have 2 independent samples of ranks. Mann-Whitney U is significant if the u-obtained is LESS THAN or equal to the critical value of U.
This test corrects for ties and by default uses a continuity correction."
Upvotes: 1
Reputation: 251498
The SciPy version is documented to return a one-sided p-value. (The doc site is down for me at the moment so I can't provide a link, but you can see it if you look at the help for the mannwhitneyu function.) The R function is documented to allow you to specify the sidedness, with two-sided as the default.
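To see where the number in the question comes from, here is a rough sketch that reproduces it by hand. It assumes a normal approximation with a continuity correction and no ties, which is consistent with the value shown in the question; the variable names are just for illustration.

```python
from math import sqrt
from scipy.stats import norm

n1, n2 = 3, 6   # sample sizes of t and g from the question
u = 0           # U statistic reported by both Python and R

# Mean and standard deviation of U under the null hypothesis (no ties)
mean_u = n1 * n2 / 2
sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

# Normal approximation with a 0.5 continuity correction
z = (abs(u - mean_u) - 0.5) / sd_u

p_one_sided = norm.sf(z)       # about 0.0141, the value in the question
p_two_sided = 2 * p_one_sided  # about 0.0282

print(p_one_sided, p_two_sided)
```

Doubling the one-sided value gives roughly 0.028 rather than R's 0.02381, because R uses the exact distribution for samples this small while this sketch uses the normal approximation.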
Upvotes: 8