Yoann Pageaud
Yoann Pageaud

Reputation: 519

How to get tabulated interval of Wilcoxon-Mann-Whitney rank sum test

I was reading this topic on Rbloggers about the use of the Wilcoxon rank sum test: https://www.r-bloggers.com/wilcoxon-mann-whitney-rank-sum-test-or-test-u/

Especially this part, here I quote:

"We can finally compare the intervals tabulated on the tables of Wilcoxon for independent samples. The tabulated interval for two groups of 6 samples each is (26, 52)".

How can I get these "tabulated" values ?
I understand they used a table where the values are reported following the size of each samples, but I was wondering if there was a way to get them in R.

It is important because as I can understand the post, once you have a p-value > 0.05 and so cannot reject the null hypothesis H0, you can actually confirm H0 by comparing "computed" and "tabulated" intervals.

So what I would need is the tabulated intervals, using R.

Upvotes: 0

Views: 774

Answers (1)

Ben Bolker
Ben Bolker

Reputation: 226712

tl;dr

  1. You can get confidence intervals for a Mann-Whitney-Wilcoxon test by specifying conf.int=TRUE.

  2. Don't believe everything you read on the internet ...

    • If by "confirm" you mean "make sure that the computation is true", you don't need to double-check by consulting the original tables; the p-value should be enough to decide whether you can reject H0 or not. You can trust R for standard, widely used statistical methods. (I also show below how to repeat the computation with a different implementation from the coin package, which is a nearly independent check.)
    • if by "confirm" you mean "accept the null hypothesis", please don't do this; this is a fundamental violation of frequentist statistical theory, which says that you can reject a null hypothesis, but that you can never accept the null. Wide confidence intervals and p-values greater than a given threshold are evidence that the conclusion is uncertain (we can't be sure whether the null or the alternative is true), not that the null is true. The concluding text of the blog post referred to ("we conclude by accepting the hypothesis H0 of equality of means") is statistically incorrect.

A better way to interpret the uncertainty is to look at the confidence intervals. You can compute these for the Wilcoxon test: from ?wilcox.test:

... (if argument ‘conf.int’ is true [and a two-sample test is being performed]), a nonparametric confidence interval and an estimator for ... the difference of the location parameters ‘x-y’ is computed.

> a = c(6, 8, 2, 4, 4, 5)
> b = c(7, 10, 4, 3, 5, 6)
> wilcox.test(b,a, conf.int=TRUE, correct=FALSE)
data:  b and a
W = 22, p-value = 0.5174
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
 -1.999975  4.000016
sample estimates:
difference in location 
             0.9999395 

The high p-value (0.5174) says that we really can't tell whether the values in a or b have signicantly different ranks. The difference in location gives us the estimated difference between the median ranks, and the confidence interval gives the confidence interval on this difference. In this case, for a sample size of 12, the estimated difference in ranks is 1 (group b has slightly higher ranks than group a), and the confidence interval is (-2, 4) (the data are consistent with group b having slightly lower or much higher ranks than group a). It is admittedly rather difficult to interpret the substantive meaning of these values - that's one of the disadvantages of rank-based nonparametric tests ...

You can assume that the p-value computed by wilcox.test() is a reasonable summary of the evidence against the null hypothesis; there's no need to look up ranges in the tables. If you're worried about wilcox.test() in base R, you can try wilcox_test() from the coin package:

dd <- data.frame(f=rep(c("a","b"),each=6),x=c(a,b))
wilcox_test(x~f,data=dd,conf.int=TRUE) ## asymptotic test

which gives nearly identical results to wilcox.test(), and

 wilcox_test(x~f,data=dd,conf.int=TRUE, distribution="exact")

which gives a slightly different p-value, but essentially the same confidence intervals.

of historical interest only

As for the tables: I found them on Google books, by doing a Google Scholar search with author:katti author:wilcox. There you can read the description of how they were computed; this wouldn't be impossible to replicate, but it seems unnecessary since p-values and confidence intervals are available via other methods. Digging through you find this:

enter image description here

The number 0.0206 in the red box indicates that the interval (26,52) corresponds to a one-tail p-value of 0.0206 (2-tailed = 0.0412); that's the closest you can get with a discrete range. The next closest range is given in the line below [(27,51), one-tailed p=0.0325, two-tailed=0.065]. In the 21st century you should never have to do this procedure.

Upvotes: 3

Related Questions