Julia

Reputation: 21

Same p-value error when using group_by to conduct one-sample Wilcoxon tests in R

I am working on a large dataset of offspring sex ratios from more than 36,000 individuals of over 1,000 species. I want to test whether the median sex ratio of each species differs significantly from 0.5, and I am using a one-sample Wilcoxon test to do this. Here is an example dataset:

n<-100
dat<-data.frame(species=rep(LETTERS[1:5],n/5), SR=sample((1:100)/100,n,replace=TRUE))

When I run the following code, I get results where all p-values are the same.

library(dplyr)
res <- dat %>% group_by(species) %>%
do(w=wilcox.test(dat$SR,mu=.5,alternative=("two.sided"))) %>%
summarize(species,wilcox=w$p.value)
res
#OUTPUT#
# # A tibble: 5 x 2
  species wilcox
  <chr>    <dbl>
1 A        0.465
2 B        0.465
3 C        0.465
4 D        0.465
5 E        0.465

Any idea what I'm doing wrong and how I can fix this?

Upvotes: 2

Views: 131

Answers (1)

Noah

Reputation: 440

The function do() is superseded and should not be used anymore. You can do the same within summarize() with across().
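As for why every p-value in the question is identical: `dat$SR` always refers to the full, ungrouped column, so each group runs the test on all 100 values. Inside do() the current group is available as `.`, so `.$SR` would be the minimal fix there. Here is a small sketch of the effect using the example data from the question; the `set.seed()` call and the column names `n_whole` and `n_group` are my additions for illustration only.

```r
library(dplyr)

set.seed(1)  # assumption: added only so the example is reproducible
n <- 100
dat <- data.frame(species = rep(LETTERS[1:5], n / 5),
                  SR = sample((1:100) / 100, n, replace = TRUE))

# dat$SR bypasses the grouping, while the bare column name SR respects it
counts <- dat %>%
  group_by(species) %>%
  summarize(n_whole = length(dat$SR),  # all 100 values, for every species
            n_group = length(SR))      # only the 20 values of that species
counts
```

Every row shows `n_whole` equal to 100 and `n_group` equal to 20, which is why the test result never changes between species in the original code.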

First, group by species. Then use across() within summarize() to access the values for each group, run the Wilcoxon test on them, and extract the p-value directly with $p.value at the end of the expression.

Note that I set exact = FALSE to skip the calculation of exact p-values. With fewer than 50 values per group, wilcox.test() tries to compute an exact p-value by default and issues a warning when it cannot, for example because of ties. For your real data you can drop this argument for groups with larger samples. For more information see this information.

n<-100
dat<-data.frame(species=rep(LETTERS[1:5],n/5), SR=sample((1:100)/100,n,replace=TRUE))

library(dplyr)

dat %>% 
  group_by(species) %>%
  summarize(wilcox = across(SR, 
                            ~wilcox.test(., 
                                         mu=.5, 
                                         alternative=("two.sided"),
                                         exact = FALSE)$p.value)$SR)
#> # A tibble: 5 × 2
#>   species wilcox
#>   <chr>    <dbl>
#> 1 A       0.737 
#> 2 B       0.0105
#> 3 C       0.751 
#> 4 D       0.380 
#> 5 E       0.614

Created on 2022-08-19 with reprex v2.0.2
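For a single column, across() is not strictly needed: summarize() evaluates bare column names per group, so the test can be called on SR directly. A minimal sketch under the same assumptions as above (example data from the question; `set.seed()` is my addition for reproducibility, and `res` is just an illustrative name):

```r
library(dplyr)

set.seed(1)  # assumption: added only so the example is reproducible
n <- 100
dat <- data.frame(species = rep(LETTERS[1:5], n / 5),
                  SR = sample((1:100) / 100, n, replace = TRUE))

# One Wilcoxon signed-rank test per species, keeping only the p-value
res <- dat %>%
  group_by(species) %>%
  summarize(wilcox = wilcox.test(SR,
                                 mu = 0.5,
                                 alternative = "two.sided",
                                 exact = FALSE)$p.value)
res
```

This returns one row per species with a plain numeric `wilcox` column, so no trailing `$SR` is needed to unpack the result.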

Upvotes: 0
