Cobactan
Cobactan

Reputation: 113

How can I perform multiple pairwise t.test in R using the same reference vector?

Let's consider the following vectors in the dataframe:

ctrl <- rnorm(50)
x1 <- rnorm(30, mean=0.2)
x2 <- rnorm(100,mean=0.1)
x3 <- rnorm(100,mean=0.4)

x <- data.frame(data=c(ctrl,x1,x2,x3),
            Group=c(
              rep("ctrl", length(ctrl)),
              rep("x1", length(x1)),
              rep("x2", length(x2)),
              rep("x3", length(x3))) )

I know I could use

pairwise.t.test(x$data,
            x$Group,
            pool.sd=FALSE)

to get pairwise comparison like

 Pairwise comparisons using t tests with non-pooled SD 

data:  x$data and x$Group 

   ctrl    x1      x2     
x1 0.08522 -       -      
x2 0.99678 0.10469 -      
x3 0.00065 0.99678 2.8e-05

P value adjustment method: holm 

However I am not interested in every possible combination of vectors. I am seeking a way to compare ctrl vector with every other vectors, and to take into account alpha inflation. I'd like to avoid

t.test((x$data[x$Group=='ctrl']), (x$data[x$Group=='x1']), var.equal=T)
t.test((x$data[x$Group=='ctrl']), (x$data[x$Group=='x2']), var.equal=T)
t.test((x$data[x$Group=='ctrl']), (x$data[x$Group=='x3']), var.equal=T)

And then perform manual correction for multiple comparisons. What would be the best way to do so ?

Upvotes: 1

Views: 1878

Answers (2)

AntoniosK
AntoniosK

Reputation: 16121

@BondedDust 's answer looks great. I provide a bit more complicated solution if you really need to work with dataframes.

library(dplyr)

ctrl <- rnorm(50)
x1 <- rnorm(30, mean=0.2)
x2 <- rnorm(100,mean=0.1)
x3 <- rnorm(100,mean=0.4)

x <- data.frame(data=c(ctrl,x1,x2,x3),
                Group=c(
                  rep("ctrl", length(ctrl)),
                  rep("x1", length(x1)),
                  rep("x2", length(x2)),
                  rep("x3", length(x3))), stringsAsFactors = F )

# provide the combinations you want
# set1 with all from set2
set1 = c("ctrl")
set2 = c("x1","x2","x3")

dt_res =
      data.frame(expand.grid(set1,set2)) %>%  # create combinations 
        mutate(test_id = row_number()) %>%    # create a test id
        group_by(test_id) %>%              # group by test id, so everything from now on is performed for each test separately
        do({x_temp = x[(x$Group==.$Var1 | x$Group==.$Var2),]    # for each test id keep groups of interest
            x_temp = data.frame(x_temp)}) %>%
        do(test = t.test(data~Group, data=.))     # perform the test and save it

# you create a dataset that has the test id and a column with t.tests results as elements
dt_res

# Source: local data frame [3 x 2]
# Groups: <by row>
#   
#     test_id       test
#   1       1 <S3:htest>
#   2       2 <S3:htest>
#   3       3 <S3:htest>


# get all tests as a list
dt_res$test

# [[1]]
# 
# Welch Two Sample t-test
# 
# data:  data by Group
# t = -1.9776, df = 58.36, p-value = 0.05271
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
#   -0.894829477  0.005371207
# sample estimates:
#   mean in group ctrl   mean in group x1 
# -0.447213560       -0.002484425 
# 
# 
# [[2]]
# 
# Welch Two Sample t-test
# 
# data:  data by Group
# t = -2.3549, df = 100.68, p-value = 0.02047
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
#   -0.71174095 -0.06087081
# sample estimates:
#   mean in group ctrl   mean in group x2 
# -0.44721356        -0.06090768 
# 
# 
# [[3]]
# 
# Welch Two Sample t-test
# 
# data:  data by Group
# t = -5.4235, df = 101.12, p-value = 4.001e-07
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
#   -1.2171386 -0.5652189
# sample estimates:
#   mean in group ctrl   mean in group x3 
# -0.4472136          0.4439652

PS : It's always interesting to work with p-values and alpha corrections. It's a bit of a philosophical issue how to approach that and some people agree and other disagree. Personally, I tend to correct alpha based on all possible comparison I can do after an experiment, because you never know when you'll come back to investigate other pairs. Imagine what happens if in the future people decide that you have to go back and compare the winning group (let's say x1) with x2 and x3. You'll focus on those pairs and you'll again correct alpha based on those compariosns. But on the whole you performed all possible comparisons, apart from x2 vs x3! You may write your reports or publish findings that should have been a bit more strict on the alpha correction.

Upvotes: 0

IRTFM
IRTFM

Reputation: 263481

You can use p.adjust to get a Bonferroni adjustment to multiple p-values. You should not bundle thos unequal length vectors inot t adataframe but rather use a list.

ctrl <- rnorm(50)
x1 <- rnorm(30, mean=0.2)
x2 <- rnorm(100,mean=0.1)
x3 <- rnorm(100,mean=0.4)

> lapply( list(x1,x2,x3), function(x) t.test(x,ctrl)$p.value)
[[1]]
[1] 0.2464039

[[2]]
[1] 0.8576423

[[3]]
[1] 0.0144275

> p.adjust( .Last.value)
[1] 0.4928077 0.8576423 0.0432825

Upvotes: 2

Related Questions