Reputation: 113
Let's consider the following vectors in the dataframe:
ctrl <- rnorm(50)
x1 <- rnorm(30, mean=0.2)
x2 <- rnorm(100,mean=0.1)
x3 <- rnorm(100,mean=0.4)
x <- data.frame(data=c(ctrl,x1,x2,x3),
Group=c(
rep("ctrl", length(ctrl)),
rep("x1", length(x1)),
rep("x2", length(x2)),
rep("x3", length(x3))) )
I know I could use
pairwise.t.test(x$data,
x$Group,
pool.sd=FALSE)
to get pairwise comparison like
Pairwise comparisons using t tests with non-pooled SD
data: x$data and x$Group
ctrl x1 x2
x1 0.08522 - -
x2 0.99678 0.10469 -
x3 0.00065 0.99678 2.8e-05
P value adjustment method: holm
However I am not interested in every possible combination of vectors. I am seeking a way to compare ctrl vector with every other vectors, and to take into account alpha inflation. I'd like to avoid
t.test((x$data[x$Group=='ctrl']), (x$data[x$Group=='x1']), var.equal=T)
t.test((x$data[x$Group=='ctrl']), (x$data[x$Group=='x2']), var.equal=T)
t.test((x$data[x$Group=='ctrl']), (x$data[x$Group=='x3']), var.equal=T)
And then perform manual correction for multiple comparisons. What would be the best way to do so ?
Upvotes: 1
Views: 1878
Reputation: 16121
@BondedDust 's answer looks great. I provide a bit more complicated solution if you really need to work with dataframes.
library(dplyr)
ctrl <- rnorm(50)
x1 <- rnorm(30, mean=0.2)
x2 <- rnorm(100,mean=0.1)
x3 <- rnorm(100,mean=0.4)
x <- data.frame(data=c(ctrl,x1,x2,x3),
Group=c(
rep("ctrl", length(ctrl)),
rep("x1", length(x1)),
rep("x2", length(x2)),
rep("x3", length(x3))), stringsAsFactors = F )
# provide the combinations you want
# set1 with all from set2
set1 = c("ctrl")
set2 = c("x1","x2","x3")
dt_res =
data.frame(expand.grid(set1,set2)) %>% # create combinations
mutate(test_id = row_number()) %>% # create a test id
group_by(test_id) %>% # group by test id, so everything from now on is performed for each test separately
do({x_temp = x[(x$Group==.$Var1 | x$Group==.$Var2),] # for each test id keep groups of interest
x_temp = data.frame(x_temp)}) %>%
do(test = t.test(data~Group, data=.)) # perform the test and save it
# you create a dataset that has the test id and a column with t.tests results as elements
dt_res
# Source: local data frame [3 x 2]
# Groups: <by row>
#
# test_id test
# 1 1 <S3:htest>
# 2 2 <S3:htest>
# 3 3 <S3:htest>
# get all tests as a list
dt_res$test
# [[1]]
#
# Welch Two Sample t-test
#
# data: data by Group
# t = -1.9776, df = 58.36, p-value = 0.05271
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
# -0.894829477 0.005371207
# sample estimates:
# mean in group ctrl mean in group x1
# -0.447213560 -0.002484425
#
#
# [[2]]
#
# Welch Two Sample t-test
#
# data: data by Group
# t = -2.3549, df = 100.68, p-value = 0.02047
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
# -0.71174095 -0.06087081
# sample estimates:
# mean in group ctrl mean in group x2
# -0.44721356 -0.06090768
#
#
# [[3]]
#
# Welch Two Sample t-test
#
# data: data by Group
# t = -5.4235, df = 101.12, p-value = 4.001e-07
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
# -1.2171386 -0.5652189
# sample estimates:
# mean in group ctrl mean in group x3
# -0.4472136 0.4439652
PS : It's always interesting to work with p-values and alpha corrections. It's a bit of a philosophical issue how to approach that and some people agree and other disagree. Personally, I tend to correct alpha based on all possible comparison I can do after an experiment, because you never know when you'll come back to investigate other pairs. Imagine what happens if in the future people decide that you have to go back and compare the winning group (let's say x1) with x2 and x3. You'll focus on those pairs and you'll again correct alpha based on those compariosns. But on the whole you performed all possible comparisons, apart from x2 vs x3! You may write your reports or publish findings that should have been a bit more strict on the alpha correction.
Upvotes: 0
Reputation: 263481
You can use p.adjust to get a Bonferroni adjustment to multiple p-values. You should not bundle thos unequal length vectors inot t adataframe but rather use a list.
ctrl <- rnorm(50)
x1 <- rnorm(30, mean=0.2)
x2 <- rnorm(100,mean=0.1)
x3 <- rnorm(100,mean=0.4)
> lapply( list(x1,x2,x3), function(x) t.test(x,ctrl)$p.value)
[[1]]
[1] 0.2464039
[[2]]
[1] 0.8576423
[[3]]
[1] 0.0144275
> p.adjust( .Last.value)
[1] 0.4928077 0.8576423 0.0432825
Upvotes: 2