Heisenberg

Reputation: 8806

How to apply the same command to a list of variables

I want to run t-tests on a bunch of variables. Below is some mock data:

d <- data.frame(var1=rnorm(10), 
                var2=rnorm(10), 
                group=sample(c(0,1), 10, replace=TRUE))

# Is there a way to do this in some sort of loop?
with(d, t.test(var1~group))
with(d, t.test(var2~group))

# I tried this but the loop did not give a result!?
varnames <- c('var1', 'var2')
for (i in 1:2) {
  eval(substitute(with(d, t.test(variable~group)),
                  list(variable=as.name(varnames[i]))))  
}

Also, is it possible to extract values from each t-test result (e.g. the two group means and the p-value) so that the loop produces a neat balance table across the variables? In other words, the end result I want is not a series of t-test printouts one after another, but a table like this:

Varname   mean1   mean2   p-value
Var1        1.1    1.2     0.989
Var2        1.2    1.3     0.912

Upvotes: 4

Views: 1754

Answers (3)

dickoa

Reputation: 18437

You can build the formulas as strings and run t.test over them with lapply, like this:

set.seed(1)
d <- data.frame(var1 = rnorm(10), 
                var2 = rnorm(10), 
                group = sample(c(0, 1), 10, replace = TRUE))


varnames <- c("var1", "var2")
formulas <- paste(varnames, "group", sep = " ~ ")
res <- lapply(formulas, function(f) t.test(as.formula(f), data = d))
names(res) <- varnames

If you want to extract your table, pull the estimates and p-value out of each result like this:

t(sapply(res, function(x) c(x$estimate, pval = x$p.value)))
#      mean in group 0 mean in group 1     pval
# var1         0.61288        0.012034 0.098055
# var2         0.46382        0.195100 0.702365
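
As for why the loop in the question showed nothing: results are not auto-printed inside a for loop, so each one has to be printed or stored explicitly. A minimal sketch, assuming the same variable names as above; the group column is fixed with rep() here (my assumption, not part of the original data) so that both groups always have enough observations, and reformulate() is swapped in for paste() plus as.formula():

```r
set.seed(1)
d <- data.frame(var1 = rnorm(10),
                var2 = rnorm(10),
                group = rep(c(0, 1), each = 5))  # assumption: fixed, balanced groups

varnames <- c("var1", "var2")
res <- vector("list", length(varnames))
names(res) <- varnames

for (v in varnames) {
  # reformulate() builds the formula <v> ~ group from character strings
  res[[v]] <- t.test(reformulate("group", response = v), data = d)
  print(res[[v]])  # explicit print(): nothing is auto-printed inside a loop
}
```

Storing the results in a list, rather than only printing them, is what makes the table extraction above possible afterwards.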

Upvotes: 6

shadow

Reputation: 22293

Use sapply to apply the t-test to every name in varnames, and extract the necessary data by subsetting the result with "estimate" and "p.value". Check names(with(d, t.test(var1~group))) to see which other components you can extract.

t(with(d, sapply(varnames, function(x) unlist(t.test(get(x)~group)[c("estimate", "p.value")]))))
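
To illustrate pulling out other components, here is a rough sketch that collects the test statistic, degrees of freedom and confidence interval instead. The column names t, df, ci_low and ci_high are my own labels, and the group column is fixed with rep() (an assumption) so that both groups are populated:

```r
set.seed(1)
d <- data.frame(var1 = rnorm(10),
                var2 = rnorm(10),
                group = rep(c(0, 1), each = 5))  # assumption: fixed, balanced groups
varnames <- c("var1", "var2")

# Components available on the object returned by t.test()
names(t.test(var1 ~ group, data = d))

# One row per variable; vapply() enforces four numeric values per test
extra <- t(vapply(varnames, function(v) {
  tt <- t.test(d[[v]] ~ d$group)
  c(t       = unname(tt$statistic),
    df      = unname(tt$parameter),
    ci_low  = tt$conf.int[1],
    ci_high = tt$conf.int[2])
}, c(t = 0, df = 0, ci_low = 0, ci_high = 0)))

extra
```

vapply is used here rather than sapply only because it fails loudly if a test returns something of unexpected shape; sapply would work the same way on this data.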

Upvotes: 0

EDi

Reputation: 13280

Here is a reshape2/plyr solution. The foo function is the workhorse: it runs the t-test and extracts the means and the p-value.

d <- data.frame(var1=rnorm(10), 
                var2=rnorm(10), 
                group=sample(c(0,1), 10, replace=TRUE))

library(reshape2)  # melt()
library(plyr)      # ddply()

dfm <- melt(d, id = 'group')

foo <- function(x) {
  tt <- t.test(value ~ group, data = x)
  out <- data.frame(mean1 = tt$estimate[1], mean2 = tt$estimate[2], P = tt$p.value)
  return(out)
}

ddply(dfm, .(variable), .fun=foo)
#  variable      mean1      mean2         P
#1     var1 -0.2641942  0.3716034 0.4049852
#2     var2 -0.9186919 -0.2749101 0.5949053
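
If the two packages are not available, roughly the same split-apply-combine idea can be sketched in base R. This is only an approximation of the approach above, not a drop-in replacement: stack() stands in for melt() and names its columns values and ind, and the group column is fixed with rep() (an assumption) so that both groups are populated:

```r
set.seed(1)
d <- data.frame(var1 = rnorm(10),
                var2 = rnorm(10),
                group = rep(c(0, 1), each = 5))  # assumption: fixed, balanced groups

# Long format: stack() puts var1's values first, then var2's,
# so group is repeated once per stacked variable
long <- data.frame(group = rep(d$group, times = 2),
                   stack(d[c("var1", "var2")]))

# One t-test per variable, each returning a one-row data frame
rows <- lapply(split(long, long$ind), function(x) {
  tt <- t.test(values ~ group, data = x)
  data.frame(mean1 = unname(tt$estimate[1]),
             mean2 = unname(tt$estimate[2]),
             P     = tt$p.value)
})

balance <- cbind(variable = names(rows), do.call(rbind, rows))
rownames(balance) <- NULL
balance
```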

Upvotes: 3
