mapply with arguments as two lists (one formulas, one vectors)

Question

I'm trying to use mapply to apply t.test over two lists of arguments. The first list, formulas, contains three formulas; the second list, periods, contains three logical vectors that subset my.data, which I pass through the MoreArgs argument.

I can run the t.tests manually with a for loop (shown below), but I can't figure out why my mapply call fails. Is this not an appropriate use of mapply?

# similar data
my.data <- data.frame(CAR1=rnorm(150),
                      CAR2=rnorm(150),
                      CAR3=rnorm(150),
                      period=rep(1:3, each=50),
                      treated=rep(1:2, times=75)
                      )

# two lists to pass as arguments to `t.test()`
# `subset`
periods <- list(my.data$period == 1,
                my.data$period <= 2,
                my.data$period <= 3
                )
# `formula`
formulas <- list(CAR1 ~ treated,
                 CAR2 ~ treated,
                 CAR3 ~ treated
                 )

# manual solution works
ttests <- list()
for (i in 1:3) {
    ttests[[i]] <- t.test(formulas[[i]], 
                          data=my.data, 
                          subset=periods[[i]]
                          )
}

# but `mapply` fails
ttest <- mapply(FUN=t.test, 
                formula=formulas, 
                subset=periods, 
                MoreArgs=list(data=my.data),
                SIMPLIFY=FALSE
                )

# with error "Error in eval(expr, envir, enclos) : object 'dots' not found"

Roman Luštrik · Accepted Answer

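If you split your data.frame by period, you don't need the periods object. Note that split gives one data frame per period value (period == 1, period == 2, period == 3), so it reproduces your periods list only when each subset is a single period.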

split.my.data <- split(my.data, f = my.data$period)

mapply(FUN = function(x, y) {
  t.test(x, data = y)  
}, x = formulas, y = split.my.data, SIMPLIFY = FALSE)

[[1]]

    Welch Two Sample t-test

data:  CAR1 by treated
t = -0.7051, df = 44.861, p-value = 0.4844
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.9277752  0.4466579
sample estimates:
mean in group 1 mean in group 2 
      0.1650074       0.4055661 


[[2]]
... # output truncated

EDIT

If you want to subset the rows with a condition other than == (such as the <= comparisons in your periods list), I would build the list of data frames directly from the logical vectors, like so.

# one data frame per logical vector in `periods`
split.my.data <- lapply(periods, function(x) my.data[x, ])
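
Putting the pieces together, here is a minimal end-to-end sketch that subsets the data first and then pairs each formula with its data frame. The most likely reason the original call fails is that the formula method of t.test passes subset on unevaluated, so what ends up being evaluated is mapply's internal expression (which refers to dots) rather than your logical vector; subsetting before calling t.test sidesteps that. The names subsets, ttests.mapply, ttests.loop, p.mapply and p.loop, as well as the set.seed call, are my own additions for illustration and are not part of the question.

```r
set.seed(1)  # assumption: fixed seed, only so repeated runs give the same numbers

# same data layout as in the question
my.data <- data.frame(CAR1 = rnorm(150),
                      CAR2 = rnorm(150),
                      CAR3 = rnorm(150),
                      period = rep(1:3, each = 50),
                      treated = rep(1:2, times = 75))

periods <- list(my.data$period == 1,
                my.data$period <= 2,
                my.data$period <= 3)

formulas <- list(CAR1 ~ treated,
                 CAR2 ~ treated,
                 CAR3 ~ treated)

# subset up front: one data frame per logical vector
subsets <- lapply(periods, function(rows) my.data[rows, ])

# pair the i-th formula with the i-th data frame
ttests.mapply <- mapply(function(f, d) t.test(f, data = d),
                        formulas, subsets,
                        SIMPLIFY = FALSE)

# reference: the for loop from the question
ttests.loop <- list()
for (i in 1:3) {
    ttests.loop[[i]] <- t.test(formulas[[i]],
                               data = my.data,
                               subset = periods[[i]])
}

# each element is an htest object, so statistics can be pulled out by name
p.mapply <- sapply(ttests.mapply, function(tt) tt$p.value)
p.loop   <- sapply(ttests.loop, function(tt) tt$p.value)
all.equal(p.mapply, p.loop)
```

The final comparison is a sanity check that the mapply version and the loop test the same rows. Wrapping t.test in an anonymous function also keeps the data argument explicit, which is easier to debug than relying on MoreArgs.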
