Richard Herron

Reputation: 10112

mapply with arguments as two lists (one formulas, one vectors)

I'm trying to use mapply to apply t.test over two lists of arguments. The first list, formulas, contains three formulas; the second list, periods, contains three logical vectors that select rows of my.data. I pass my.data itself through the MoreArgs argument.

I can do the t.tests manually with a for loop (also below), but I can't figure out why my mapply use fails. Is this not the correct time to use mapply?

# similar data
my.data <- data.frame(CAR1=rnorm(150),
                      CAR2=rnorm(150),
                      CAR3=rnorm(150),
                      period=rep(1:3, each=50),
                      treated=rep(1:2, times=75)
                      )

# two lists to pass as arguments to `t.test()`
# `subset`
periods <- list(my.data$period == 1,
                my.data$period <= 2,
                my.data$period <= 3
                )
# `formula`
formulas <- list(CAR1 ~ treated,
                 CAR2 ~ treated,
                 CAR3 ~ treated
                 )

# manual solution works
ttests <- list()
for (i in 1:3) {
    ttests[[i]] <- t.test(formulas[[i]], 
                          data=my.data, 
                          subset=periods[[i]]
                          )
}

# but `mapply` fails
ttest <- mapply(FUN=t.test, 
                formula=formulas, 
                subset=periods, 
                MoreArgs=list(data=my.data),
                SIMPLIFY=FALSE
                )

# with error "Error in eval(expr, envir, enclos) : object 'dots' not found"

Upvotes: 0

Views: 257

Answers (1)

Roman Luštrik

Reputation: 70653

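If you split your data.frame according to period, you don't need the periods object. Note that this tests each period on its own, whereas the periods list in the question uses cumulative subsets (period <= 2, period <= 3).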

split.my.data <- split(my.data, f = my.data$period)

mapply(FUN = function(x, y) {
  t.test(x, data = y)  
}, x = formulas, y = split.my.data, SIMPLIFY = FALSE)

[[1]]

    Welch Two Sample t-test

data:  CAR1 by treated
t = -0.7051, df = 44.861, p-value = 0.4844
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.9277752  0.4466579
sample estimates:
mean in group 1 mean in group 2 
      0.1650074       0.4055661 


[[2]]
... # output truncated

EDIT

If you want to select rows with a logical condition other than == (such as the <= conditions in your periods list), I would create a "split list" from periods like so.

split.my.data <- sapply(periods, FUN = function(x, my.data) my.data[x, ], 
       my.data = my.data, simplify = FALSE)
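
Putting the pieces together, here is a minimal sketch of the whole thing using the cumulative subsets from the question. The idea is to subset the rows up front so that t.test() never receives a subset argument. My guess, which I have not traced fully, is that the original call fails because the formula method evaluates subset non-standardly and ends up looking for mapply's internal dots placeholder instead of the actual vector. The set.seed() call and the names f, d, rows and ttests.loop are my additions for illustration only.

```r
set.seed(1)  # assumption: added only to make the sketch reproducible

# same structure as the data in the question
my.data <- data.frame(CAR1 = rnorm(150),
                      CAR2 = rnorm(150),
                      CAR3 = rnorm(150),
                      period = rep(1:3, each = 50),
                      treated = rep(1:2, times = 75))

periods <- list(my.data$period == 1,
                my.data$period <= 2,
                my.data$period <= 3)
formulas <- list(CAR1 ~ treated, CAR2 ~ treated, CAR3 ~ treated)

# one data frame per logical vector, so no `subset` argument is needed
split.my.data <- lapply(periods, function(rows) my.data[rows, ])

# pair each formula with its pre-subsetted data frame
ttests <- mapply(FUN = function(f, d) t.test(f, data = d),
                 f = formulas, d = split.my.data, SIMPLIFY = FALSE)

# the explicit loop from the question, kept for comparison
ttests.loop <- vector("list", 3)
for (i in 1:3) {
  ttests.loop[[i]] <- t.test(formulas[[i]], data = my.data,
                             subset = periods[[i]])
}
```

Both ttests and ttests.loop should then hold the same three tests; comparing the p.value elements is a quick way to check.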

Upvotes: 1
