dlply syntax with t.test

Question

I clearly still don't understand plyr syntax, as illustrated below. Can someone help me see what I'm missing?

The following code works fine, as expected:

library(plyr)  # provides dlply() and .()

# make a data frame to use dlply on
f <- as.factor(c(rep("a", 3), rep("b", 3)))
y <- rnorm(6)
df <- data.frame(f=f, y=y)

# split the data frame by the factor and perform t-tests
l <- dlply(df, .(f), function(d) t.test(y, mu=0))

However, the following causes an error:

l_bad <- dlply(df, .(f), t.test, .mu=0)
Error in if (stderr < 10 * .Machine$double.eps * abs(mx)) stop("data are essentially constant") : missing value where TRUE/FALSE needed

This looks a bit as if R is trying to perform a t.test on the factor. Why would that be? Many thanks.

joran · Accepted Answer

dlply splits df into several data frames. That means that whatever function you hand off to dlply must expect a data frame as input. t.test expects a vector as its first argument.

Your anonymous function in dlply declares d as its only argument. But then in your call to t.test you pass only y. R doesn't automatically know to look in the data frame d for y. So instead it's probably finding the y that you defined in the global environment.
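One way to check this, as a rough sketch: if the anonymous function really is picking up the global y, every group gets tested on the same six values, so the results for "a" and "b" come out identical. The seed below is arbitrary and only there for reproducibility.

```r
library(plyr)

set.seed(42)  # arbitrary seed, not part of the original question
f <- as.factor(c(rep("a", 3), rep("b", 3)))
y <- rnorm(6)
df <- data.frame(f = f, y = y)

# y is not looked up in d here; it resolves to the y defined above
l <- dlply(df, .(f), function(d) t.test(y, mu = 0))

# both tests ran on all six values, so they are the same test
identical(l$a$statistic, l$b$statistic)

# degrees of freedom reflect six observations, not three per group
unname(l$a$parameter)
```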

Simply changing that to t.test(d$y,mu = 0) in your first example should make it work.
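For completeness, here is a minimal sketch of the corrected call on data shaped like yours (the seed is arbitrary and only there for reproducibility):

```r
library(plyr)

set.seed(42)  # arbitrary seed, not part of the original question
df <- data.frame(f = factor(rep(c("a", "b"), each = 3)), y = rnorm(6))

# d is the per-group data frame, so d$y holds only that group's values
l <- dlply(df, .(f), function(d) t.test(d$y, mu = 0))

# each test now uses three observations, i.e. 2 degrees of freedom
sapply(l, function(res) unname(res$parameter))
```

The result is a list with one htest object per level of f, so l$a and l$b can be inspected separately.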

The form used in your second example, where you pass the function by name followed by extra arguments, will only work if the function to be applied expects a data frame as its first argument (see, e.g., summarise or transform).
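As an illustration of that form with a function that does take a data frame, here is a small sketch using summarise. The column name mean_y is just something I made up for the example.

```r
library(plyr)

set.seed(42)  # arbitrary seed, not part of the original question
df <- data.frame(f = factor(rep(c("a", "b"), each = 3)), y = rnorm(6))

# summarise receives each group's data frame as its first argument;
# mean_y = mean(y) is passed along and evaluated inside that data frame
means <- dlply(df, .(f), summarise, mean_y = mean(y))

means$a  # one-row data frame holding the mean of group "a"
```

Here y is found inside each piece because summarise evaluates its arguments in the data frame it was given, which t.test does not do for a bare vector argument.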
