R function applied on data frame grouped by multiple factors

Question

I have a data frame called subdata with dimensions 10299 x 81. Column 1 is called "Subject" and column 2 is called "Activity". I want to calculate the mean of every other column, grouped by "Subject" and "Activity".

Here are the functions I tried; none of them has worked so far. In the end I used colwise(mean), which seems to work. I am new to R and have only just learned the sapply, lapply, and tapply functions, and it seems the mean function works on columns.

Can anyone help explain what these error and warning messages mean, and whether there is a way to make these functions work?

Using lapply:

newdata<- subdata[, lapply(.SD, mean), by = c("Subject","Activity")]

The error message:

Error in `[.data.frame`(subdata, , lapply(.SD, mean), by = c("Subject",  : 
unused argument (by = c("Subject", "Activity"))
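
A note on this error: the `x[, j, by = ...]` form with `.SD` is data.table syntax. The `[` method for a plain data.frame has no `by` argument, which is what "unused argument" is reporting. Below is a minimal sketch of the same call on a data.table, assuming the data.table package is installed. The data frame `toy` and its columns `V1` and `V2` are made-up stand-ins for subdata and its measurement columns.

```r
library(data.table)

# Hypothetical stand-in for subdata: two grouping columns plus numeric measurements
toy <- data.frame(
  Subject  = c(1, 1, 1, 2, 2, 2),
  Activity = c("WALK", "WALK", "SIT", "WALK", "SIT", "SIT"),
  V1       = c(1, 3, 5, 7, 9, 11),
  V2       = c(2, 4, 6, 8, 10, 12)
)

# Convert to a data.table so that `[` understands `by` and `.SD`
toy_dt <- as.data.table(toy)

# .SD holds every column except the grouping columns;
# lapply() computes the mean of each of them within each group
dt_means <- toy_dt[, lapply(.SD, mean), by = c("Subject", "Activity")]
print(dt_means)
```

With the real data, the same idea would be `as.data.table(subdata)` followed by the original call, provided all 79 remaining columns are numeric.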

Using the by function:

newdata<-by(subdata, list(subdata$Subject, subdata$Activity), mean)

I got this warning message:

Warning messages:
1: In mean.default(data[x, , drop = FALSE], ...) :
   argument is not numeric or logical: returning NA
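
A note on this warning: by() passes each group's rows to the function as a whole data frame, and mean() does not know how to average a data frame, so it returns NA with this warning. A function that accepts a data frame, such as colMeans(), avoids the problem. Here is a rough base R sketch; `toy`, `V1`, and `V2` are made-up stand-ins for subdata and its measurement columns.

```r
# Hypothetical stand-in for subdata: two grouping columns plus numeric measurements
toy <- data.frame(
  Subject  = c(1, 1, 1, 2, 2, 2),
  Activity = c("WALK", "WALK", "SIT", "WALK", "SIT", "SIT"),
  V1       = c(1, 3, 5, 7, 9, 11),
  V2       = c(2, 4, 6, 8, 10, 12)
)

# Columns to average: everything except the two grouping columns
measure_cols <- setdiff(names(toy), c("Subject", "Activity"))

# by() hands each group's rows to FUN as a data frame, so FUN must accept one;
# colMeans() does, mean() does not
by_means <- by(
  toy[measure_cols],
  list(Subject = toy$Subject, Activity = toy$Activity),
  colMeans
)
print(by_means)
```

The result is a "by" object (a list arranged by Subject and Activity) rather than a data frame, so it usually needs an extra step to flatten it into one row per group.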

Then I tried ddply from the plyr package:

ddply(subdata, .(Subject, Activity), mean)

I got the same kind of warning message:

Warning messages:
1: In mean.default(piece, ...) :
   argument is not numeric or logical: returning NA

Finally I used colwise(mean), which seems to work:

newdata<-ddply(subdata, .(Subject, Activity), colwise(mean))
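
For comparison, the same grouped column means can be computed in base R with aggregate(), without plyr. This is only a sketch on made-up data: `toy`, `V1`, and `V2` are hypothetical stand-ins for subdata and its measurement columns, and it assumes every non-grouping column is numeric.

```r
# Hypothetical stand-in for subdata: two grouping columns plus numeric measurements
toy <- data.frame(
  Subject  = c(1, 1, 1, 2, 2, 2),
  Activity = c("WALK", "WALK", "SIT", "WALK", "SIT", "SIT"),
  V1       = c(1, 3, 5, 7, 9, 11),
  V2       = c(2, 4, 6, 8, 10, 12)
)

# "." on the left-hand side means "all columns not named on the right-hand side";
# mean() is applied to each of those columns within each Subject/Activity group
agg_means <- aggregate(. ~ Subject + Activity, data = toy, FUN = mean)
print(agg_means)
```

The result is an ordinary data frame with one row per Subject/Activity combination. Note that the formula interface drops rows containing NA by default (na.action = na.omit), which may or may not be what you want.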

