Darwin PC

Reputation: 931

How to calculate an overall mean from more than two columns in a data frame?

I would like to get a single mean value from selected columns in a data frame, but it doesn't work with two or more columns. I tried this:

testDF <- data.frame(v1 = c(1,3,15,7,18,3,5,NA,4,5,7,9),
                     v2 = c(11,33,55,7,88,33,55,NA,44,5,67,99),
                     v3 = c(NA,33,5,77,88,3,55,NA,4,55,87,14))

mean(testDF[,2:3], na.rm=T)

and I get this warning message:

mean(testDF[,2:3], na.rm=T)
[1] NA
Warning message:
In mean.default(testDF[, 2:3], na.rm = T) :
argument is not numeric or logical: returning NA

If I use the sum() function it works perfectly, but I don't understand why it doesn't work with the mean() function. After some steps I managed it with the melt() function from the reshape2 package, but I'm looking for a shorter, simpler way because I have a lot of variables and data.

Regards

Upvotes: 1

Views: 1597

Answers (1)

nico

Reputation: 51640

The help for mean says:

Currently there are methods for numeric/logical vectors and date, date-time and time interval objects.

which makes me think that mean does not work on data frames.

Indeed, you will see that mean(testDF) produces the same warning, whereas mean(testDF[,1]) runs without one, because a single column is a plain numeric vector.

The easiest solution is to do:

mean(as.matrix(testDF[,2:3]), na.rm=TRUE)
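
As a small sketch using the testDF from the question: both calls below flatten the selected columns into a single set of numbers before averaging, so every non-missing value counts equally. The unlist() variant is an alternative added here for illustration, and the variable names overall_matrix and overall_unlist are hypothetical.

```r
testDF <- data.frame(v1 = c(1,3,15,7,18,3,5,NA,4,5,7,9),
                     v2 = c(11,33,55,7,88,33,55,NA,44,5,67,99),
                     v3 = c(NA,33,5,77,88,3,55,NA,4,55,87,14))

# as.matrix() converts the selected columns to a numeric matrix,
# which mean() treats as one long vector of values
overall_matrix <- mean(as.matrix(testDF[, 2:3]), na.rm = TRUE)

# unlist() concatenates the columns into a single numeric vector
overall_unlist <- mean(unlist(testDF[, 2:3]), na.rm = TRUE)

print(overall_matrix)
print(overall_unlist)
```

Both variants should give the same value, since each averages the same 21 non-missing numbers from v2 and v3.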

Also, you can use colMeans to get the mean of each column.

Indeed, if you look at the source for colMeans, the first lines are:

if (is.data.frame(x)) 
    x <- as.matrix(x)
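
For comparison, here is a hedged sketch of the colMeans route on the same testDF (the names col_means, mean_of_means and pooled_mean are hypothetical). Note that averaging the per-column means is not necessarily the same as the overall mean when the columns have different numbers of missing values.

```r
testDF <- data.frame(v1 = c(1,3,15,7,18,3,5,NA,4,5,7,9),
                     v2 = c(11,33,55,7,88,33,55,NA,44,5,67,99),
                     v3 = c(NA,33,5,77,88,3,55,NA,4,55,87,14))

# one mean per selected column, ignoring NA values
col_means <- colMeans(testDF[, 2:3], na.rm = TRUE)
print(col_means)

# averaging the column means gives each column equal weight
mean_of_means <- mean(col_means)

# pooling all values gives each observation equal weight
pooled_mean <- mean(as.matrix(testDF[, 2:3]), na.rm = TRUE)

print(mean_of_means)
print(pooled_mean)
```

Here v2 has 11 non-missing values and v3 has 10, so the two results differ slightly. If a single overall mean is what you want, the pooled version is the one that matches the as.matrix approach.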

Upvotes: 4
