James Picerno

Reputation: 490

How to calculate means across rows of three multi-column dataframes?

Let's say I have 3 data frames, each a 5x5 object as such:

set.seed(1)
# rnorm(10) yields only 10 values; matrix() recycles them to fill the 5x5 grid
x <- as.data.frame(matrix(rnorm(10), ncol = 5, nrow = 5))
colnames(x) <- c("a", "b", "c", "d", "e")

y <- as.data.frame(matrix(rnorm(10), ncol = 5, nrow = 5))
colnames(y) <- c("f", "g", "h", "i", "j")

z <- as.data.frame(matrix(rnorm(10), ncol = 5, nrow = 5))
colnames(z) <- c("k", "l", "m", "n", "o")

So, x, for instance, looks like:

> x
           a          b          c          d          e
1 -0.6264538 -0.8204684 -0.6264538 -0.8204684 -0.6264538
2  0.1836433  0.4874291  0.1836433  0.4874291  0.1836433
3 -0.8356286  0.7383247 -0.8356286  0.7383247 -0.8356286
4  1.5952808  0.5757814  1.5952808  0.5757814  1.5952808
5  0.3295078 -0.3053884  0.3295078 -0.3053884  0.3295078

How can I efficiently calculate the mean of the 3 values that sit in the same position in each data frame? That is, take the mean of the 3 values in row 1/col 1 across the data frames, and likewise for every other cell. It's easy to do manually, of course. For instance:

> mean(c(x$a[1],y$f[1],z$k[1]))
[1] 0.6014349

> mean(c(x$b[1],y$g[1],z$l[1]))
[1] -0.3071769

... and so on. But how can I do this efficiently in R for much larger data frames? I've tried mapply() and variations on apply() and sweep(), but no luck. I know there's a simple solution but I'm having brain-lock. Any help would be greatly appreciated!

Upvotes: 0

Views: 98

Answers (4)

Robert Hijmans

Reputation: 47191

Another approach:

rowMeans(sapply(list(x, y, z), function(x) unlist(x, use.names=FALSE)))

Or, to get the 5x5 structure back, using the faster .rowMeans:

rc <- dim(x)
d <- list(x, y, z)
r <- .rowMeans(sapply(d, function(x) unlist(x, use.names=FALSE)), prod(rc), length(d))
m <- matrix(r, nrow=rc[1])
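
A minimal sketch of how the steps above could be wrapped up so the result comes back as a data frame with the first frame's column names. The helper name `cell_means_df` and the small 2x2 toy frames are assumptions for illustration, not part of the question's data, and the sketch assumes every frame is numeric and has the same dimensions:

```r
# Toy data (hypothetical): three 2x2 data frames with different column names.
x <- data.frame(a = c(1, 2), b = c(3, 4))
y <- data.frame(f = c(5, 6), g = c(7, 8))
z <- data.frame(k = c(9, 10), l = c(11, 12))

# cell_means_df() is a hypothetical helper name.
# Assumes all frames in `dfs` are numeric and share the same dimensions.
cell_means_df <- function(dfs) {
  rc <- dim(dfs[[1]])
  # One column per data frame, one row per cell (cells in column-major order).
  cells <- sapply(dfs, function(d) unlist(d, use.names = FALSE))
  # Average each cell across the frames, then restore the row/column layout.
  m <- matrix(rowMeans(cells), nrow = rc[1], ncol = rc[2])
  out <- as.data.frame(m)
  # Borrow the column names of the first data frame.
  names(out) <- names(dfs[[1]])
  out
}

res <- cell_means_df(list(x, y, z))
```

Here `res$a` holds the means of the first column's cells across the three frames, and `res$b` those of the second.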

Upvotes: 1

thelatemail

Reputation: 93843

Here is one way to generalise it while maintaining the matrix output:

apply(sapply(list(x,y,z), as.matrix, simplify="array"), 1:2, mean)
#              a           b          c           d          e
#[1,]  0.6014349 -0.30717691  0.6014349 -0.30717691  0.6014349
#[2,]  0.4518743  0.10514776  0.4518743  0.10514776  0.4518743
#[3,] -0.4607681  0.07046951 -0.4607681  0.07046951 -0.4607681
#[4,] -0.8695903  0.30628416 -0.8695903  0.30628416 -0.8695903
#[5,]  0.6914215  0.23548483  0.6914215  0.23548483  0.6914215
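
A possible variation on the same array idea, sketched on small toy frames (assumed here for illustration, not the question's data): `rowMeans()` accepts a `dims` argument, so it can average over the third dimension directly instead of calling `mean` once per cell through `apply()`. That is usually quicker on large inputs, though no timings were taken for this thread.

```r
# Toy data (hypothetical): three 2x2 data frames with different column names.
x <- data.frame(a = c(1, 2), b = c(3, 4))
y <- data.frame(f = c(5, 6), g = c(7, 8))
z <- data.frame(k = c(9, 10), l = c(11, 12))

# Stack the frames into a 2 x 2 x 3 array, one slice per data frame.
arr <- sapply(list(x, y, z), as.matrix, simplify = "array")

# dims = 2 treats the first two dimensions as "rows", so the mean is
# taken over the remaining (third) dimension, i.e. across the frames.
m <- rowMeans(arr, dims = 2)
```

The result `m` is a matrix with the same shape as each input and the column names of the first frame.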

Upvotes: 1

EbrahimA

Reputation: 59

You can convert the data frames to matrices, calculate the mean, and convert the mean matrix back to a data frame format. Here is the code:

xx <- data.matrix(x)
yy <- data.matrix(y)
zz <- data.matrix(z)
mm <- (xx+yy+zz)/3
mean.df <- data.frame(mm)

Upvotes: 1

Martin Schmelzer

Reputation: 23899

I feel like I have to supply my trivial solution as an answer...

(x+y+z)/3
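
If the number of data frames is not fixed at three, the same arithmetic can be sketched with `Reduce()` over a list. The list name `dfs` and the 2x2 toy frames are assumptions for illustration, and the sketch assumes all frames are numeric with identical dimensions:

```r
# Toy data (hypothetical): three 2x2 data frames with different column names.
x <- data.frame(a = c(1, 2), b = c(3, 4))
y <- data.frame(f = c(5, 6), g = c(7, 8))
z <- data.frame(k = c(9, 10), l = c(11, 12))

# `dfs` is a hypothetical name for a list holding any number of data frames.
dfs <- list(x, y, z)

# Add the frames cell by cell, then divide by how many frames there are.
# The result keeps the column names of the first data frame.
cell_means <- Reduce(`+`, dfs) / length(dfs)
```

With three frames this is the same computation as `(x+y+z)/3`, just written so the list can grow.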

Upvotes: 4
