MNU

Reputation: 764

Reducing computation time for matrix operation in for loop in R

I build the following two matrices and would like to find the mean of each column. This is easy for a small number of rows and columns.

    yy = matrix(c(1:40), nrow = 10, ncol = 4)
    tt = c(1:8)
    yy_new = matrix(NA, nrow = 10, ncol = length(tt))
    yy_new1 = matrix(NA, nrow = 10, ncol = length(tt))
    dim(yy_new)
    for (it in 1:10) {
      for (tim in 1:8) {
        yy_new[it, tim] = yy[it, 1] + yy[it, 3] * tt[tim]
        yy_new1[it, tim] = yy[it, 2] + yy[it, 4] * tt[tim] + 2
      }
    }
    yy_new_mean = apply(yy_new, 2, mean)   # column-wise mean of the first matrix
    yy_new1_mean = apply(yy_new1, 2, mean) # column-wise mean of the second matrix

If the number of rows and columns is very large, say 10,000 rows and 2,000 columns, it takes too much time to fill the two matrices (yy_new and yy_new1) inside the nested loop. Can I do this more efficiently so that the computation does not take so long?

Upvotes: 0

Views: 124

Answers (1)

Leo Barlach

Reputation: 490

You can use the function outer to create matrices of the results you want:

yy_new <- outer(1:10, 1:8, function(x,y){
  yy[x,1]+yy[x,3]*tt[y]
})

yy_new1 <- outer(1:10, 1:8, function(x,y){
  yy[x,2]+yy[x,4]*tt[y]+2
})

That's much faster than a for loop. In general, you want to avoid element-by-element for loops in R, because most functions are vectorized and can operate on whole vectors or matrices at once.
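As a variation on the outer call above, the index lookups inside the anonymous function can be dropped by passing the matrix columns to outer directly. The following is only a sketch using the question's yy and tt; the `_direct` variable names are made up for illustration. It also relies on the fact that the mean is linear, so if only the column means are needed, the full matrices may not have to be built at all.

```r
# Same inputs as in the question
yy <- matrix(1:40, nrow = 10, ncol = 4)
tt <- 1:8

# outer(a, b) builds the matrix a[i] * b[j]. Adding a vector whose length
# equals the number of rows recycles it down each column, so row i gets
# yy[i, 1] (or yy[i, 2]) added to every entry.
yy_new  <- yy[, 1] + outer(yy[, 3], tt)
yy_new1 <- yy[, 2] + outer(yy[, 4], tt) + 2

# colMeans gives the same result as apply(x, 2, mean)
yy_new_mean  <- colMeans(yy_new)
yy_new1_mean <- colMeans(yy_new1)

# If only the column means are needed: the mean of a + b * t over rows
# equals mean(a) + mean(b) * t, so no matrix is required.
# (The *_direct names are hypothetical, not from the original post.)
yy_new_mean_direct  <- mean(yy[, 1]) + mean(yy[, 3]) * tt
yy_new1_mean_direct <- mean(yy[, 2]) + mean(yy[, 4]) * tt + 2
```

For 10,000 rows and 2,000 columns, the direct version only touches the four input columns and tt, which should matter more than the choice between a loop and outer.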

Comparing both options using microbenchmark, outer is about 100 times faster (the first row is the for loop, the second is outer):

     min       lq       mean   median       uq      max neval
6207.115 6601.342 7691.66462 6868.801 7215.776 45110.99   100
  27.152   30.855   50.98553   56.066   61.532   195.35   100
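A comparison like this could be reproduced roughly as follows. This is a sketch, not the exact benchmark that was run: loop_version and outer_version are hypothetical wrapper names, only the first matrix is computed, and the timing step assumes the microbenchmark package is installed (it is skipped otherwise).

```r
yy <- matrix(1:40, nrow = 10, ncol = 4)
tt <- 1:8

# Hypothetical wrapper around the nested loop from the question
loop_version <- function(yy, tt) {
  out <- matrix(NA_real_, nrow = nrow(yy), ncol = length(tt))
  for (it in seq_len(nrow(yy))) {
    for (tim in seq_along(tt)) {
      out[it, tim] <- yy[it, 1] + yy[it, 3] * tt[tim]
    }
  }
  out
}

# Hypothetical wrapper around the outer() call from the answer
outer_version <- function(yy, tt) {
  outer(seq_len(nrow(yy)), seq_along(tt), function(x, y) {
    yy[x, 1] + yy[x, 3] * tt[y]
  })
}

# Check that both versions produce the same matrix before timing them
same_result <- isTRUE(all.equal(loop_version(yy, tt), outer_version(yy, tt)))

# Timing step: runs only if the microbenchmark package is available
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(
    loop  = loop_version(yy, tt),
    outer = outer_version(yy, tt),
    times = 100
  ))
}
```

Timings depend on the machine and on the matrix size, so the ratio on a 10,000 by 2,000 problem may differ from the small example.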

Upvotes: 1
