speeding up for loops in R

Question

This is less a request for new code and more a question about how R performs this kind of calculation. Of course, I'll take any and all suggestions to increase its efficiency.

Let's say I have a script like so:

x <- matrix(complex(real = 1:10, imaginary = 1:10), ncol = 2)
y <- x + 300
raw <- list(x, y)
raw_complex <- list(raw, raw, raw, raw)

It's a list of lists of complex matrices. I'm trying to get the phase angle out of it, which is: phase = atan(Im(x)/Re(x))

My current code is:

for (m in seq_along(raw_complex)) {
  for (n in seq_along(raw_complex[[m]])) {
    for (i in seq_len(nrow(raw_complex[[m]][[n]]))) {
      for (j in seq_len(ncol(raw_complex[[m]][[n]]))) {
        raw_complex[[m]][[n]][i, j] <- atan(Im(raw_complex[[m]][[n]][i, j]) / Re(raw_complex[[m]][[n]][i, j]))
      }
    }
  }
}

I know, I know, avoid for loops in R. But conceptually, this makes it easier for me to see what is happening vs. lapply or sapply.

My question is: on each iteration of the loop, is R making a copy of the entire list or matrix in memory, rather than pulling out one element at a time? Obviously I'd rather not have R make an entire copy every iteration.

My real data set has a list of 4, with 95 matrices in each element of the list. Each matrix is 145x901 so you can see how I'd like this to be as fast as possible.

Oh, and it would be nice if the output were a real number, not a complex number. I've tried adding as.numeric() in front of the atan(), but that didn't seem to help.

Thanks!

Andrie · Accepted Answer

Make use of the fact that R is vectorised. Specifically, this means that you can apply many computations directly on a vector or matrix.

For example, define a function for phase:

phase <- function(x) atan(Im(x) / Re(x))
phase(x)

With this single, simple function, you compute the phase for every cell in your matrix:

          [,1]      [,2]
[1,] 0.7853982 0.7853982
[2,] 0.7853982 0.7853982
[3,] 0.7853982 0.7853982
[4,] 0.7853982 0.7853982
[5,] 0.7853982 0.7853982
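As a quick check on the output above, here is a minimal sketch comparing the vectorised call with a cell-by-cell loop. It also touches on the "real, not complex" part of the question: Im() and Re() return doubles, so the result of phase() is an ordinary numeric matrix. The loop in the question stays complex because it assigns back into a complex matrix, and R coerces each assigned value to complex. The names p_vec and p_loop are made up for this illustration.

```r
# Same sample matrix as in the question: real and imaginary parts both 1:10
x <- matrix(complex(real = 1:10, imaginary = 1:10), ncol = 2)

phase <- function(x) atan(Im(x) / Re(x))

# Vectorised: one call handles every cell of the matrix
p_vec <- phase(x)

# Cell-by-cell loop writing into a preallocated *numeric* matrix, for comparison
p_loop <- matrix(0, nrow = nrow(x), ncol = ncol(x))
for (i in seq_len(nrow(x))) {
  for (j in seq_len(ncol(x))) {
    p_loop[i, j] <- atan(Im(x[i, j]) / Re(x[i, j]))
  }
}

is.complex(p_vec)         # FALSE: the result is a plain double matrix
all.equal(p_vec, p_loop)  # TRUE: both approaches give the same values
```

Writing into a separate numeric matrix, rather than back into the complex input, is what keeps the loop version real-valued as well.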

Now you're one step away from applying this phase function on your list. For this, you can use lapply():

lapply(raw, phase)
[[1]]
          [,1]      [,2]
[1,] 0.7853982 0.7853982
[2,] 0.7853982 0.7853982
[3,] 0.7853982 0.7853982
[4,] 0.7853982 0.7853982
[5,] 0.7853982 0.7853982

[[2]]
            [,1]       [,2]
[1,] 0.003322247 0.01960533
[2,] 0.006622420 0.02279735
[3,] 0.009900667 0.02596819
[4,] 0.013157135 0.02911798
[5,] 0.016391974 0.03224688

But what you're really after is applying this function recursively to your list of lists. For this, there is rapply(), where the r stands for recursive:

z <- rapply(raw_complex, phase, how = "list")
str(z)
List of 4
 $ :List of 2
  ..$ : num [1:5, 1:2] 0.785 0.785 0.785 0.785 0.785 ...
  ..$ : num [1:5, 1:2] 0.00332 0.00662 0.0099 0.01316 0.01639 ...
 $ :List of 2
  ..$ : num [1:5, 1:2] 0.785 0.785 0.785 0.785 0.785 ...
  ..$ : num [1:5, 1:2] 0.00332 0.00662 0.0099 0.01316 0.01639 ...
 $ :List of 2
  ..$ : num [1:5, 1:2] 0.785 0.785 0.785 0.785 0.785 ...
  ..$ : num [1:5, 1:2] 0.00332 0.00662 0.0099 0.01316 0.01639 ...
 $ :List of 2
  ..$ : num [1:5, 1:2] 0.785 0.785 0.785 0.785 0.785 ...
  ..$ : num [1:5, 1:2] 0.00332 0.00662 0.0099 0.01316 0.01639 ...
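If rapply() feels opaque, the same traversal can be spelled out with two nested lapply() calls, one per list level. This is only a sketch for comparison, assuming the list is exactly two levels deep as in the question; z_rapply, z_nested and inner are names invented here.

```r
# Rebuild the sample data from the question
x <- matrix(complex(real = 1:10, imaginary = 1:10), ncol = 2)
y <- x + 300
raw <- list(x, y)
raw_complex <- list(raw, raw, raw, raw)

phase <- function(x) atan(Im(x) / Re(x))

# Recursive version: walks the nested list, however deep it is
z_rapply <- rapply(raw_complex, phase, how = "list")

# Explicit version: outer lapply() over the 4 elements,
# inner lapply() over the matrices inside each one
z_nested <- lapply(raw_complex, function(inner) lapply(inner, phase))

# Compare the two results
all.equal(z_rapply, z_nested)
```

The nested form makes the two levels visible, which may be closer to how the original loops read; rapply() is shorter and does not need to know the depth.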

This will be fast, not because you avoided the loop as such, but because each call computes on a whole matrix at once rather than on one cell at a time.
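To see the difference on something closer to the real data, here is a rough timing sketch on a single 145x901 complex matrix filled with random values. The random data, the helper loop_phase() and the variable names are assumptions made for the illustration, and the actual timings will depend on your machine and R version.

```r
set.seed(1)
n <- 145 * 901

# One random complex matrix with the dimensions mentioned in the question
m <- matrix(complex(real = rnorm(n), imaginary = rnorm(n)), nrow = 145)

phase <- function(x) atan(Im(x) / Re(x))

# Hypothetical cell-by-cell version, writing into a numeric matrix
loop_phase <- function(m) {
  out <- matrix(0, nrow = nrow(m), ncol = ncol(m))
  for (i in seq_len(nrow(m))) {
    for (j in seq_len(ncol(m))) {
      out[i, j] <- atan(Im(m[i, j]) / Re(m[i, j]))
    }
  }
  out
}

t_loop <- system.time(res_loop <- loop_phase(m))
t_vec  <- system.time(res_vec  <- phase(m))

# Same numbers either way; only the elapsed time differs
all.equal(res_loop, res_vec)
c(loop = t_loop[["elapsed"]], vectorised = t_vec[["elapsed"]])
```

The loop makes one atan(), Im() and Re() call per cell, while the vectorised version makes one call of each per matrix, which is where the saving comes from.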

More importantly, it's concise and easy to read.
