Nicholas Humphrey
Nicholas Humphrey

Reputation: 1250

How to vectorize triple nested loops?

I've done searching similar problems and I have a vague idea about what should I do: to vectorize everything or use apply() family. But I'm a beginner on R programming and both of the above methods are quite confusing.

Here is my source code:

x<-rlnorm(100,0,1.6)
j=0
k=0
i=0
h=0
lambda<-rep(0,200)
sum1<-rep(0,200)
constjk=0
wj=0
wk=0
for (h in 1:200)
{
   lambda[h]=2+h/12.5
   N=ceiling(lambda[h]*max(x))
   for (j in 0:N)
   {
      wj=(sum(x<=(j+1)/lambda[h])-sum(x<=j/lambda[h]))/100
      for (k in 0:N)
      {
         constjk=dbinom(k, j + k, 0.5)
         wk=(sum(x<=(k+1)/lambda[h])-sum(x<=k/lambda[h]))/100
         sum1[h]=sum1[h]+(lambda[h]/2)*constjk*wk*wj
      }
   }
}

Let me explain a bit. I want to collect 200 sum1 values (that's the first loop), and for every sum1 value, it is the summation of (lambda[h]/2)*constjk*wk*wj, thus the other two loops. Most tedious is that N changes with h, so I have no idea how to vectorize the j-loop and the k-loop. But of course I can vectorize the h-loop with lambda<-seq() and N<-ceiling(), and that's the best I can do. Is there a way to further simplify the code?

Upvotes: 7

Views: 2467

Answers (2)

Roland
Roland

Reputation: 132706

Let`s wrap your simulation in a function and time it:

sim1 <- function(num=20){
  set.seed(42)
  x<-rlnorm(100,0,1.6)
  j=0
  k=0
  i=0
  h=0
  lambda<-rep(0,num)
  sum1<-rep(0,num)
  constjk=0
  wj=0
  wk=0

  for (h in 1:num)
  {
    lambda[h]=2+h/12.5
    N=ceiling(lambda[h]*max(x))
    for (j in 0:N)
    {
      wj=(sum(x<=(j+1)/lambda[h])-sum(x<=j/lambda[h]))/100
      for (k in 0:N)
      {
        set.seed(42)
        constjk=dbinom(k, j + k, 0.5)
        wk=(sum(x<=(k+1)/lambda[h])-sum(x<=k/lambda[h]))/100
        sum1[h]=sum1[h]+(lambda[h]/2)*constjk*wk*wj
      }
    }
  }

  sum1
}

system.time(res1 <- sim1())
#   user  system elapsed 
#    5.4     0.0     5.4

Now let's make it faster:

sim2 <- function(num=20){
  set.seed(42) #to make it reproducible
  x <- rlnorm(100,0,1.6)

  h <- 1:num
  sum1 <- numeric(num)
  lambda <- 2+1:num/12.5
  N <- ceiling(lambda*max(x))

  #functions for wj and wk
  wjfun <- function(x,j,lambda,h){
    (sum(x<=(j+1)/lambda[h])-sum(x<=j/lambda[h]))/100
  }
  wkfun <- function(x,k,lambda,h){
    (sum(x<=(k+1)/lambda[h])-sum(x<=k/lambda[h]))/100
  }

  #function to calculate values of sum1
  fun1 <- function(N,h,x,lambda) {
    sum1 <- 0
    set.seed(42) #to make it reproducible
    #calculate constants using outer
    const <- outer(0:N[h],0:N[h],FUN=function(j,k) dbinom(k, j + k, 0.5))
    wk <- numeric(N[h]+1)
    #loop only once to calculate wk
    for (k in 0:N[h]){
      wk[k+1] <- (sum(x<=(k+1)/lambda[h])-sum(x<=k/lambda[h]))/100 
    }

    for (j in 0:N[h])
    {
      wj <- (sum(x<=(j+1)/lambda[h])-sum(x<=j/lambda[h]))/100
      for (k in 0:N[h])
      {
        sum1 <- sum1+(lambda[h]/2)*const[j+1,k+1]*wk[k+1]*wj
      }
    }
    sum1
  }

  for (h in 1:num)
  {
    sum1[h] <- fun1(N,h,x,lambda)
  }  
  sum1
}

system.time(res2 <- sim2())
#user  system elapsed 
#1.25    0.00    1.25 

all.equal(res1,res2)
#[1] TRUE

Timings for @Backlin`s code (with 20 interations) for comparison:

   user  system elapsed 
   3.30    0.00    3.29 

If this is still too slow and you cannot or don't want to use another language, there is also the possibility of parallelization. As far as I see the outer loop is embarrassingly parallel. There are some nice and easy packages for parallelization.

Upvotes: 2

Backlin
Backlin

Reputation: 14842

Your code can be perfectly verctorized with 3 nested sapply calls. It might be a bit hard to read for the untrained eye, but the essence of it is that instead of adding one value at a time to sum1[h] we calculate all the terms produced by the innermost loop in one go and sum them up.

Although this vectorized solution is faster than your tripple for loop, the improvement is not dramatical. If you plan to use it many times I suggest you implement it in C or Fortran (with regular for loops), which improves the speed a lot. Beware though that it has high time complexity and will scale badly with increased values of lambda, ultimatelly reaching a point when it is not possible to compute within reasonable time regardless of the implementation.

lambda <- 2 + 1:200/12.5
sum1 <- sapply(lambda, function(l){
    N <- ceiling(l*max(x))
    sum(sapply(0:N, function(j){
        wj <- (sum(x <= (j+1)/l) - sum(x <= j/l))/100
        sum(sapply(0:N, function(k){
            constjk <- dbinom(k, j + k, 0.5)
            wk <- (sum(x <= (k+1)/l) - sum(x <= k/l))/100
            l/2*constjk*wk*wj
        }))
    }))
})

Btw, you don't need to predefine variables like h, j, k, wj and wk. Especially since not when vectorizing, as assignments to them inside the functions fed to sapply will create overlayered local variables with the same name (i.e. ignoring the ones you predefied).

Upvotes: 5

Related Questions