datajoel

Reputation: 100

Why is this for loop code as fast as apply()?

EDITED to provide reproducible results

It wasn't clear initially, but I need the results to take the NAs in the raw data (df) into account.


I initially wrote my code with for loops to get a proof of concept working, but now I need to speed things up. It runs in ~2-3 minutes with the for loops, but when I rewrote it using apply(), it was no faster. I thought apply() was the vectorized solution and would therefore be faster, but maybe my whole premise is incorrect? (I am not new to R, but computational speed is not usually an issue for me.)

I am working with 1000+ cases and ~100 variables and need to perform 5000+ simulations with the data (turning on and off different conditions).
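
(For scale: each cases x variables x simulations array holds 1000 * 100 * 5000 = 5e8 doubles, which at 8 bytes each is roughly 4 GB, and the code below allocates two of them, so memory pressure alone may be a factor.)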

Starting definitions & sample data:

cases = 1000
variables = 100
simulations = 5000

df <- as.data.frame(array(rnorm(cases * variables, 0, 5), dim = c(cases, variables)))

# Each simulation row switches a random ~20 of the variables "on" (1);
# zeros are recoded to NA so switched-off variables drop out of na.rm sums
montecarlo <- matrix(rbinom(simulations * variables, 1, 20 / variables), simulations, variables)
montecarlo[montecarlo == 0] <- NA

# Preallocate outputs
calc <- array(dim = c(cases, variables, simulations))
interim <- array(dim = c(cases, variables, simulations))
results <- array(dim = c(variables, simulations))

for (j in 1:simulations) {
  calc[,,j] <- exp(t(t(df) * as.numeric(montecarlo[j, ])))
}
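
For anyone unfamiliar with the double-transpose idiom: t(t(df) * v) scales each column of df by the corresponding element of v. An equivalent formulation with sweep(), shown only for clarity:

for (j in 1:simulations) {
  # multiply column i of df by montecarlo[j, i], then exponentiate
  calc[,,j] <- exp(sweep(as.matrix(df), 2, as.numeric(montecarlo[j, ]), FUN = "*"))
}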

For loop version:

for (j in 1:simulations) {
  interim[,,j] <- t(apply(calc[,,j], 1, function(x) x/sum(x, na.rm = TRUE))) # re-share
  results[,j] <- apply(interim[,,j], 2, sum) # aggregates results
}

Apply() version:

interim <- apply(calc, c(1,3), function(x) x/sum(x, na.rm = TRUE)) # re-share
results <- as.data.frame(t(apply(interim, c(1,3), sum))) # aggregates results
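
Note: apply() returns FUN's vector results along the first dimension of its output, so in this version interim has permuted dimensions relative to calc, which is why the aggregation margins differ from the for loop version:

dim(calc)     # cases x variables x simulations
dim(interim)  # variables x cases x simulations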

I am open to any suggestions to speed things up and/or a reason why the apply() version is not any faster. Thank you!

Upvotes: 1

Views: 165

Answers (2)

Cole

Reputation: 11255

As @Mikko Marttila noted, the apply() family does not guarantee faster code. Using sweep() and aperm(), the code below is about 3x faster for a 1,000 x 100 x 70 array (i.e., only 7 million elements).

results4 <- colSums(sweep(calc, c(1,3), colSums(aperm(calc, c(2,1,3)), na.rm = T), FUN = '/'), na.rm = T)
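
Unpacked into steps (the intermediate names are added here purely for readability), the one-liner works like this:

# 1. Row totals per slice: aperm() reorders calc to variables x cases x
#    simulations so that colSums() collapses the variables dimension,
#    leaving a cases x simulations matrix
row_totals <- colSums(aperm(calc, c(2, 1, 3)), na.rm = TRUE)

# 2. Divide each element by its row total; sweep() matches row_totals
#    against margins 1 (cases) and 3 (simulations)
shares <- sweep(calc, c(1, 3), row_totals, FUN = '/')

# 3. Collapse the cases dimension, leaving variables x simulations
results4 <- colSums(shares, na.rm = TRUE)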

Or, for slightly less performance but code that is closer to what you originally had:

interim3 <- sweep(calc, c(1,3), apply(calc, 3, rowSums, na.rm = T), FUN = '/')
results3 <- apply(interim3, c(2,3), sum, na.rm = T)
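
Here apply(calc, 3, rowSums, na.rm = TRUE) computes each slice's row totals and binds them column-wise, so the STATS argument arrives as a cases x simulations matrix that sweep() again matches against margins c(1, 3):

totals <- apply(calc, 3, rowSums, na.rm = TRUE)
dim(totals)  # cases x simulations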

Performance:

Unit: milliseconds
          expr      min       lq     mean   median       uq      max neval
      for_loop 510.9131 514.9030 537.0344 518.2491 524.5709 705.4087    10
      apply_OP 446.0352 458.4940 491.6710 500.1995 523.1843 533.9654    10
 sweep_rowSums 225.5855 233.2632 252.6149 240.7245 284.1517 292.3476    10
   sweep_aperm 136.2519 140.8912 163.7498 154.6984 191.5337 217.8015    10

Data

cases = 1000
variables = 100
simulations = 70

set.seed(123)
calc <- array(sample(cases * variables * simulations), dim = c(cases, variables, simulations))

interim <- array(dim = c(cases, variables, simulations))
results <- array(dim = c(variables, simulations))

# Original Loop
for (j in seq_len(simulations)) {
  interim[,,j] <- t(apply(calc[,,j], 1, function(x) x/sum(x, na.rm = TRUE))) # re-share
  results[,j] <- apply(interim[,,j], 2, sum) # aggregates results
}

# Original apply

interim2 <- apply(calc, c(1,3), function(x) x/sum(x, na.rm = TRUE)) # re-share
results2 <- apply(interim2, c(1,3), sum) # aggregates results

# Using sweep

interim3 <- sweep(calc, c(1,3), apply(calc, 3, rowSums, na.rm = T), FUN = '/')
results3 <- apply(interim3, c(2,3), sum, na.rm = T)

# Using sweep and aperm
# interim4 <- sweep(calc, c(1,3), colSums(aperm(calc, c(2,1,3)), na.rm = T), FUN = '/')
results4 <- colSums(sweep(calc, c(1,3), colSums(aperm(calc, c(2,1,3)), na.rm = T), FUN = '/'), na.rm = T)

# all.equal() compares two objects at a time
all.equal(results2, results)
all.equal(results3, results2)
all.equal(results4, results3)

library(microbenchmark)

microbenchmark(
  for_loop = {
    for (j in seq_len(simulations)) {
      interim[,,j] <- t(apply(calc[,,j], 1, function(x) x/sum(x, na.rm = TRUE))) # re-share
      results[,j] <- apply(interim[,,j], 2, sum) # aggregates results
    }
  }
  ,
  apply_OP = {
    interim2 <- apply(calc, c(1,3), function(x) x/sum(x, na.rm = TRUE)) # re-share
    results2 <- apply(interim2, c(1,3), sum) # aggregates results
  }
  ,
  sweep_rowSums = {
    interim3 <- sweep(calc, c(1,3), apply(calc, 3, rowSums, na.rm = T), FUN = '/')
    results3 <- apply(interim3, c(2,3), sum, na.rm = T)
  }
  ,
  sweep_aperm = {
    results4 <- colSums(sweep(calc, c(1,3), colSums(aperm(calc, c(2,1,3)), na.rm = T), FUN = '/'), na.rm = T)
  }
  , times = 10
)

Upvotes: 1

Mikko Marttila

Reputation: 11898

Generally speaking: for loops aren't inherently slow. If you're preallocating the output (i.e. not growing a vector, which causes repeated copies), they're pretty much comparable to the *apply() functions in speed. The overhead in iterating comes from repeatedly calling R functions from C code, and the advantage of using functionals is simply clarity. Here's an example with a for loop wrapped in a function:

foo <- function(x, f, ...) {
  out <- vector("list", length(x))
  for (i in seq_along(x)) {
    out[[i]] <- f(x[[i]], ...)
  }
  out
}

x <- replicate(10000, rnorm(30), simplify = FALSE)
bench::mark(foo(x, mean), lapply(x, mean))
#> # A tibble: 2 x 6
#>   expression           min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 foo(x, mean)      36.9ms   38.4ms      26.1   157.3KB     52.3
#> 2 lapply(x, mean)   42.6ms   44.9ms      22.3    78.2KB    100.

The way to get speed improvements in these cases is to move all of the calculations into compiled code.
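
For instance (a sketch that works only because every element of x above has the same length), the 10,000 mean() calls can be replaced with a single call into compiled code:

# flatten the list once and let colMeans() do the looping in C
m <- matrix(unlist(x), nrow = 30)
means <- colMeans(m)
all.equal(means, vapply(x, mean, numeric(1)))
#> [1] TRUE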

That said, there may well be other optimizations for your specific problem. You might want to provide a reproducible example, and ask a new question on Code Review about performance improvements.

Created on 2019-09-02 by the reprex package (v0.3.0.9000)

Upvotes: 2
