Reputation: 100
EDITED to provide re-producible results
It wasn't clear initially, but I need the results to take into account NAs in the raw data (df)
I initially wrote my code with for loops to get the proof of concept working, but now I need to speed things up. It runs in ~2-3 minutes with the for loops, but when I re-wrote it using apply(), it wasn't any faster. I thought apply() should be the vectorized solution and therefore faster, but maybe my whole premise is incorrect? (I am not new to R, but computational speed is not usually an issue for me.)
I am working with 1000+ cases and ~100 variables and need to perform 5000+ simulations with the data (turning on and off different conditions).
Starting definitions & sample data:
cases = 1000
variables = 100
simulations = 5000
df <- as.data.frame(array(rnorm(cases * variables, 0, 5), dim=c(cases, variables)))
montecarlo <- matrix(rbinom(simulations * variables, 1, 20/variables), simulations, variables)
montecarlo[montecarlo==0] <- NA
calc <- array(,dim=c(cases, variables, simulations))
interim <- array(,dim=c(cases, variables, simulations))
results <- array(,dim=c(variables, simulations))
for (j in 1:simulations) {
calc[,,j] <- exp(t(t(df) * as.numeric(montecarlo[j,])))
}
For loop version:
for (j in 1:simulations) {
interim[,,j] <- t(apply(calc[,,j], 1, function(x) x/sum(x, na.rm = TRUE))) # re-share
results[,j] <- apply(interim[,,j], 2, sum) # aggregates results
}
Apply() version:
interim <- apply(calc, c(1,3), function(x) x/sum(x, na.rm = TRUE)) # re-share
results <- as.data.frame(t(apply(interim, c(1,3), sum))) # aggregates results
I am open to any suggestions to speed things up and/or a reason why the apply() version is not any faster. Thank you!
Upvotes: 1
Views: 165
Reputation: 11255
As @Mikko Marttila noted, the apply()
family does not guarantee faster code. Using sweep
and aperm()
, the code below is about 3x faster for a 1,000 x 100 x 70 array (i.e., only 7 million elements).
results4 <- colSums(sweep(calc, c(1,3), colSums(aperm(calc, c(2,1,3)), na.rm = T), FUN = '/'), na.rm = T)
Or, for slightly less performance but is more similar to what you originally had:
interim3 <- sweep(calc, c(1,3), apply(calc, 3, rowSums, na.rm = T), FUN = '/')
results3 <- apply(interim3, c(2,3), sum, na.rm = T)
Performance:
Unit: milliseconds
expr min lq mean median uq max neval
for_loop 510.9131 514.9030 537.0344 518.2491 524.5709 705.4087 10
apply_OP 446.0352 458.4940 491.6710 500.1995 523.1843 533.9654 10
sweep_rowSums 225.5855 233.2632 252.6149 240.7245 284.1517 292.3476 10
sweep_aperm 136.2519 140.8912 163.7498 154.6984 191.5337 217.8015 10
Data
cases = 1000
variables = 100
simulations = 70
set.seed(123)
calc <- array(sample(cases *variables * simulations),dim=c(cases, variables, simulations))
interim <- array(,dim=c(cases, variables, simulations))
results <- array(,dim=c(variables, simulations))
# Original Loop
for (j in seq_len(simulations)) {
interim[,,j] <- t(apply(calc[,,j], 1, function(x) x/sum(x, na.rm = TRUE))) # re-share
results[,j] <- apply(interim[,,j], 2, sum) # aggregates results
}
# original apply
interim2 <- apply(calc, c(1,3), function(x) x/sum(x, na.rm = TRUE)) # re-share
results2 <- apply(interim2, c(1,3), sum) # aggregates results
# using sweep
interim3 <- sweep(calc, c(1,3), apply(calc, 3, rowSums, na.rm = T), FUN = '/')
results3 <- apply(interim3, c(2,3), sum, na.rm = T)
#using sweep and aperm
# interim4 <- sweep(calc, c(1,3), colSums(aperm(calc, c(2,1,3)), na.rm = T), FUN = '/')
results4 <- colSums(sweep(calc, c(1,3), colSums(aperm(calc, c(2,1,3)), na.rm = T), FUN = '/'), na.rm = T)
all.equal(results4, results3, results2, results)
library(microbenchmark)
microbenchmark(
for_loop = {
for (j in seq_len(simulations)) {
interim[,,j] <- t(apply(calc[,,j], 1, function(x) x/sum(x, na.rm = TRUE))) # re-share
results[,j] <- apply(interim[,,j], 2, sum) # aggregates results
}
}
,
apply_OP = {
interim2 <- apply(calc, c(1,3), function(x) x/sum(x, na.rm = TRUE)) # re-share
results2 <- apply(interim2, c(1,3), sum) # aggregates results
}
,
sweep_rowSums = {
interim3 <- sweep(calc, c(1,3), apply(calc, 3, rowSums, na.rm = T), FUN = '/')
results3 <- apply(interim3, c(2,3), sum, na.rm = T)
}
,
sweep_aperm = {
results4 <- colSums(sweep(calc, c(1,3), colSums(aperm(calc, c(2,1,3)), na.rm = T), FUN = '/'), na.rm = T)
}
, times = 10
)
Upvotes: 1
Reputation: 11898
Generally speaking: for-loops aren’t inherently slow. If you’re preallocating output (i.e. not growing a vector, causing multiple copies), they’re pretty much comparable to *apply()
functions in speed. The overhead in iterating comes from repeatedly calling R functions from C code; and the advantage of using functionals is simply clarity. Here’s an example with a for-loop wrapped in a function:
foo <- function(x, f, ...) {
out <- vector("list", length(x))
for (i in seq_along(x)) {
out[[i]] <- f(x[[i]], ...)
}
out
}
x <- replicate(10000, rnorm(30), simplify = FALSE)
bench::mark(foo(x, mean), lapply(x, mean))
#> # A tibble: 2 x 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 foo(x, mean) 36.9ms 38.4ms 26.1 157.3KB 52.3
#> 2 lapply(x, mean) 42.6ms 44.9ms 22.3 78.2KB 100.
The way to get speed improvements in these cases is to move all of the calculations into compiled code.
That said, there may well be other optimizations for your specific problem. You might want to provide a reproducible example, and ask a new question on Code Review about performance improvements.
Created on 2019-09-02 by the reprex package (v0.3.0.9000)
Upvotes: 2