Reputation: 645
I want to multiply all the elements of a matrix together. I can do it with two for loops or with apply. My intuition was that the for loops would be faster: apply has to create a temporary vector to store the products of the rows and then apply prod to that.
It still has to execute loops to multiply all the elements, so the apply approach just does the extra work of storing intermediate results that the for loops avoid. Yet it is still several times faster. Why is that?
cols <- 1000
rows <- 1000
a <- matrix(runif(cols * rows, 1, 2), nrow = rows)
system.time({
  result <- 1
  for (i in 1:nrow(a)) {
    for (j in 1:ncol(a)) {
      result <- result * a[i, j]
    }
  }
})
# 0.09s
system.time(result <- prod(apply(a, 1, prod)))
# 0.01s
Upvotes: 1
Views: 428
Reputation: 263411
Here's what I got from an effort to benchmark various methods. I have some concerns about the fact that Inf was the result of many of the calculations, and I wondered whether restricting the values to the range (0, 1) might change things. Like @badmax I was surprised that prod(a) was relatively slow: it seemed to me that it should be coded in C and therefore more efficient. I also reasoned that a column-oriented approach might be faster than a row-oriented approach, since that is how R stores matrices, and that turned out to be correct:
library(microbenchmark)
cols <- 1000
rows <- 1000
a <- matrix(runif(cols * rows, 1, 2), nrow = rows)
microbenchmark(
  loop1 = {
    result <- 1
    for (i in 1:nrow(a)) {
      for (j in 1:ncol(a)) {
        result <- result * a[i, j]
      }
    }
  },
  loop2 = {
    result <- 1
    for (j in 1:ncol(a)) {
      result <- result * prod(a[, j])
    }
  },
  loop3 = {
    result <- 1
    for (i in 1:nrow(a)) {
      result <- result * prod(a[i, ])
    }
  },
  apply_test = {result <- prod(apply(a, 1, prod))},
  prod_test = {result <- prod(a)},
  Reduce_test = {result <- Reduce("*", a)},
  log_sum = {result <- exp(sum(log(a)))}  # since a sum of logs == the log of the product
)
#====================
Unit: milliseconds
expr min lq mean median uq max neval cld
loop1 58.872740 59.782277 60.665321 60.246169 61.156176 67.33558 100 c
loop2 5.314437 5.843748 7.316167 6.024948 6.626402 57.36532 100 a
loop3 9.614727 10.248335 11.521343 10.541872 10.947829 45.08280 100 ab
apply_test 8.336721 8.924148 9.960122 9.166424 9.429118 17.35621 100 ab
prod_test 94.314333 95.438939 95.956394 95.911858 96.286444 98.54229 100 d
Reduce_test 292.907175 312.754959 389.959756 354.369616 511.151578 545.80829 100 e
log_sum 19.258281 19.916965 22.333617 20.134510 20.551704 180.18492 100 b
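The column-major point can be checked directly (a minimal illustration on a toy matrix, not part of the benchmark): extracting a column reads a contiguous block of the underlying vector, while extracting a row gathers strided elements, which is consistent with loop2 beating loop3 above.
m <- matrix(1:6, nrow = 2)  # R stores matrices column-major
as.vector(m)                # 1 2 3 4 5 6 -- columns laid out contiguously
m[, 1]                      # 1 2         -- one contiguous block
m[1, ]                      # 1 3 5       -- strided reads across columns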
I think apply_test is essentially doing the same work as loop3 (both take row products), and their timings are in the same ballpark. Here are the results for a test case where the range of random values was restricted to (0, 1) instead of (1, 2); they do confirm my suspicion that some of the difference lies in the handling of Inf values:
Unit: milliseconds
expr min lq mean median uq max neval cld
loop1 56.693831 58.846847 59.688896 59.448108 60.208619 63.005431 100 c
loop2 5.667955 5.907125 10.090634 6.109151 6.532552 183.985617 100 ab
loop3 9.533779 10.080330 12.760057 10.431867 10.734991 183.262217 100 ab
apply_test 8.144015 8.622861 9.940263 8.904425 9.962390 17.470028 100 ab
prod_test 1.327710 1.371449 1.411990 1.394160 1.432646 1.677596 100 a
Reduce_test 293.697339 312.384739 385.437743 356.997439 500.446356 557.253762 100 d
log_sum 22.444015 23.224879 24.064932 23.539085 24.210656 29.774315 100 b
The prod function is now rescued from its lowly position relative to the loop and apply methods.
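As a minimal sketch of what drives this (my illustration, not part of the benchmarks above): with values drawn from (1, 2) the exact product overflows double precision to Inf after a few thousand terms, whereas with values from (0, 1) it underflows to 0; the sum of logs stays finite either way.
prod(rep(1.5, 2000))      # Inf: 1.5^2000 exceeds .Machine$double.xmax
prod(runif(2000))         # effectively always 0: the product underflows instead
sum(log(rep(1.5, 2000)))  # ~811, still finite -- what log_sum relies on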
Upvotes: 3
Reputation: 2443
To vectorize instead of using apply, you can use rowProds from matrixStats:
library(matrixStats)
microbenchmark::microbenchmark({AA = prod(rowProds(a))}, times = 10)
This takes around 18 milliseconds.
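If the column-major reasoning from the benchmarking answer above carries over, colProds from the same package may be marginally faster still; a quick sketch along the same lines:
library(matrixStats)
# colProds works down columns, matching R's column-major storage:
microbenchmark::microbenchmark({BB = prod(colProds(a))}, times = 10)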
Upvotes: 1
Reputation: 2188
It looks like apply is faster because there is still some vectorization going on. Consider this for loop:
system.time({
  result <- 1
  for (i in 1:nrow(a)) {
    result <- result * prod(a[i, ])
  }
})
which, for me, runs about as fast as the apply version.
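By the same logic, the column-wise counterpart (loop2 in the benchmark answer above) should be slightly faster again, since column extraction is contiguous in memory; a quick sketch:
system.time({
  result <- 1
  for (j in 1:ncol(a)) {
    result <- result * prod(a[, j])
  }
})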
Upvotes: 0