Reputation: 99408
Suppose I have two vectors b and a. The components of the latter (a) are almost all zero, with only a few exceptions.
If I want to compute the component-wise product of a and a component-wise function (such as exp) of b, I can do
a*exp(b)
However, for the majority of components of a, which are zero, evaluating exp on the corresponding components of b is wasted work.
Is it possible to program cases like this more efficiently in R, or is there no need to change anything? Thanks!
Upvotes: 0
Views: 552
Reputation: 174778
To expand on DWin's answer, and your comment to it, just keep track of the 0s and add back in the trivial answers:
## Dummy data
set.seed(1)
a <- sample(0:10, 100, replace = TRUE)
b <- runif(100)
## something to hold results
out <- numeric(length(a))
## the computations you *want* to do
want <- a != 0
## fill in the wanted answers
out[want] <- a[want] * exp(b[want])
Which gives the correct results:
> all.equal(out, a * exp(b))
[1] TRUE
If you wanted, you could wrap this into a function:
myFun <- function(a, b) {
    out <- numeric(length(a))
    want <- a != 0
    out[want] <- a[want] * exp(b[want])
    return(out)
}
Then use it
> all.equal(out, myFun(a, b))
[1] TRUE
But none of this is more efficient than using a * exp(b) directly. Both * and exp() are vectorised, so they will run very quickly, much more quickly than any of the book-keeping measures used in the various answers so far.
Whether you need the book-keeping solutions will depend on how expensive your function (exp() in the example in your Q) is in compute terms. Try both approaches on a small sample and evaluate the timings (using system.time()) to see if it is worth the extra effort of doing the subsetting to track the 0s.
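A minimal sketch of such a timing comparison, assuming a deliberately slow stand-in function. slowFun and maskedApply are hypothetical names used only for illustration; they are not from the question, and the timings you see will depend on your machine and on how expensive your real function is.

```r
## Hypothetical expensive element-wise function (a stand-in, not from the question)
slowFun <- function(x) {
  for (i in 1:20) x <- sin(exp(cos(x)))
  x
}

## Apply f only where a is non-zero; the zeros stay zero
maskedApply <- function(a, b, f) {
  out <- numeric(length(a))
  want <- a != 0
  out[want] <- a[want] * f(b[want])
  out
}

set.seed(1)
n <- 1e5
a <- numeric(n)
idx <- sample(n, n / 100)
a[idx] <- rnorm(n / 100)   # about 1% of the components are non-zero
b <- runif(n)

## Time the two approaches on the same data
system.time(direct <- a * slowFun(b))
system.time(masked <- maskedApply(a, b, slowFun))

## Both approaches should give the same answer
all.equal(direct, masked)
```

If the masked version is not clearly faster on data shaped like yours, the plain vectorised expression is probably the better choice.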
Upvotes: 2
Reputation: 42872
Just replace your expression with:
ifelse(a==0,0,a*exp(b))
I'd be surprised if this improved performance, though: since R is interpreted, the overhead of running the ifelse is probably worse than the cost of the wasted exp invocations.
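Note that ifelse() evaluates its yes and no arguments as whole vectors before picking elements, so the expression above still calls exp() on every component of b. A small sketch to check this, using a hypothetical wrapper countingExp (not part of R) that records how many elements it is handed:

```r
## Hypothetical wrapper that records how many elements exp() is asked to process
nEvaluated <- 0
countingExp <- function(x) {
  nEvaluated <<- nEvaluated + length(x)
  exp(x)
}

a <- c(0, 0, 3, 0, 5)
b <- c(1, 2, 3, 4, 5)

## ifelse(): the whole of a * countingExp(b) is computed, then elements are selected
nEvaluated <- 0
res1 <- ifelse(a == 0, 0, a * countingExp(b))
nIfelse <- nEvaluated

## Subsetting: only the elements where a is non-zero are passed on
nEvaluated <- 0
want <- a != 0
res2 <- numeric(length(a))
res2[want] <- a[want] * countingExp(b[want])
nSubset <- nEvaluated

c(ifelse = nIfelse, subset = nSubset)
all.equal(res1, res2)
```

In this toy case the ifelse version processes all 5 elements while the subsetting version processes only the 2 where a is non-zero, and both return the same result.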
Upvotes: 2
Reputation: 3473
Similar to DWin's suggestion:
> n <- 1e5
> nonzero <- .01
> b <- rnorm(n)
> a <- rep(0, n)
> a[1:(n*nonzero)] <- rnorm(n*nonzero)
>
> system.time(replicate(100, {
+ c <- a*exp(b)
+ }))
   user  system elapsed
   1.19    0.05    1.23
> system.time(replicate(100, {
+ zero <- abs(a) < .Machine$double.eps
+ c <- a
+ c[!zero] <- a[!zero]*exp(b[!zero])
+ }))
   user  system elapsed
   0.42    0.08    0.50
Upvotes: 1
Reputation: 263332
You could accomplish that by indexing both vectors with a test for whatever situation you deem a waste. If the function is more costly than exp, it might make a difference:
a[ a != 0 ] * exp( b[ a != 0 ] )
Also recognize that there are traps in testing numeric values for exact equality. You may want to look at zapsmall and all.equal as alternatives, depending on what the real problem is.
> 3/10 == 0.1*3
[1] FALSE
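A short sketch of those alternatives. The tolerance tol is an assumption chosen for illustration; pick one that suits the scale of your data.

```r
x <- 3/10
y <- 0.1 * 3

x == y                    # exact comparison
isTRUE(all.equal(x, y))   # comparison up to a small tolerance

## Treat values within a tolerance of zero as zero
tol <- sqrt(.Machine$double.eps)   # assumed tolerance, for illustration only
a <- c(0, 1e-20, -2.5, 0, 4)
want <- abs(a) > tol
want

## zapsmall() rounds values that are tiny relative to the largest one to zero
zapsmall(c(1, 1e-20))
```

Using abs() in the test matters when the vector can hold negative values, since a plain "less than tolerance" check would treat every negative entry as zero.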
Upvotes: 0