bill_080
bill_080

Reputation: 4760

Modified cumsum function

Below is a simplified version of a segment of code that I'm working on (a lot of additional calculations are left out to avoid confusion). It's just a modified form of the cumsum function. I don't want to re-invent the wheel, so does this function already exist? If not, what scheme would provide the best speed?

#Set up the data   
set.seed(1)   
junk <- rnorm(1000000)   
junk1 <- rnorm(1000000)   
cumval <- numeric(1000000)   

#Initialize the accumulator   
cumval[1] <- 1   

#Perform the modified cumsum
system.time({   
for (i in 2:1000000) cumval[i] <- junk[i] + (junk1[i] * cumval[i-1])       
})   

#Plot the result
plot(cumval, type="l")    

Upvotes: 1

Views: 234

Answers (3)

Tommy
Tommy

Reputation: 40821

This algorithm is something that fits the compiler package perfectly!

#Set up the data   
set.seed(1)   
junk <- rnorm(1000000)   
junk1 <- rnorm(1000000)

# The original code
f <- function(junk, junk1) {
  cumval <- numeric(1000000)
  cumval[1] <- 1
  for (i in 2:1000000) cumval[i] <- junk[i] + (junk1[i] * cumval[i-1])
  cumval
}
system.time( f(junk, junk1) ) # 4.11 secs

# Now try compiling it...
library(compiler)
g <- cmpfun(f)
system.time( g(junk, junk1) ) # 0.98 secs

...so it would be interesting to know if this algorithm is in any way "typical" - in that case the compiler could perhaps be even more optimized for situations like this...

Upvotes: 1

IRTFM
IRTFM

Reputation: 263362

Consider cumval[5]. Using j[] for junk and jk[] for junk1 and omitting * symbols, its expansion would be:

j[5] +jk[5]j[4] + jk[5]jk[4]j[3] + jk[5]jk[4]jk[3]j[2] + jk[5]jk[4]jk[3]jk[2]

The pattern suggests this might be (close to ?) an expression for the 5th term:

    sum(  j[1:5] * c(1, Reduce("*" , rev(jk[2:5]), accumulate=TRUE) )

Upvotes: 1

Berend Hasselman
Berend Hasselman

Reputation: 11

It is faster but doesn't give correct results. Run this

set.seed(1)

N <- 10

junk  <- rnorm(N)

junk1 <- rnorm(N)

cumval <- numeric(N)
cumval.1 <- numeric(N)
cumval[1] <- 1

for( i in 2:N ) cumval[i] <- junk[i] + junk1[i]*cumval[i-1]
cumval

cumval.1 <- cumsum( junk[-1] + (junk1[-1] * cumval.1[-N]) ) 

cumval.1

and you'll see that cumval and cumval.1 are not even the same length.

One needs to rewrite the recurrence relation. I don't see a way to convert the recurrence to a non recurrence formula.

Upvotes: 1

Related Questions