Shakil
Shakil

Reputation: 85

Speed up R performance or convert R function to C++ function

I have an R function to compute the probability of the integer k; k = 1, ....., m from the given values of the other parameters. When the size of the k is very large (e.g., m = 10,000), the function is very slow. Do you have any suggestions to improve the performance of the function?

I also want to create an equivalent function in C++ if needed so that I can use RCPP package from R, but I do not know C++. Before learning C++ from the scratch, I also would like to have your suggestions.

prob <- function(k, et, ey, nrep = 10000, m0, m1)
{
    m = m0 + m1
    t <- rnorm(nrep, et, 1)
    p0 <- pnorm(-t)
    p1 <- pnorm(ey - t)

    mean0 <- (m0 - 1)*p0 + m1*p1 + 1
    mean1 <- m0*p0 + (m1 - 1)*p1 + 1

    var0 <- (m0 - 1)*p0*(1 - p0) + m1*p1*(1 - p1)
    var1 <- m0*p0*(1 - p0) + (m1 - 1)*p1*(1 - p1)

    prob <- ifelse(et == 0, mean(dnorm(k, mean0, sqrt(var0))),
                   mean(dnorm(k, mean1, sqrt(var1))))

    return(prob)
}

apply the function

prob_k <- sapply(1:10000, prob, et=1, ey=1 ,m0 = 5000, m1 = 5000)

Upvotes: 0

Views: 148

Answers (1)

Gregor de Cillia
Gregor de Cillia

Reputation: 7685

Since dnorm is a vectorized function, you can just call prob like

prob_k <- prob(1:10000, et = 1, ey = 1 ,m0 = 5000, m1 = 5000)

To get verctorized outputs, you will have to change the code of the function a bit though

prob <- function(k, et, ey, nrep = 10000, m0, m1)
{
  m = m0 + m1
  t <- rnorm(nrep, et, 1)
  p0 <- pnorm(-t)
  p1 <- pnorm(ey - t)

  mean0 <- (m0 - 1)*p0 + m1*p1 + 1
  mean1 <- m0*p0 + (m1 - 1)*p1 + 1

  var0 <- (m0 - 1)*p0*(1 - p0) + m1*p1*(1 - p1)
  var1 <- m0*p0*(1 - p0) + (m1 - 1)*p1*(1 - p1)

  if( et == 0 )
    dnorm(k, mean0, sqrt(var0))
  else
    dnorm(k, mean1, sqrt(var1))
}

This is pretty fast on my machine (5 milliseconds on average)

microbenchmark(prob_k <- prob(1:10000, et = 1, ey = 1 ,m0 = 5000, m1 = 5000))
# Unit: milliseconds
#                                                           expr     min
#  prob_k <- prob(1:10000, et = 1, ey = 1, m0 = 5000, m1 = 5000) 4.68232
#        lq     mean   median       uq      max neval
#  4.776912 5.168405 4.817979 4.907612 7.023989   100

Upvotes: 1

Related Questions