Miron Alexandra
Miron Alexandra

Reputation: 31

Is there a way to make the following algo more efficient?

Given the following algorithm that returns a value from a predefined enumeration if a random input value u is less than some predefined probability in a (cumulative) prob vector

val <- 1:5
prob <- c(1/3,1/30,2/15,7/30,4/15)
u <- runif(1)
if (u<prob[1]) 
  { 
    x=val[1]
  } else if(u<prob[1]+prob[2])
  {
    x=val[2]
  } else if(u<prob[1]+prob[2]+prob[3])
  {
    x=val[3]
  } else if (u<prob[1]+prob[2]+prob[3]+prob[4])
  {
    x=val[4]
  } else
      x=val[5]
  }

Is there a way to make all this more efficient? I can't figure it out how to do it differently.

Upvotes: 3

Views: 78

Answers (3)

Ben Bolker
Ben Bolker

Reputation: 226172

I don't know whether this is specifically a question about an algorithm or whether you want to solve the problem faster.

sample(val, size = 1, prob = prob) would certainly be faster than your solution, but I don't know whether the internal algorithm is fundamentally any better (e.g. in terms of operation count) than yours.

Looking at the C source code you can see that the algorithm is like yours, except that it computes the cumulative probabilities up-front to avoid repeated addition (and permutes the elements into descending order — I think that's for numerical stability rather than efficiency).

(this is for sampling with replacement; the without-replacement code looks very similar, except that the machinery for removing previously sampled values makes it a little more complicated)

Upvotes: 6

MrFlick
MrFlick

Reputation: 206197

An alternative would be to use findInterval. For example

val <- 1:5
prob <- c(1/3,1/30,2/15,7/30,4/15)
u <- runif(1)
val[findInterval(u, cumsum(c(0, prob)))]

This will also work for any number of values

u <- runif(1000)
val[findInterval(u, cumsum(c(0, prob)))]

Upvotes: 7

Ma&#235;l
Ma&#235;l

Reputation: 51974

You can use cumsum and which:

set.seed(3)
u <- runif(1)
#[1] 0.702374

val[which(u < cumsum(prob))[1]]
#[1] 4

Upvotes: 3

Related Questions