merlin2011

Reputation: 75545

Is there a concise (built-in) way to sample an index from an array by treating its values as probabilities?

Suppose I have a vector of probabilities that sum to 1, such as foo = c(0.2,0.5,0.3).

I would like to sample an index from this vector by treating the values as probabilities. In particular, I'd like to sample 1 with probability 0.2, 2 with probability 0.5, and 3 with probability 0.3.

Here is one implementation, similar to what I would write in C:

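sample_index <- function(probs) {
    r <- runif(1)
    cumulative <- 0
    for (i in seq_along(probs)) {
        cumulative <- cumulative + probs[i]
        if (r < cumulative) return(i)
    }
    # Guard against floating-point round-off leaving the total just below 1
    length(probs)
}
foo <- c(0.2, 0.5, 0.3)
print(sample_index(foo))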

Is there a more direct / built-in / canonical way to do this in R?

Upvotes: 0

Views: 50

Answers (1)

Gregor Thomas

Reputation: 145755

It always makes me smile and think R is doing a good job when people are looking for a function and repeatedly use its name in their question.

foo <- c(0.2, 0.5, 0.3)
sample(x = 1:3, size = 1, prob = foo)

Depending on your use case, you could make it a little more general:

sample(x = seq_along(foo), size = 1, prob = foo)
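
If you need many draws at once, or just want to sanity-check that the probabilities are respected, something like the following sketch should work. The seed and the number of draws are arbitrary choices for illustration, not anything from the question.

```r
foo <- c(0.2, 0.5, 0.3)
set.seed(42)  # arbitrary seed, only so the run is reproducible

# replace = TRUE is needed whenever size exceeds length(foo)
draws <- sample(x = seq_along(foo), size = 1e5, replace = TRUE, prob = foo)

# Share of draws landing on each index; should be close to foo
freqs <- tabulate(draws, nbins = length(foo)) / length(draws)
print(round(freqs, 2))
```

With this many draws the empirical shares should sit within a percentage point or so of 0.2, 0.5, and 0.3.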

But do be careful: sample has sometimes convenient but very often unexpected behavior if its x argument is of length 1. When x is a single number n >= 1, sample draws from 1:n rather than from x itself. If you're wrapping this up in a function, check the input length:

if (length(foo) == 1) 1L else sample(x = seq_along(foo), size = 1, prob = foo)
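
To make the length-1 behavior concrete, here is a small sketch. The wrapper name sample_index and its size argument are hypothetical; it uses sample.int, which takes the number of items directly and so does not depend on how sample interprets x.

```r
# A single number n >= 1 passed as x is treated as 1:n
set.seed(1)  # arbitrary seed
x <- sample(x = 10, size = 5)  # five distinct values from 1:10, not copies of 10

# Hypothetical wrapper: returns index/indices drawn with the given weights
sample_index <- function(probs, size = 1) {
    stopifnot(is.numeric(probs), length(probs) >= 1,
              all(probs >= 0), sum(probs) > 0)
    sample.int(length(probs), size = size, replace = TRUE, prob = probs)
}

one_draw <- sample_index(c(0.2, 0.5, 0.3))  # one of 1, 2, 3
single   <- sample_index(1)                 # only one index to choose from
```

Because sample.int works from length(probs), a one-element probability vector simply yields index 1 without a special case.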

Upvotes: 5
