Eastern Monk
Eastern Monk

Reputation: 6645

Returning a random value from array with probability proportional to it's value

I have an array like

$keywords = array('apple'=>10,'orange'=>2,'grape'=>12); 

I want to randomly pick one of the "Key" from the array. However the probability distribution should be such that probability of picking an element should be proportional to it's value.

Upvotes: 8

Views: 6156

Answers (3)

Peter Briers
Peter Briers

Reputation: 192

I'd do it like this:

    $probabilities = array('apple'=>50, 'orange'=>20, 'banana'=>10);

    function random_probability($probabilities) {
      $rand = rand(0, array_sum($probabilities));
      do {
        $sum = array_sum($probabilities);
        if($rand <= $sum && $rand >= $sum - end($probabilities)) {
          return key($probabilities);
        }
      } while(array_pop($probabilities));
    }

Upvotes: 2

Oliver Coleman
Oliver Coleman

Reputation: 794

An O(log(n)) approach (this is ripped directly from an answer to a very similar question):

The usual technique is to transform the array into an array of cumulative sums:

 [10 60 5 25]  --> [10 70 75 100]

Pick a random number in the range from zero up to the cumulative total (in the example: 0 <= x < 100). Then, use bisection on the cumulative array to locate the index into the original array:

Random variable x      Index in the Cumulative Array      Value in Original Array
-----------------      -----------------------------      ----------------------
 0 <= x < 10                      0                            10
10 <= x < 70                      1                            60
70 <= x < 75                      2                             5
75 <= x < 100                     3                            25 

For example, if the random variable x is 4, bisecting the cumulative array gives a position index of 0 which corresponds to 10 in the original array.

And, if the random variable x is 72, bisecting the cumulative array gives a position index of 2 which corresponds to 5 in the original array.

Upvotes: 1

Kerrek SB
Kerrek SB

Reputation: 477228

Add all values (10+2+12 is 24); get a random number in the range [0, 24), and pick the corresponding element depending on whether the number lies in [0, 10), [10, 12), or [12, 24).

Upvotes: 16

Related Questions