Reputation: 1496
Let's suppose I have a structured array like that:
[
'A' => 2,
'B' => 0,
'C' => 0,
'D' => 1,
'E' => 1,
'F' => 0
]
I'll refer to this structure as "categories", so, I have six categories in this array. And my goal is to pick a random product based in a category.
I want to do a rate-based category selection, and as I know, I have to calculate how many percent this category represents in array, for example:
<?php
// ...
$total = array_sum($a);
array_map(function ($hits) use ($total) {
return $hits / $total;
}, ...);
This will give me something like that:
(
[A] => 0.5 (50%)
[B] => 0
[C] => 0
[D] => 0.25 (25%)
[E] => 0.25 (25%)
[F] => 0
)
Okay, now I have to do a simple algorithm to get the category based on those rates; I think I need now to pick a random number between range (0, 1)
, and make some "slices", for example:
0 .. 0.50 => A
0.50 .. 0.25 => D
0.75 .. 1 => E
And if the random number is between 0
and 0.50
, I will pick category A
, if between 0.50
and 0.75
then D
, if between 0.75
and 1
then E
, of course, is what I'm doing right now.
If I go this way, I'm totally saying mathematically and logically that I'll never get B
, C
nor F
, because there is no hits on those categories (no slices then.)
How I can avoid this? I have to give some chances to those categories, but minimal (which means is not impossible).
Upvotes: 0
Views: 132
Reputation: 7195
You can use a distribution array which will have each category repeated hits
times. Then you can simply get a random element from that array.
Somehow like this:
$distr = array();
array_walk($a, function ($hits, $cate) use ($distr) {
$distr = array_merge($distr, array_fill(0, $hits, $cate));
});
$index = mt_rand(0, count($distr) - 1);
$random_cate = $distr[$index];
Upvotes: 2
Reputation: 79521
What you have is a random variable X that will take one of the values in S = {A, B, C, D, E, F}.
P(X = A) = 1/2
P(X = B) = 0
etc.
Define a new uniform random variable Y for which P(Y = A) = P(Y = B) = ... = P(Y = F) = 1 / |S| and a random variable Z = X if T = 0 and Z = Y if T = 1, where T is a Bernoulli random variable with P(T = 1) = t and P(T = 0) = 1 - t.
Then for all s in S,
P(Z = s) = P(Z = s | T = 0) P(T = 0) + P(Z = s | T = 1) P(T = 1) = (1 - t) P(X = s) + t / |S|
Using this model, all you need to choose is the parameter t in [0,1], where t indicates the probability of a uniform random selection out of S. t = 0 is your current model, where B, C, and F will never occur.
Upvotes: 1