Reputation: 45
I have an array with 3 elements and I want to pick one and add that into another array base on possibility.
For example, num 1 has 5% chance to be picked, num 2 has 60% chance to be picked and num 3 has 35% chance to be picked.
arr = [{:num=>1, :diff=>-29}, {:num=>2, :diff=>5}, {:num=>3, :diff=>25}]
I found below methods from stackoverflow, just wondering if this would work? Or there is another way to do it?
def get_num(arr)
case rand(100) + 1
when 1..5
p arr[0]
when 6..65
p arr[1]
when 66..100
p arr[2]
end
end
get_num(arr)
Thanks!
Upvotes: 2
Views: 516
Reputation: 110675
Your code is fine but here are two other approaches.
Use a cumulative distribution function ("CDF")
CDF = [[0.05,0], [0.05+0.60,1], [0.5+0.60+0.35,2]]
#=> [[0.05,0], [0.65,1], [1.0,2]]
def get_num(arr)
n = rand
arr[CDF.find { |mx,_idx| n <= mx }.last]
end
arr = [{:num=>1, :diff=>-29}, {:num=>2, :diff=>5}, {:num=>3, :diff=>25}]
get_num(arr)
#=> {:num=>2, :diff=>5}
get_num(arr)
#=> {:num=>2, :diff=>5}
get_num(arr)
#=> {:num=>3, :diff=>25}
get_num(arr)
#=> {:num=>1, :diff=>-29}
get_num(arr)
#=> {:num=>2, :diff=>5}
Suppose:
n = rand
#=> 0.5385005480168696
then
a = CDF.find { |mx,_idx| n <= mx }
#=> [0.65,1]
i = a.last
#=> 1
arr[i]
#=> {:num=>2, :diff=>5}
Note that I've followed the convention of beginning the name of find
's second block variable (_idx
) with an underscore to signal to the reader that that block variable is not used in the block calculation. Often just an underscore
(_
) is used.
Now consider the fraction of times each element of arr
will be randomly-drawn if n
draws are made:
def outcome_fractions(arr, n)
n.times
.with_object(Hash.new(0)) { |_,h| h[get_num(arr)] += 1 }
.transform_values { |v| v.fdiv(n) }
end
Randomly select from an array of indices
outcome_fractions(arr, 1_000)
#=> {{:num=>2, :diff=>5} =>0.612,
# {:num=>3, :diff=>25} =>0.328,
# {:num=>1, :diff=>-29}=>0.06}
outcome_fractions(arr, 100_000)
#=> {{:num=>3, :diff=>25} =>0.34818,
# {:num=>1, :diff=>-29}=>0.04958,
# {:num=>2, :diff=>5} =>0.60224}
Notice that the fraction of each hash that is randomly drawn approaches its specified population probability as the sample size is increased (though the "pseudo-random" draws are not truly random).
Do not be concerned with how outcome_fractions
works.
Here is another way that is more efficient (because it does not use find
, which performs a linear search) but uses more memory.
CHOICE = [*[0]*5, *[1]*60, *[2]*35]
#=> [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
# 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
# 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
# 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
# 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
# 2, 2, 2, 2, 2]
def get_num(arr)
arr[CHOICE[rand(100)]]
end
#=> {{:num=>2, :diff=>5} =>0.60029,
# {:num=>3, :diff=>25}=>0.35022,
# {:num=>1, :diff=>-29}=>0.04949}
Note that:
[*[0]*5, *[1]*60, *[2]*35]
produces the same array as
[[0]*5, [1]*60, [2]*35].flatten
The first *
in *[0]*5
is the splat operator; the second is the method Array#*. [0]*5 #=> [0,0,0,0,0]
is evaluated first.
CHOICE
has 100 elements. If the three probabilities were, say, 0.048
, 0.604
and 0.348
, CHOICE
would have 10**3 #=> 1_000
elements (48
zeros, 604
ones and 348
twos).
Upvotes: 6
Reputation: 114138
Here's a small variation / addition to Cary's great answer.
Instead of calculating the cumulative sums yourself, you can let Ruby build it for you out of the initial probabilities:
probs = [5, 60, 35]
sum = 0
sums = probs.map { |x| sum += x }
#=> [5, 65, 100]
we can now calculate a random number between 0 and the total sum and find the corresponding index:
r = rand(sum) #=> 37
sums.find_index { |i| r < i } #=> 1
Note that the initial probabilities don't have to sum to 100. instead of [5, 60, 35]
you could also use:
probs = [1, 12, 7]
You can wrap the above code into a method:
def random_index(*probs)
sum = 0
sums = probs.map { |x| sum += x }
r = rand(sum)
sums.find_index { |i| r < i }
end
random_index(5, 60, 35) #=> 1
random_index(5, 60, 35) #=> 1
random_index(5, 60, 35) #=> 2
You could also make the method return a proc / lambda that can be reused:
def random_index_proc(*probs)
sum = 0
sums = probs.map { |x| sum += x }
-> {
r = rand(sum)
sums.find_index { |i| r < i }
}
end
prc = random_index_proc(5, 60, 35)
prc.call #=> 1
prc.call #=> 1
prc.call #=> 0
Last not least, you can also pre-populate an array this way: (using Cary's naming convention)
CHOICE = [5, 60, 35].flat_map.with_index { |v, i| [i] * v }
and get a random element via:
def get_num(arr)
arr[CHOICE.sample]
end
To keep the array small, you should prefer [1, 12, 7]
(20 elements) over [5, 60, 35]
(100 elements). With a little help from gcd
you don't even have to calculate it yourself:
probs = [5, 60, 35]
gcd = probs.reduce { |a, b| a.gcd(b) }
#=> 5
probs.map { |i| i / gcd }
#=> [1, 12, 7]
Upvotes: 4