Reputation: 2040
I have a set of integers each of which has a probability assigned, derived from earlier experiments, e.g.:
0 = 0.5
1 = 0.2
2 = 0.3
Complying with the specifications of a probability distribution, these weights sum up to 1.0. I am now looking for an efficient way to sample one of the values while taking the given probabilities into account, e.g. (pseude-code):
Distribution distribution = new DiscreteDistribution(new double[]{0.5, 0.3, 0.2});
distribution.sample();
This should result in 0 half of the time according to the given numbers. However, do not assume any patterns or regularities among these.
I've been using Apache Commons Math for my previous experiments, but it does not seem to provide a solution for this scenario, neither does Colt.
I wonder whether this is because I've missed an easy solution. A naive implemententation seems more or less straight-forward, but doing this efficiently is rather involved. That is why I am looking for an established implementation.
Upvotes: 8
Views: 5068
Reputation: 3194
Here is perhaps a more dynamic approach supporting any probability distribution specified as an array of doubles:
public static int getRandomOutcome(double[] probaDist) {
List<Double> sortedProbaDist = new ArrayList<>(probaDist.length);
for (double d : probaDist) { sortedProbaDist.add(d); }
Collections.sort(sortedProbaDist);
double randomNumber = Math.random();
double acc = 0;
for (int i=0; i<sortedProbaDist.size(); i++) {
acc += sortedProbaDist.get(i);
if (randomNumber < acc) {
return i;
}
}
return probaDist.length;
}
Note that the method does not check whether the probabilities sum up to (near) 1.
Upvotes: 0
Reputation: 1680
A very simple generic solution would be:
class Distribution<T>{
List<Double> probs = new ArrayList<>();
List<T> events = new ArrayList<>();
double sumProb;
Random rand = new Random();
Distribution(Map<T,Double> probs){
for(T event : probs.keySet()){
sumProb += probs.get(event);
events.add(event);
this.probs.add(probs.get(event));
}
}
public T sample(){
T value;
double prob = rand.nextDouble()*sumProb;
int i;
for(i=0; prob>0; i++){
prob-= probs.get(i);
}
return events.get(i-1);
}
}
Feel free to change it, as you need it, e.g. with adding other constructors. Of course here is a lot of stuff to improve, starting with the efficiency, but it is something you can reuse later a lot.
Upvotes: 4
Reputation: 533492
Calling Random.nextDouble()
is a fairly expensive operation. You are better off using Random.nextInt(n)
in this case
int num = rand.nextInt(10);
return num <= 5 ? 0 : num <= 8 ? 1 : 2;
Upvotes: 4
Reputation: 234655
Given the simplicity of the quantile function and the triviality of a manual implementation, I don't see any harm in writing this out explicitly.
Once you've drawn your random number r
in [0, 1), use
if (r <= 0.5/*micro-optimisation: most likely case first*/){
return 0;
} else if (r <= 0.8/*then the next most likely case*/){
return 2;
} else {
return 1;
}
Perhaps things get a little more fancy for more than 3 numbers, consider building up a table to represent the quantile function in such cases, at the expense of some degradation in performance.
(It would be difficult to beat my solution in terms of speed, in the worst case you have a couple of branches - and you're helping a branch predictor in the nicest way you possibly can, and the random number drawing will be where the performance bottleneck is).
Upvotes: 5