user528676
user528676

Reputation: 197

Implementing probability distribution function in Java

I'm trying to implement a probability distribution function in java where it returns the ith entry in the array with probability:

Fi = 6i(n-i) / (n3 - n)

where n is the array length i.e. for an array length 4:

P1 = 3/10, P2 = 4/10, P3 = 3/10, P4 = 0

Note that this function assumes numbering from 1 to n rather 0 to n-1 as in Java.

At the moment I'm just using the uniform distribution i.e.

 int i = (int)(Math.random()*((arraySize)-1));

with the -1 so it doesn't choose the last element (i.e. Pn = 0 as in the above formula).

Anyone with any ideas or tips on implementing this?

Upvotes: 6

Views: 7147

Answers (4)

Dunes
Dunes

Reputation: 40683

You could try using a navigable map with the probability distribution. Unlike normal Maps the NaviableMap defines an absolute ordering over its keys. And if the key isn't present in the map it can tell you which is the closest key, or which is the smallest key that is greater than the argument. I've used ceilingEntry which returns the map entry with the smallest key that is greater than or equal to the given key.

If you use a TreeMap as your implementation of NavigableMap then look ups on distributions with many classes will be faster as it performs a binary search rather than starting with the first key and then testing each key in turn.

The other advantage of NaviableMap is that you get the class of data your directly interested in rather than an index to another array or list, which can make code cleaner.

In my example I've used BigDecimals as I'm not particularly fond of using floating point numbers as you can't specify the precision you need. But you could use floats or doubles or whatever.

import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Arrays;
import java.util.NavigableMap;
import java.util.TreeMap;


public class Main {

    public static void main(String[] args) {

        String[] classes = {"A", "B", "C", "D"};
        BigDecimal[] probabilities = createProbabilities(classes.length);
        BigDecimal[] distribution = createDistribution(probabilities);

        System.out.println("probabilities: "+Arrays.toString(probabilities));
        System.out.println("distribution: "+Arrays.toString(distribution)+"\n");

        NavigableMap<BigDecimal, String> map = new TreeMap<BigDecimal, String>();

        for (int i = 0; i < distribution.length; i++) {
            map.put(distribution[i], classes[i]);
        }

        BigDecimal d = new BigDecimal(Math.random());

        System.out.println("probability: "+d);

        System.out.println("result: "+map.ceilingEntry(d).getValue());

    }

    private static BigDecimal[] createDistribution(BigDecimal[] probabilities) {
        BigDecimal[] distribution = new BigDecimal[probabilities.length];

        distribution[0] = probabilities[0];
        for (int i = 1; i < distribution.length; i++) {
            distribution[i] = distribution[i-1].add(probabilities[i]); 
        }
        return distribution;
    }

    private static BigDecimal[] createProbabilities(int n) {
        BigDecimal[] probabilities = new BigDecimal[n];

        for (int i = 0; i < probabilities.length; i++) {
            probabilities[i] = F(i+1, n);
        }
        return probabilities;
    }

    private static BigDecimal F(int i, int n) {
//      6i(n-i) / (n3 - n)
        BigDecimal j = new BigDecimal(i);
        BigDecimal m = new BigDecimal(n);
        BigDecimal six = new BigDecimal(6);

        BigDecimal dividend = m.subtract(j).multiply(j).multiply(six);
        BigDecimal divisor = m.pow(3).subtract(m);

        return dividend.divide(divisor, 64, RoundingMode.HALF_UP);
    }
}

Upvotes: 1

Ricky Bobby
Ricky Bobby

Reputation: 7608

double rand = Math.random(); // generate a random number in [0,1]
F=0;
// you test if rand is in [F(1)+..+F(i):F(1)+..+F(i)+F(i+1)] it is in this rnge with proba P(i) and therefore if it is in this range you return i
for (int i=1,i<array.size();i++ ){
   F+=F(i);
   if rand < F
       return i;
}
return array.size(); // you went through all the array then rand==1 (this probability is null) and you return n

Upvotes: 2

thomson_matt
thomson_matt

Reputation: 7691

To do this, you want to divide the range [0, 1] into regions that have the required size. So in this case:

0 -> 0.0 - 0.3
1 -> 0.3 - 0.7
2 -> 0.7 - 1.0
3 -> 1.0 - 1.0

Then generate a random number with Math.random(), and see which interval it falls into.

In general, you want to do something like the following pseudocode:

double r = Math.random();
int i = -1;

while (r >= 0)
{
  i++;
  r -= F(i);
}

// i is now the value you want.

You generate a value on [0, 1], then subtract the size of each interval until you go below 0, at which point you've found your random value.

Upvotes: 1

YXD
YXD

Reputation: 32511

This is essentially what thomson_matt says, but a little more formally: You should perform discrete inverse transform sampling. Pseudocode for your example:

p = [0.3, 0.4, 0.3. 0.0]
c = [0.3, 0.7, 1.0, 1.0] // cumulative sum

generate x uniformly in continuous range [0,1]
find max i such that c[i] < x.

Upvotes: 2

Related Questions