Joker312
Joker312

Reputation: 59

Generating randomly the probabilities that sum to 1

I am trying to generate three probabilities that will necessarily sum to 1. I tried the following:

x <- sample(seq(0.05,0.95,0.05), size=3)
x <- boom/sum(boom)

However, I want the generated numbers to be fractions to two decimal places, i.e, numbers from 0.05 to 0.95 in steps of 0.05 (and so that they also sum to 1). The problem with the numbers generated by the code above is that they can be with many or infinite decimal places. How can I achieve what I need? Has anyone done anything similar?

Upvotes: 1

Views: 746

Answers (3)

Mohamed Desouky
Mohamed Desouky

Reputation: 4425

Try this

fn <- function(){
    s <- seq(0.05,0.90,0.05)
    sm <- sample(s , size = 2)
    while(sm[1] + sm[2] >= 1) sm <- sample(s , size = 2)
    ans <- c(sm , 1 - sum(sm))
    ans
}

by calling fn() you get the required sample

fn()

#> 0.20 0.65 0.15

fn()

#> 0.40 0.55 0.05

Upvotes: 0

Lars Lau Raket
Lars Lau Raket

Reputation: 1974

You can make many distributions over the discrete simplex you describe. Here is an alternative and generalizable way of generating a uniform distribution across all triples (p1, p2, p3) in your simplex:

p <- seq(0.05, 0.95, by = 0.05)
states <- expand.grid(p, p, p) # Generate all combinations
states <- states[rowSums(round(states, 2)) == 1, ] # Select simplex where elements sum to 1

states[sample(1:nrow(states), 1), ] # Sample a random element

This should work well for triples, but if you want to simulate from higher order simplexes (e.g. 7 probabilities or more) one should probably construct the data frame of states in a more efficient way.

Upvotes: 1

Allan Cameron
Allan Cameron

Reputation: 173793

You could do

diff(c(0, sort(sample(seq(0.05, 0.95, 0.05), 2)), 1))
#> [1] 0.05 0.75 0.2

This works by choosing 2 random double-digit numbers between 0.05 and 0.95 and sorting them. The first number is the first sample, the second is the distance between the two numbers, and the third is the distance between the second number and 1, so they necessarily add to 1.

Note that if all the numbers have to add up to 1, and none can be lower than 0.05, this means none of the numbers can be 0.95.

For example, if we want ten such samples, we can do:

t(replicate(10, diff(c(0, sort(sample(seq(0.05, 0.95, 0.05), 2)), 1))))
#>       [,1] [,2] [,3]
#>  [1,] 0.25 0.15 0.60
#>  [2,] 0.25 0.25 0.50
#>  [3,] 0.30 0.05 0.65
#>  [4,] 0.25 0.50 0.25
#>  [5,] 0.50 0.05 0.45
#>  [6,] 0.45 0.20 0.35
#>  [7,] 0.10 0.85 0.05
#>  [8,] 0.45 0.50 0.05
#>  [9,] 0.15 0.40 0.45
#> [10,] 0.40 0.30 0.30

Created on 2022-06-17 by the reprex package (v2.0.1)

Upvotes: 5

Related Questions