Reputation: 764

Random number generation from multinomial distribution in R using rmultinom() function

I would like to generate a sample of size 20 from the multinomial distribution with three values such as 1,2 and 3. For example, the sample can be like this sam=(1,2,2,2,2,3,1,1,1,3,3,3,2,1,2,3,...1)

The following code runs, but it does not give the result I expected:

> rmultinom(20,3,c(0.4,0.3,0.3))+1
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20]
[1,]    1    1    3    2    2    1    1    2    3     2     3     2     1     2     2     3     1     2     2     2
[2,]    2    1    2    1    3    2    4    2    1     2     2     1     1     2     1     2     3     2     3     3
[3,]    3    4    1    3    1    3    1    2    2     2     1     3     4     2     3     1     2     2     1     1

I was not expecting this matrix. Any help is appreciated.

Upvotes: 2

Answers (3)

Sergio

Reputation: 201

I would like to generate a sample of size 20 from the multinomial distribution

No problem, but you should remember that each sample is a vector, e.g. if you roll three dice you can get (2,5,1), or (6,2,4), or (3,3,3) etc.
You should also remember that in rmultinom(n, size, prob) "n" is the sample size, and "size" is the total number of objects that are put into K boxes (when you roll three dice, the size is 3 and K=6).

with three values such as 1,2 and 3.

No problem, but you should remember that rmultinom will return the count of each value, i.e. you could think of your three values as row names (your three values could be "red, green, blue", "left, middle, right", etc.)

> rmultinom(n=20, size=3, prob=c(0.4,0.3,0.3))
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20]
[1,]    2    1    1    2    1    1    3    1    1     3     2     1     0     0     0     1     3     2     2     1
[2,]    1    1    1    1    0    1    0    1    2     0     1     2     2     2     1     1     0     0     1     0
[3,]    0    1    1    0    2    1    0    1    0     0     0     0     1     1     2     1     0     1     0     2

In the first sample (first column) "1" occurs 2 times, "2" occurs 1 time, "3" occurs 0 times. In the second and third samples each value occurs 1 time,... in the seventh sample "1" occurs 3 times etc.
Since you are putting three (size=3) objects into K=3 boxes (there are as many boxes as the length of the prob vector), the sum of each column is the number of your objects.
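
The column-sum property can be checked directly. This is only a minimal sketch; the variable name `draws` and the seed are assumptions made for illustration and are not part of the question.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# 20 samples, each distributing size = 3 objects over 3 boxes
draws <- rmultinom(n = 20, size = 3, prob = c(0.4, 0.3, 0.3))

dim(draws)      # 3 rows (one per value), 20 columns (one per sample)
colSums(draws)  # every column sums to size, i.e. 3
```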

For example, the sample can be like this sam=(1,2,2,2,2,3,1,1,1,3,3,3,2,1,2,3,...1)

This does not look like a sample of size 20, because the outcome of a single multinomial trial is a vector, not a number.

Let's return to dice. I roll size=3 dice:

> rmultinom(n=1, size=3, prob=rep(1/6,6))
     [,1]  
[1,]    0
[2,]    2
[3,]    0
[4,]    0
[5,]    1
[6,]    0

I get two "2"s and one "5". This is a sample of size 1. Here is a sample of size 10:

> rmultinom(n=10, size=3, prob=rep(1/6,6))
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    0    0    1    0    0    0    0    1    1     1
[2,]    1    1    0    3    0    0    1    1    1     1
[3,]    2    1    0    0    0    0    2    0    0     0
[4,]    0    0    2    0    1    1    0    1    0     1
[5,]    0    0    0    0    1    2    0    0    1     0
[6,]    0    1    0    0    1    0    0    0    0     0
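
If you want to read a column of counts back as the faces that were rolled, one possible way is to repeat each face by its count. The names `counts` and `faces` below are hypothetical, and the seed is arbitrary.

```r
set.seed(2)  # arbitrary seed, only so the sketch is reproducible

# One roll of three dice, returned as counts per face (a length-6 vector)
counts <- rmultinom(n = 1, size = 3, prob = rep(1/6, 6))[, 1]

# Repeat each face as many times as it was counted;
# e.g. counts (0, 2, 0, 0, 1, 0) would give faces 2 2 5
faces <- rep(1:6, times = counts)
faces
```

Note that the counts do not record the order in which the faces came up, so `faces` is always sorted.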

HTH

Upvotes: 6

Severin Pappadeux

Reputation: 20080

How about

q <- rmultinom(20,2,c(0.4,0.3,0.3))+1

UPDATE

If one still wants to follow the multinomial PMF and have a higher frequency of larger values, there is another variant:

q <- 3 - rmultinom(20,2,c(0.4,0.3,0.3))

Upvotes: -1

Gregor Thomas

Reputation: 145745

Your code does 20 draws of size 3 (each) from a multinomial distribution. This means that you will get a matrix with 20 columns (n = 20) and 3 rows (length of your prob argument = 3), where the sum of each column is also 3 (size = 3). The classic interpretation of a multinomial is that you have size balls to put into K boxes, each with a given probability; the result shows you how many balls end up in each box. Your code adds 1 to everything, so it's as if each box already has 1 ball in it, so the sum of each column will actually be 6.

Your comments and your description of the result you want don't sound like you care about "balls and boxes". It sounds like you want to draw 20 numbers, with replacement, from the set {1, 2, 3}. If this is the case, use sample:
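
The shape and the sums of the matrix from the question can be checked with a short sketch like the following; the name `m` is an assumption made for illustration.

```r
# Same call as in the question, including the "+ 1"
m <- rmultinom(20, 3, c(0.4, 0.3, 0.3)) + 1

dim(m)      # 3 rows (length of prob), 20 columns (n)
colSums(m)  # 3 balls per draw plus 1 extra in each of the 3 boxes, so 6
range(m)    # every entry lies between 1 and 4
```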

sample(1:3, size = 20, replace = TRUE, prob = c(0.4,0.3,0.3))
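
As a rough sketch of how this relates to rmultinom: a multinomial draw with size = 1 puts a single ball into one box, so each column contains exactly one 1, and the row index of that 1 is a value from {1, 2, 3}. The names `p`, `x`, `onehot` and `y` are assumptions for illustration, and the seed is arbitrary.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
p <- c(0.4, 0.3, 0.3)

# Direct approach: 20 values from {1, 2, 3} with the given probabilities
x <- sample(1:3, size = 20, replace = TRUE, prob = p)
table(factor(x, levels = 1:3))  # how often each value was drawn

# Same kind of vector via rmultinom: one ball per draw (size = 1),
# then take the row that holds the single 1 in each column
onehot <- rmultinom(n = 20, size = 1, prob = p)
y <- apply(onehot, 2, which.max)
y
```

Both `x` and `y` are vectors of length 20 with entries in 1:3; sample is simply the more direct way to get there.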

Upvotes: 0
