Reputation: 764
I would like to generate a sample of size 20 from the multinomial distribution with three values, such as 1, 2 and 3. For example, the sample could look like this: sam=(1,2,2,2,2,3,1,1,1,3,3,3,2,1,2,3,...1)
The following code runs, but it does not give the result I expected:
> rmultinom(20,3,c(0.4,0.3,0.3))+1
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20]
[1,] 1 1 3 2 2 1 1 2 3 2 3 2 1 2 2 3 1 2 2 2
[2,] 2 1 2 1 3 2 4 2 1 2 2 1 1 2 1 2 3 2 3 3
[3,] 3 4 1 3 1 3 1 2 2 2 1 3 4 2 3 1 2 2 1 1
This matrix is not what I expected. Any help is appreciated.
Upvotes: 2
Views: 12291
Reputation: 201
I would like to generate a sample of size 20 from the multinomial distribution
No problem, but you should remember that each sample is a vector, e.g. if you roll three dice you can get (2,5,1), or (6,2,4), or (3,3,3) etc.
You should also remember that in rmultinom(n, size, prob), "n" is the sample size, and "size" is the total number of objects that are put into K boxes (when you roll three dice, the size is 3 and K=6).
with three values such as 1,2 and 3.
No problem, but you should remember that rmultinom will return the count of each value, i.e. you could think of your three values as row names (your three values could be "red, green, blue", "left, middle, right", etc.).
> rmultinom(n=20, size=3, prob=c(0.4,0.3,0.3))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20]
[1,] 2 1 1 2 1 1 3 1 1 3 2 1 0 0 0 1 3 2 2 1
[2,] 1 1 1 1 0 1 0 1 2 0 1 2 2 2 1 1 0 0 1 0
[3,] 0 1 1 0 2 1 0 1 0 0 0 0 1 1 2 1 0 1 0 2
In the first sample (first column) "1" occurs 2 times, "2" occurs 1 time, "3" occurs 0 times. In the second and third samples each value occurs 1 time,... in the seventh sample "1" occurs 3 times etc.
Since you are putting three (size=3) objects into K=3 boxes (there are as many boxes as the length of the prob vector), the sum of each column is the number of your objects.
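The statement about the totals per draw can be checked directly. This is only a minimal sketch: the seed is arbitrary and the name m is just illustrative.

```r
set.seed(1)  # arbitrary seed, only so the run is reproducible
m <- rmultinom(n = 20, size = 3, prob = c(0.4, 0.3, 0.3))
dim(m)       # 3 rows (one per value), 20 columns (one per draw)
colSums(m)   # every column adds up to size = 3
```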
For example, the sample can be like this
sam=(1,2,2,2,2,3,1,1,1,3,3,3,2,1,2,3,...1)
This does not look like a sample of size 20, because the outcome of a single multinomial trial is a vector, not a number.
Let's return to dice. I roll size=3 dice:
> rmultinom(n=1, size=3, prob=rep(1/6,6))
[,1]
[1,] 0
[2,] 2
[3,] 0
[4,] 0
[5,] 1
[6,] 0
I get two "2"s and one "5". This is a sample of size 1. Here is a sample of size 10:
> rmultinom(n=10, size=3, prob=rep(1/6,6))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 0 1 0 0 0 0 1 1 1
[2,] 1 1 0 3 0 0 1 1 1 1
[3,] 2 1 0 0 0 0 2 0 0 0
[4,] 0 0 2 0 1 1 0 1 0 1
[5,] 0 0 0 0 1 2 0 0 1 0
[6,] 0 1 0 0 1 0 0 0 0 0
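If you would rather see the individual faces than the counts, one possible way (a sketch, not the only approach) is to expand a column of counts with rep. The names counts and faces are illustrative, and the seed is arbitrary.

```r
set.seed(42)  # arbitrary seed
# One roll of three dice, returned as counts per face (a vector of length 6)
counts <- rmultinom(n = 1, size = 3, prob = rep(1/6, 6))[, 1]
# Repeat each face as many times as it was counted,
# e.g. counts 0,2,0,0,1,0 would expand to the faces 2 2 5
faces <- rep(1:6, times = counts)
faces
```

Note that the order of the three dice is lost: the counts only say how often each face occurred.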
HTH
Upvotes: 6
Reputation: 20080
How about
q <- rmultinom(20,2,c(0.4,0.3,0.3))+1
UPDATE
If one still wants to follow the multinomial PMF and have a higher frequency of larger values, there is another variant:
q <- 3 - rmultinom(20,2,c(0.4,0.3,0.3))
Upvotes: -1
Reputation: 145745
Your code does 20 draws of size 3 (each) from a multinomial distribution. This means that you will get a matrix with 20 columns (n = 20) and 3 rows (length of your prob argument = 3), where the sum of each column is also 3 (size = 3). The classic interpretation of a multinomial is that you have size balls to put into K boxes, each box with a given probability; the result shows you how many balls end up in each box. Your code adds 1 to everything, so it's as if each box already has 1 ball in it, so the sum of each column will actually be 6.
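The effect of the + 1 on the total per draw (per column of the matrix) can be seen in a short check. This is a sketch with an arbitrary seed; m is an illustrative name.

```r
set.seed(1)  # arbitrary seed
m <- rmultinom(20, 3, c(0.4, 0.3, 0.3))
colSums(m)      # 3 for every draw
colSums(m + 1)  # 6 for every draw: 3 from the draw plus 1 for each of the 3 rows
```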
Your comments and your description of the result you want don't sound like you care about "balls and boxes". It sounds like you want to draw 20 numbers, with replacement, from the set {1, 2, 3}. If this is the case, use sample:
sample(1:3, size = 20, replace = TRUE, prob = c(0.4,0.3,0.3))
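To connect this back to rmultinom: a draw with size = 1 puts a single ball into one of the boxes, so the row holding the 1 is the value drawn. The sketch below shows both routes; the names p, sam, ind and sam2 are illustrative and the seed is arbitrary.

```r
set.seed(123)  # arbitrary seed
p <- c(0.4, 0.3, 0.3)

# Direct route: 20 values from {1, 2, 3} with the given probabilities
sam <- sample(1:3, size = 20, replace = TRUE, prob = p)
sam
table(factor(sam, levels = 1:3))  # how often each value was drawn

# Route via rmultinom: 20 draws of size 1, then pick the row that holds the 1
ind <- rmultinom(n = 20, size = 1, prob = p)
sam2 <- apply(ind, 2, which.max)
sam2
```

Both sam and sam2 are vectors of length 20 with entries in 1, 2, 3. They will not be identical, since they are separate random draws.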
Upvotes: 0