SC2
SC2

Reputation: 323

Simulate binomial distribution by column

I would like to simulate a binomial distribution of numbers 0 and 1 for each column of a matrix. The probability of success changes for each column. The probability of success for column 1 is 1/ncol(matrix). The probability of success for column 2 is 2/ncol(matrix).

I have tried a few different ways to generate the data I am looking for, but I either get warnings or a list of numbers for the 1st column and not for the whole matrix.

Even better would be having the ability to use a matrix of probabilities instead of something like 1/ncol(matrix).

Let's say I want a matrix of 200 rows by 1000 columns. For each column, I want to generate a binomial distribution of 0's and 1's using a probability defined by an element within a probability matrix. The probability of getting a 1 in column 1 defined by the first element of the probability matrix and is .001. The probability of getting a 1 in column 2 is defined by the second element of the probability matrix and is .002, etc. I'm not sure whether I should be doing rbinom() within a matrix() or within apply() or something else.

    datasim = matrix(0, nrow=200, ncol=1000)
    colnames(datasim) = c(1:1000)
    prob.matrix = as.matrix(c(1:1000))
    prob.matrix = prob.matrix/1000
    newdatasim = apply(???, 2, function(x) { rbinom(??, 1, prob.matrix)})

Please help, thank you!

Upvotes: 0

Views: 3775

Answers (2)

SC2
SC2

Reputation: 323

EDIT: Here is one of the solutions that I found. Frank's solution also works. Thank you!

#Generate matrix with 200 rows x 1000 columns
#Each column has a binomial distribution of 0's and 1's
#The probability is determined by the column number divided by the total number of columns
datasim = matrix(0, nrow=200, ncol=100)
datasim[1:10,1:10]
probmatrix = col(datasim)/1000 
probmatrix[1:10,1:10]
datasim2 = matrix(rbinom(200 * 1000,1,probmatrix), nrow=200, ncol=1000, byrow=FALSE)
datasim2[1:10,1:10]

Upvotes: 1

Frank
Frank

Reputation: 66819

It's simpler to use your prob.matrix as a vector than as a matrix. Here's a smaller example:

sapply(1:10/10,function(p) rbinom(20,1,p))

If you want to pre-allocate a matrix...

set.seed(42)
datasim     <- matrix(,20,10)
datasim[,]  <- sapply(1:10/10,function(p) rbinom(20,1,p))

which gives

datasim
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    1    1    0    1    1    0    1    1    0     1
 [2,]    1    0    0    1    0    1    1    1    1     1
 [3,]    0    1    0    1    0    1    1    1    1     1
 [4,]    0    1    1    0    1    1    1    1    1     1
 [5,]    0    0    0    1    1    0    0    1    1     1
 [6,]    0    0    1    0    1    0    1    0    1     1
 [7,]    0    0    1    0    0    0    0    1    1     1
 [8,]    0    1    0    1    0    0    0    0    1     1
 [9,]    0    0    1    1    0    1    1    1    1     1
[10,]    0    1    0    0    0    1    0    1    1     1
[11,]    0    0    0    0    1    0    0    1    1     1
[12,]    0    1    0    0    0    0    1    1    1     1
[13,]    1    0    0    0    0    0    1    1    1     1
[14,]    0    0    1    0    1    1    0    1    1     1
[15,]    0    0    0    0    1    1    0    1    1     1
[16,]    1    1    1    1    1    1    1    1    1     1
[17,]    1    0    0    0    0    1    0    1    1     1
[18,]    0    0    0    0    1    1    1    1    1     1
[19,]    0    1    0    0    1    0    0    1    1     1
[20,]    0    0    0    0    1    0    1    1    1     1

Upvotes: 4

Related Questions