Reputation: 323
I would like to simulate a binomial distribution of numbers 0 and 1 for each column of a matrix. The probability of success changes for each column. The probability of success for column 1 is 1/ncol(matrix)
. The probability of success for column 2 is 2/ncol(matrix)
.
I have tried a few different ways to generate the data I am looking for, but I either get warnings or a list of numbers for the 1st column and not for the whole matrix.
Even better would be having the ability to use a matrix of probabilities instead of something like 1/ncol(matrix).
Let's say I want a matrix of 200 rows by 1000 columns. For each column, I want to generate a binomial distribution of 0's and 1's using a probability defined by an element within a probability matrix. The probability of getting a 1 in column 1 defined by the first element of the probability matrix and is .001. The probability of getting a 1 in column 2 is defined by the second element of the probability matrix and is .002, etc. I'm not sure whether I should be doing rbinom()
within a matrix()
or within apply()
or something else.
datasim = matrix(0, nrow=200, ncol=1000)
colnames(datasim) = c(1:1000)
prob.matrix = as.matrix(c(1:1000))
prob.matrix = prob.matrix/1000
newdatasim = apply(???, 2, function(x) { rbinom(??, 1, prob.matrix)})
Please help, thank you!
Upvotes: 0
Views: 3775
Reputation: 323
EDIT: Here is one of the solutions that I found. Frank's solution also works. Thank you!
#Generate matrix with 200 rows x 1000 columns
#Each column has a binomial distribution of 0's and 1's
#The probability is determined by the column number divided by the total number of columns
datasim = matrix(0, nrow=200, ncol=100)
datasim[1:10,1:10]
probmatrix = col(datasim)/1000
probmatrix[1:10,1:10]
datasim2 = matrix(rbinom(200 * 1000,1,probmatrix), nrow=200, ncol=1000, byrow=FALSE)
datasim2[1:10,1:10]
Upvotes: 1
Reputation: 66819
It's simpler to use your prob.matrix
as a vector than as a matrix. Here's a smaller example:
sapply(1:10/10,function(p) rbinom(20,1,p))
If you want to pre-allocate a matrix...
set.seed(42)
datasim <- matrix(,20,10)
datasim[,] <- sapply(1:10/10,function(p) rbinom(20,1,p))
which gives
datasim
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 1 0 1 1 0 1 1 0 1
[2,] 1 0 0 1 0 1 1 1 1 1
[3,] 0 1 0 1 0 1 1 1 1 1
[4,] 0 1 1 0 1 1 1 1 1 1
[5,] 0 0 0 1 1 0 0 1 1 1
[6,] 0 0 1 0 1 0 1 0 1 1
[7,] 0 0 1 0 0 0 0 1 1 1
[8,] 0 1 0 1 0 0 0 0 1 1
[9,] 0 0 1 1 0 1 1 1 1 1
[10,] 0 1 0 0 0 1 0 1 1 1
[11,] 0 0 0 0 1 0 0 1 1 1
[12,] 0 1 0 0 0 0 1 1 1 1
[13,] 1 0 0 0 0 0 1 1 1 1
[14,] 0 0 1 0 1 1 0 1 1 1
[15,] 0 0 0 0 1 1 0 1 1 1
[16,] 1 1 1 1 1 1 1 1 1 1
[17,] 1 0 0 0 0 1 0 1 1 1
[18,] 0 0 0 0 1 1 1 1 1 1
[19,] 0 1 0 0 1 0 0 1 1 1
[20,] 0 0 0 0 1 0 1 1 1 1
Upvotes: 4