Panda

Reputation: 43

Binomial distribution by row using R

I have a data frame with two columns, ID and probability:

        ID probability
        1  0.5
        2  0.8
        3  0.3

I would like to simulate the sickness status of each ID with 0 for healthy and 1 for sick. The probability of each ID getting sick is in the second column.

I have tried

df$sick <- rbinom(1,1,df$probability)

but I get either all zeros or all ones. What am I doing wrong? Thank you in advance for your help!

Upvotes: 2

Views: 1942

Answers (1)

Anders Ellern Bilgrau

Reputation: 10223

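The problem is that you set n to 1, so rbinom returns only a single value (drawn using just the first probability), and that one value is then recycled across all rows by R's standard recycling rules. See ?rbinom. Setting n to the number of rows gives one draw per row, each with its own probability. Something like this should do the trick: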

df <- read.table(header = TRUE, text = "ID probability
    1  0.5
    2  0.8
    3  0.3")

df$sick <- rbinom(n = nrow(df), size = 1, prob = df$probability)
print(df)  # the sick column is random, so the values below are just one possible result
#  ID probability sick
#1  1         0.5    1
#2  2         0.8    1
#3  3         0.3    0
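
To see the difference between the two calls, here is a rough sketch. It rebuilds the example data with data.frame, and the names one_draw, per_row and sims, the seed, and the 10000 repetitions are my own choices for illustration, not anything from the question. The last step is only a sanity check that each row's long-run share of 1s lands near its own probability.

```r
# Example data with the same shape as the question's df
df <- data.frame(ID = 1:3, probability = c(0.5, 0.8, 0.3))

set.seed(42)  # arbitrary seed, only so the run is reproducible

# n = 1: a single draw, which df$sick <- ... would recycle to every row
one_draw <- rbinom(n = 1, size = 1, prob = df$probability)
length(one_draw)  # 1

# n = nrow(df): one draw per row, row i uses probability[i]
per_row <- rbinom(n = nrow(df), size = 1, prob = df$probability)
length(per_row)   # 3

# Sanity check: repeat the per-row draw many times (3 x 10000 matrix)
# and compare each row's share of 1s with its probability
sims <- replicate(10000, rbinom(nrow(df), 1, df$probability))
round(rowMeans(sims), 2)
```

The row means should come out close to 0.5, 0.8 and 0.3, which is what you would expect if every ID is simulated with its own probability.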

Upvotes: 7
