theresemoreau

Reputation: 109

Create data set from probability distribution

In R I need to create a data set where there are 57 0's, 203 1's, 383 2's and so forth. I thought I would be able to create the data set from the probability distribution:

sample_dist <- sample(c(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14), size = 2608, 
replace = FALSE, prob = c(57/2608, 203/2608, 383/2608, 525/2608, 532/2608,
 408/2608, 273/2608, 139/2608, 45/2608, 27/2608, 10/2608, 4/2608, 0/2608, 1/2608, 1/2608))

but this doesn't work. If I set replace = TRUE, I get a sample from the same distribution drawn with replacement, which does not yield exactly the data set that I want. What am I doing wrong? Is this even a good approach to creating such a data set, or is there a more elegant one?

Upvotes: 0

Views: 68

Answers (1)

franiis

Reputation: 1376

Build the vector with the exact counts using rep(), then shuffle it. Something like this:

ccc <- c(rep(0, 57), rep(1, 203), rep(2, 383)) # and so on for the values 3 to 14
ccc <- sample(ccc) # shuffle the values (a random permutation, so the counts stay exact)
cdf <- data.frame(r = ccc) # optional: wrap in a data frame
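
For completeness, here is a sketch of the same idea written out for all 15 values, using the counts from the question. The call in the question fails with replace = FALSE because sample() then draws from the 15 listed values without putting them back, so it cannot return 2608 of them; prob only weights the draws and never fixes exact counts. The names values, counts and full are mine, chosen just for illustration.

```r
# Counts for the values 0..14, copied from the question
values <- 0:14
counts <- c(57, 203, 383, 525, 532, 408, 273, 139, 45, 27, 10, 4, 0, 1, 1)

# Repeat each value exactly counts[i] times, then shuffle the order
full <- sample(rep(values, times = counts))

# Tabulate to confirm the exact counts survived the shuffle
# (factor() with explicit levels keeps the zero-count value 12 in the table)
table(factor(full, levels = values))
```

Calling sample() on a vector with no other arguments returns a random permutation of it, so every count is preserved exactly. If the order does not matter, the sample() call can be dropped.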

Upvotes: 2
