Reputation: 109
In R, I need to create a data set that contains exactly 57 0's, 203 1's, 383 2's and so forth. I thought I would be able to create the data set from the probability distribution:
sample_dist <- sample(c(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14), size = 2608,
replace = FALSE, prob = c(57/2608, 203/2608, 383/2608, 525/2608, 532/2608,
408/2608, 273/2608, 139/2608, 45/2608, 27/2608, 10/2608, 4/2608, 0/2608, 1/2608, 1/2608))
but this doesn't work. If I set replace = TRUE, I get a sample from the same distribution but with replacement, which does not yield exactly the data set that I want.
What am I doing wrong? Is this even a good approach to creating such a data set, or do you have a more elegant approach?
Upvotes: 0
Views: 68
Reputation: 1376
Something like this:
ccc <- c(rep(0, 57), rep(1, 203), rep(2, 383)) # and so on for the remaining values
ccc <- sample(ccc) # shuffle the values into random order
cdf <- data.frame(r = ccc) # optional: wrap the vector in a data frame
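The same idea can be written without typing out one rep() call per value, since rep() accepts a vector for its times argument. The sketch below is one possible way to do it, using the counts from the question; the names values, counts and x are only illustrative. It also shows why the original call fails: with replace = FALSE, sample() cannot draw 2608 items from a population of 15, and with replace = TRUE the counts are random rather than exact.

```r
# Values 0..14 and the exact number of times each should appear
# (counts copied from the question; they sum to 2608).
values <- 0:14
counts <- c(57, 203, 383, 525, 532, 408, 273, 139, 45, 27, 10, 4, 0, 1, 1)

# rep() with a vector 'times' repeats each value its own number of times;
# a count of 0 simply contributes nothing.
x <- rep(values, times = counts)

# sample(x) with no other arguments returns a random permutation of x,
# so the counts stay exact and only the order changes.
x <- sample(x)

# Optional: wrap in a data frame, as in the answer above.
df <- data.frame(r = x)

# Check the result against the requested counts.
table(factor(x, levels = values))
```

Calling set.seed() before sample() makes the shuffle reproducible if that matters.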
Upvotes: 2