Reputation: 13025
I have a set of 1000 elements, and would like to put 200 in subset1, 300 in subset2, and 500 in subset3. All of the elements are equivalent to each other in terms of their assignment probability. How can I do that in R? My current approach is to first choose 200 at random and put them into subset1. Afterwards, I randomly pick 300 from the remaining 800. I do not think this is exactly correct.
I think the correct approach is to randomly re-order the sequence of 1000 elements, then select the first 200, the next 300, and the remaining 500. But I do not know how to do that in R.
Upvotes: 0
Views: 706
Reputation: 44525
This is a slightly different version of what @Didzis has proposed. It uses split() to return a list of three vectors (or something else, if x was something else):
Using rep() to get exactly 200, 300, and 500 elements:
split(sample(x),rep(1:3,times=c(200,300,500)))
Using the prob argument of sample() to get 200, 300, and 500 elements in expectation:
split(x,sample(1:3,1000,replace=TRUE,prob=c(.2,.3,.5)))
You probably want the first of these, since it guarantees the exact subset sizes you asked for.
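As a quick check of the difference between the two, the sketch below runs both variants on a hypothetical vector x and compares the resulting group sizes. The seed and the contents of x are assumptions made purely for illustration.

```r
set.seed(42)                      # arbitrary seed, only for reproducibility
x <- seq_len(1000)                # hypothetical data: 1000 distinct elements

# Variant 1: shuffle x, then attach fixed-size group labels
exact <- split(sample(x), rep(1:3, times = c(200, 300, 500)))
exact_sizes <- lengths(exact)     # sizes are fixed by the rep() call

# Variant 2: draw a group label for each element independently
approx <- split(x, sample(1:3, length(x), replace = TRUE, prob = c(.2, .3, .5)))
approx_sizes <- lengths(approx)   # sizes vary from run to run

print(exact_sizes)
print(approx_sizes)
```

With the first variant the sizes are 200, 300, and 500 on every run; with the second they only average out to those values, although they still add up to 1000.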
Upvotes: 0
Reputation: 98449
You can use the function sample() to get a random permutation of your original data, and then select the first 200 elements, then the next 300, and finally the remaining 500.
#original data
x<-runif(1000)
#random permutation
y<-sample(x)
#data selection: first 200, next 300, remaining 500
y[1:200]
y[201:500]
y[501:1000]
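A minimal sketch that stores the three slices instead of printing them, and checks that together they use every element exactly once. The names subset1 to subset3 are taken from the question; the seed is an arbitrary assumption.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
x <- runif(1000)                  # original data, as above
y <- sample(x)                    # random permutation of x

subset1 <- y[1:200]               # first 200 elements
subset2 <- y[201:500]             # next 300 elements
subset3 <- y[501:1000]            # remaining 500 elements

# sizes of the three subsets
sizes <- c(length(subset1), length(subset2), length(subset3))

# TRUE if the subsets together contain exactly the original values
all_used <- identical(sort(c(subset1, subset2, subset3)), sort(x))

print(sizes)
print(all_used)
```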
Upvotes: 3