Reputation: 463
I have a data frame (d) composed of 640 observations of 55 variables.
I would like to randomly split this data frame into 10 sub data frames of 64 observations each (all 55 variables). I don't want any observation to appear in more than one sub data frame.
This code works for one sample:
d1 <- d[sample(nrow(d),64,replace=F),]
How can I repeat this ten times?
This attempt gives me a data frame of 10 variables (each one is one sample...):
d1 <- replicate(10,sample(nrow(d),64,replace = F))
Can anyone help me?
Upvotes: 0
Views: 158
Reputation: 13056
Here's a solution that returns the result in a list of data.frames:
d <- data.frame(A=1:640, B=sample(LETTERS, 640, replace=TRUE)) # an example data.frame
idx <- sample(rep(1:10, length.out=nrow(d)))
res <- split(d, idx)
res[[1]] # first data frame
res[[10]] # last data frame
The only tricky part involves creating idx. idx[i], a value in {1,...,10}, identifies the resulting data.frame in which the i-th row of d will occur. Such an approach ensures that no row is put into more than one data.frame.
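Also, note that sample returns a random permutation of (1,2,...,10,1,2,...,10,...), i.e. the values 1 to 10 recycled to length nrow(d). With 640 rows, each label occurs exactly 64 times, so every resulting data.frame has 64 rows.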
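A quick way to check these claims is sketched below. It rebuilds the example data from above under a fixed seed and confirms that each piece has 64 rows and that no row is shared. The seed value and the names sizes and all_rows are my own additions for illustration, not part of the original solution.

```r
set.seed(42)  # assumption: arbitrary seed, used only for reproducibility

# Same example data as in the solution above
d <- data.frame(A = 1:640, B = sample(LETTERS, 640, replace = TRUE))

# Group labels 1..10 recycled to nrow(d), then shuffled
idx <- sample(rep(1:10, length.out = nrow(d)))
res <- split(d, idx)

# Number of rows in each of the 10 pieces
sizes <- vapply(res, nrow, integer(1))
print(sizes)

# Column A is a unique row id here, so a duplicate would reveal an overlap
all_rows <- unlist(lapply(res, function(x) x$A))
stopifnot(length(res) == 10, all(sizes == 64), !anyDuplicated(all_rows))
```

If nrow(d) were not a multiple of 10, the same code would still run, but the pieces would differ in size by one row.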
Another approach is to use:
apply(matrix(sample(nrow(d)), ncol=10), 2, function(idx) d[idx,])
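If the way apply simplifies its result feels opaque, the same idea can be written with lapply over the columns of the index matrix, which always returns a list. This is a sketch of a variant rather than the code above; m and res2 are names I introduce here, and it assumes nrow(d) is a multiple of 10 so the matrix fills evenly.

```r
set.seed(1)  # assumption: arbitrary seed, used only for reproducibility
d <- data.frame(A = 1:640, B = sample(LETTERS, 640, replace = TRUE))

# Shuffle all row numbers once and lay them out as a 64 x 10 matrix:
# each column holds the row numbers of one sub data frame
m <- matrix(sample(nrow(d)), ncol = 10)

# lapply always returns a list; drop = FALSE keeps a data frame
# even if d had only one column
res2 <- lapply(seq_len(ncol(m)), function(j) d[m[, j], , drop = FALSE])

dim(res2[[1]])  # dimensions of the first sub data frame
```

Because every row number appears exactly once in m, no observation can end up in two sub data frames.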
Upvotes: 1