Reputation: 757
I have a question about how to take a random sample while keeping together all the items that belong to the same group. What I'm really trying to do is sample whole groups, so that each sampled group includes every one of its items.
Here is a method of sampling from mtcars. Using this, I get two random rows:
(sampled_df <- mtcars[sample(nrow(mtcars), 2), ])
I can take mtcars and then number it as though there are groups. mtcars has 32 observations. Here I'm saying that there are eight groups with four items each.
library(dplyr)
mtcars %>%
mutate(number = rep(1:8,each=4)) %>%
group_by(number) %>%
sample_n(2)
The last two lines of code aren't doing what I hoped they would: after group_by, sample_n(2) draws two rows from each of the eight groups. I'm trying to get eight rows as output: all four of the observations from two of the groups.
I'm really working with invoice data and I want to be able to make the data frame smaller while making sure that I'm keeping the basket sizes the same.
Upvotes: 1
Views: 86
Reputation: 11016
What you might want is:
mtcars %>%
mutate(number = rep(1:8,each=4)) %>%
filter(number %in% sample(1:8, 2))
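For the invoice case, where the group ids are probably not a tidy 1:8, the same idea can be sketched by sampling from the distinct ids that are actually in the data. This is only an illustration: the `invoices` data frame and its `invoice_id` and `item` columns are made-up names, not something from your data.

```r
library(dplyr)

set.seed(42)  # only so the example is reproducible

# Hypothetical invoice data; invoice_id and item are assumed column names
invoices <- data.frame(
  invoice_id = c("A", "A", "A", "B", "B", "C", "D", "D", "D", "D"),
  item       = c("x1", "x2", "x3", "y1", "y2", "z1", "w1", "w2", "w3", "w4")
)

# Draw two invoice ids at random from the ids present in the data
keep_ids <- sample(unique(invoices$invoice_id), 2)

# Keep every row of the sampled invoices, so basket sizes are unchanged
sampled <- invoices %>%
  filter(invoice_id %in% keep_ids)
```

Because the sampling happens on the ids rather than on the rows, every sampled invoice comes back with all of its lines.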
Upvotes: 3