Reputation: 3441
I am trying to take a random sample from each level of a factor. Each factor level has a different number of observations. For each level, I want to draw a sample containing half as many observations as that level has.
library(dplyr)
dat <- data.frame(ID = rep(c("AAA", "AAA", "AAA", "BBB", "BBB", "CCC"), length.out = 100),
                  Value = sample(1:100, replace = TRUE))
Using the data above, I expected something like the following to work, but the error (Error in n() : This function should not be called directly) suggests I am using the n() function incorrectly.
Samp <- dat %>% group_by(ID) %>% sample_n(size = n()/2 )
Thanks in advance.
Upvotes: 2
Views: 1185
Reputation: 2489
Try sample_frac():
library(dplyr)
Samp <- dat %>% group_by(ID) %>% sample_frac(.5)
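If you want to keep the n()-based logic from the question, here is a minimal sketch that should work, since n() can be used inside slice(). The set.seed() call, the floor() rounding, and the names samp_slice, samp_prop and sizes are my own choices for illustration, not anything from the question. The slice_sample() line assumes dplyr 1.0.0 or later.

```r
library(dplyr)

set.seed(42)  # assumption: fixed seed only so the sketch is reproducible

dat <- data.frame(
  ID    = rep(c("AAA", "AAA", "AAA", "BBB", "BBB", "CCC"), length.out = 100),
  Value = sample(1:100, replace = TRUE)
)

# n() is valid inside slice(), so the per-group sample size can be computed there.
# floor() is an assumption about how odd-sized groups should be rounded.
samp_slice <- dat %>%
  group_by(ID) %>%
  slice(sample(n(), floor(n() / 2))) %>%
  ungroup()

# Alternative for dplyr >= 1.0.0: sample a proportion of each group.
samp_prop <- dat %>%
  group_by(ID) %>%
  slice_sample(prop = 0.5) %>%
  ungroup()

# Compare group sizes before and after sampling.
sizes <- dat %>%
  count(ID, name = "total") %>%
  left_join(count(samp_slice, ID, name = "sampled"), by = "ID")
print(sizes)
```

For groups with an odd number of rows, the different functions may not round the half the same way, so check the sizes table if the exact count matters.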
Upvotes: 6