R: Representative random sampling of 150 values from categories with different group sizes

Question

I want to draw 150 random samples from a dataset, stratified by two categories, "site" and "species". Ideally, the result would contain 30 samples per site, with the species within each site represented more or less equally.

Reproducible example:

df <- data.frame(
  site = rep(c("A", "B", "C", "D", "E"), each = 10),
  species = c("s1", rep("s2", each = 3), rep("s3", each = 16),
              rep("s4", each = 13), rep("s5", each = 17)),
  individual = c(1, 1:3, 1:16, 1:13, 1:17)
)

I think dplyr's group_by(site, species) combined with slice_sample() is a good starting point, but that samples a fixed number of rows per group rather than 150 in total. Another problem is that slice_sample() needs at least n rows in each group to work, which is not always the case here. So, is there a way to sample 150 rows in total and, whenever a group has fewer rows than its share, make up the shortfall by sampling more from the other groups?

Thanks!
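
One way to read the request above is as a two-level quota problem: split the total evenly across sites, split each site's quota evenly across its species, and hand any shortfall from a small group to the groups that still have rows left. The following is only a minimal base R sketch of that idea, not a dplyr solution. The function names allocate() and sample_balanced() are hypothetical, sampling is assumed to be without replacement, and because the example data has only 50 rows it draws 25 rows instead of 150.

```r
set.seed(42)

df <- data.frame(
  site = rep(c("A", "B", "C", "D", "E"), each = 10),
  species = c("s1", rep("s2", each = 3), rep("s3", each = 16),
              rep("s4", each = 13), rep("s5", each = 17)),
  individual = c(1, 1:3, 1:16, 1:13, 1:17)
)

# Hypothetical helper: spread `n` draws as evenly as possible over groups
# whose capacities are `sizes`, passing any shortfall on to groups that
# still have rows to spare.
allocate <- function(sizes, n) {
  sizes <- setNames(as.integer(sizes), names(sizes))
  quota <- setNames(integer(length(sizes)), names(sizes))
  n <- min(n, sum(sizes))                  # cannot draw more rows than exist
  while (n > 0) {
    open  <- which(quota < sizes)          # groups with unused rows
    share <- n %/% length(open)            # equal share per open group
    if (share == 0) {
      # Fewer draws left than open groups: one extra draw for a random subset
      lucky <- open[sample.int(length(open), n)]
      quota[lucky] <- quota[lucky] + 1L
      n <- 0
    } else {
      add <- pmin(share, sizes[open] - quota[open])  # capped by capacity
      quota[open] <- quota[open] + add
      n <- n - sum(add)
    }
  }
  quota
}

# Hypothetical wrapper: quota per site first, then per species within site.
sample_balanced <- function(df, total) {
  site_quota <- allocate(table(df$site), total)
  picked <- integer(0)
  for (s in names(site_quota)) {
    rows <- which(df$site == s)
    sp_quota <- allocate(table(df$species[rows]), site_quota[[s]])
    for (sp in names(sp_quota)) {
      cell <- rows[df$species[rows] == sp]
      # sample.int() on positions avoids sample()'s length-one surprise
      picked <- c(picked, cell[sample.int(length(cell), sp_quota[[sp]])])
    }
  }
  df[sort(picked), ]
}

out <- sample_balanced(df, total = 25)
table(out$site, out$species)
```

With total = 25, every site contributes 5 rows. In site A the single s1 individual is always drawn and s2 and s3 supply two rows each, which is the compensation behaviour asked for. On a larger dataset the same call with total = 150 would aim for 30 rows per site; if fewer than `total` rows exist, the sketch simply returns all of them.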
