Elsnflsdn

Reputation: 1

Sampling different x and different sample size in R

Say I have a table like this:

Students Equipment #
A 101
A 102
A 103
B 104
B 105
B 106
B 107
B 108
C 109
C 110
C 111
C 112

I want to grab equipment # samples from each student in the data frame with varying sample sizes.

For example, I want 1 equipment # from student "A", 2 from student "B", and 3 from student "C". How can I achieve this in R?

This is the code that I have now, but I'm only getting 1 equipment # printed from each student.

students <- unique(df$`Students`)

sample_size <- c(1,2,3)

for (i in students){

  s <- sample(df[df$`Students` == i,]$`Equipment #`, size = sample_size, replace = FALSE)

  print(s)

}

Upvotes: 0

Views: 603

Answers (2)

Ronak Shah

Reputation: 389215

You can create a data frame that lists each student alongside the number of rows to sample for that student. Join it to the data and use sample_n to sample that many rows from each group.

library(dplyr)

sample_data <- data.frame(Students = c('A', 'B', 'C'), nr = 1:3)

df %>%
  left_join(sample_data, by = 'Students') %>%
  group_by(Students) %>%
  sample_n(first(nr)) %>%
  ungroup() %>%
  select(-nr) -> s

s

#  Students Equipment
#  <chr>        <int>
#1 A              102
#2 B              108
#3 B              105
#4 C              110
#5 C              112
#6 C              111
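
For comparison, the same lookup-table idea can be sketched in base R without dplyr. This is only an illustration under assumptions: the data frame is rebuilt from the question's table with a simplified column name `Equipment`, and the named vector `sample_size` is a hypothetical stand-in for the `nr` column above.

```r
# Rebuild the question's table (the column names here are assumptions).
df <- data.frame(
  Students  = rep(c("A", "B", "C"), times = c(3, 5, 4)),
  Equipment = 101:112
)

# Hypothetical lookup: how many rows to draw per student, keyed by name.
sample_size <- c(A = 1, B = 2, C = 3)

# Split into one data frame per student, then sample rows from each piece.
pieces <- split(df, df$Students)
sampled <- Map(
  function(piece, n) piece[sample.int(nrow(piece), size = n), , drop = FALSE],
  pieces,
  sample_size[names(pieces)]
)

# Stack the per-student samples back into a single data frame.
s <- do.call(rbind, sampled)
table(s$Students)
```

Keying the sizes by student name means the result does not depend on the order in which the students appear in the data; the final `table()` call should report 1, 2 and 3 rows for A, B and C.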

Upvotes: 1

Dan Adams

Reputation: 5254

You're close. You need to index the sample_size vector inside the loop; otherwise it will just take the first item in the vector on each iteration.
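
Applied directly to the loop from the question, that fix might look like the base R sketch below. The data frame is an assumed reconstruction of the question's table (including the `Equipment #` column name), and `result` and `pool` are names introduced here purely for illustration.

```r
# Assumed reconstruction of the question's data frame, keeping the
# non-syntactic column name `Equipment #`.
df <- data.frame(
  Students = rep(c("A", "B", "C"), times = c(3, 5, 4)),
  `Equipment #` = 101:112,
  check.names = FALSE
)

students <- unique(df$Students)
sample_size <- c(1, 2, 3)  # one entry per student, same order as `students`

result <- vector("list", length(students))
for (i in seq_along(students)) {
  pool <- df[df$Students == students[i], "Equipment #"]
  # Sample positions rather than values: sample(x, ...) treats a single
  # number x as 1:x, which would misbehave if a student had only one item.
  result[[i]] <- pool[sample.int(length(pool), size = sample_size[i])]
  print(result[[i]])
}
```

The only change to the sampling logic is `sample_size[i]` in place of `sample_size`, with the loop running over positions so that each student is paired with its own size. The same loop-by-position idea can also be written with dplyr: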

library(dplyr)

# set up data
df <- data.frame(Students = c(rep("A", 3),
                              rep("B", 5),
                              rep("C", 4)),
                 Equipment_num = 101:112)

# create vector of students
students <- df %>% 
  pull(Students) %>% 
  unique()

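# sample sizes, in the same order as `students`
sample_size <- c(1, 2, 3)

# sample and print, indexing sample_size by loop position
for (i in seq_along(students)) {
  p <- df %>% 
    filter(Students == students[i]) %>% 
    slice_sample(n = sample_size[i])
  
  print(p)
}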
#>   Students Equipment_num
#> 1        A           102
#>   Students Equipment_num
#> 1        B           107
#> 2        B           105
#>   Students Equipment_num
#> 1        C           109
#> 2        C           110
#> 3        C           112

Created on 2021-08-06 by the reprex package (v2.0.0)

Actually this is a much more elegant and generalizable way to tackle this problem.

Upvotes: 0
