Zhiqiang Wang

Reputation: 6759

Randomly select rows in R using sample_n

df <- data.frame(
  id = 1:12,
  day = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3),
  endpoint = c(1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 1))
df
#>    id day endpoint
#> 1   1   1        1
#> 2   2   1        1
#> 3   3   1        1
#> 4   4   1        1
#> 5   5   2        2
#> 6   6   2        2
#> 7   7   2        2
#> 8   8   2        2
#> 9   9   3        1
#> 10 10   3        1
#> 11 11   3        1
#> 12 12   3        1

In the data above, some patients (id) reach the endpoint each day, and endpoint gives how many. For each day, I am trying to randomly select that many patients and mark them with s = 1. Ids from that day and from previous days are eligible, as long as they have not already been selected. The following code gives what I expect, but I have to enter the day and endpoint values manually. Any suggestions on how to pick those values directly from the data would be appreciated.

library(dplyr)
df$s <- 0
df$s <- ifelse(df$id %in% sample_n(df[df$day <= 1 & df$s == 0, ], 1)$id, 1, df$s)
df$s <- ifelse(df$id %in% sample_n(df[df$day <= 2 & df$s == 0, ], 2)$id, 1, df$s)
df$s <- ifelse(df$id %in% sample_n(df[df$day <= 3 & df$s == 0, ], 1)$id, 1, df$s)
df
#>    id day endpoint s pick_day
#> 1   1   1        1 0        0
#> 2   2   1        1 1        2
#> 3   3   1        1 1        1
#> 4   4   1        1 1        3
#> 5   5   2        2 1        2
#> 6   6   2        2 0        0
#> 7   7   2        2 0        0
#> 8   8   2        2 0        0
#> 9   9   3        1 0        0
#> 10 10   3        1 0        0
#> 11 11   3        1 0        0
#> 12 12   3        1 0        0

EDIT

Is it possible to add a variable to show the day for which a row was picked, like the above variable pick_day? Thanks.

Upvotes: 1

Views: 85

Answers (1)

Ronak Shah

Reputation: 388807

One way in base R, using a for loop:

df$s = 0 
set.seed(123)

for (i in unique(df$day)) {
   temp <- subset(df, day <= i & s == 0)
   ids <- with(temp, sample(id, endpoint[day == i][1]))
   df$s[df$id %in% ids] <- 1
}

df

#   id day endpoint s
#1   1   1        1 0
#2   2   1        1 0
#3   3   1        1 1
#4   4   1        1 1
#5   5   2        2 1
#6   6   2        2 0
#7   7   2        2 0
#8   8   2        2 1
#9   9   3        1 0
#10 10   3        1 0
#11 11   3        1 0
#12 12   3        1 0
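
For the pick_day column asked about in the EDIT, the same loop could also record the day on which each row was drawn. The following is only a sketch under a few assumptions: days appear in ascending order in the data, 0 stands for "never picked" (as in the question's expected output), and each day has enough eligible ids left. The names eligible and n are introduced here for illustration. It indexes into the eligible ids with sample.int(), because sample(x, n) treats a single number x as 1:x, which would matter if only one id were left.

```r
# Example data from the question
df <- data.frame(
  id = 1:12,
  day = rep(1:3, each = 4),
  endpoint = rep(c(1, 2, 1), each = 4)
)

df$s <- 0
df$pick_day <- 0  # assumption: 0 means "never picked"
set.seed(123)

for (i in unique(df$day)) {
  # ids from this day or earlier that have not been picked yet
  eligible <- df$id[df$day <= i & df$s == 0]
  # number of ids to draw for this day, read from the data
  n <- df$endpoint[df$day == i][1]
  # sample positions rather than values, so one remaining id is not read as 1:id
  ids <- eligible[sample.int(length(eligible), n)]
  df$s[df$id %in% ids] <- 1
  df$pick_day[df$id %in% ids] <- i
}

df
```

Which rows are picked depends on the seed, but each picked row should have pick_day equal to or later than its own day.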

Upvotes: 2
