Sample by group with a condition (R)

Question

I need to randomly select a diary for each individual (id) but only for those who filled more than one.

Suppose my data look like this:

dta = rbind(c(1, 1, 'a'), 
      c(1, 2, 'a'), 
      c(1, 3, 'b'), 
      c(2, 1, 'a'), 
      c(3, 1, 'b'), 
      c(3, 2, 'a'), 
      c(3, 3, 'c'))

colnames(dta) <- c('id', 'DiaryNumber', 'type')
dta = as.data.frame(dta)
dta

  id DiaryNumber type
   1           1    a
   1           2    a
   1           3    b
   2           1    a
   3           1    b
   3           2    a
   3           3    c

For example, id 1 filled 3 diaries. What I need is to randomly select one of the 3 diaries. Id 2 only filled one diary, so I do not need to do anything with it.

I have no idea how to do that. Any ideas?

A5C1D2H2I1M1N2O1R2T1 · Accepted Answer

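You can use sample_n() from dplyr. Grouping by id and drawing one row per group keeps a single random diary for each id. An id with only one diary, such as id 2, simply keeps that diary, so no separate condition is needed. Because the draw is random, your output may differ from the one shown here: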

library(dplyr)
dta %>% group_by(id) %>% sample_n(1)
## Source: local data frame [3 x 3]
## Groups: id
## 
##   id DiaryNumber type
## 1  1           2    a
## 2  2           1    a
## 3  3           1    b
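For readers on dplyr 1.0.0 or later, where sample_n() is superseded, the same idea can be sketched with slice_sample(). This is only an illustration, not part of the original answer: the data frame is rebuilt with data.frame() so the columns are typed, the seed value is arbitrary and only makes the draw repeatable, and the name picked is made up for the example.

```r
library(dplyr)

# Same rows as in the question, built with typed columns
# (the rbind() version stores every column as character).
dta <- data.frame(
  id          = c(1, 1, 1, 2, 3, 3, 3),
  DiaryNumber = c(1, 2, 3, 1, 1, 2, 3),
  type        = c("a", "a", "b", "a", "b", "a", "c")
)

set.seed(1)  # arbitrary seed, assumed here only for reproducibility

# One random diary per id; an id with a single diary keeps that diary.
picked <- dta %>%
  group_by(id) %>%
  slice_sample(n = 1) %>%
  ungroup()

picked
```

The result has one row per id, and the row for id 2 is always its only diary. The rows chosen for ids 1 and 3 depend on the seed.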
