Subset rows preceding a specific row value in grouped data using R

Question

Consider the following data frame:

df<-data.frame(group=c(1,1,1,2,2,2,3,3,3),
               status=c(NA,1,1,NA,NA,1,NA,1,NA),
               health=c(0,1,1,1,0,1,1,0,0))

For each group (i.e. the first column), I'm looking for a way to subset the row immediately preceding the first row where the second column (labelled status) equals 1. The expected output is:

  group status health
1     1     NA      0
2     2     NA      0
3     3     NA      1

I've tried resolving this with the "filter" and "slice" functions, but have not succeeded in subsetting the preceding rows. Any help is greatly appreciated.

Roman · Accepted Answer

One solution uses the tidyverse (dplyr):

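library(dplyr)

df %>% 
  group_by(group) %>% 
  # gr = within-group position of the row just before the first status == 1
  mutate(gr = which(status == 1)[1] - 1) %>% 
  slice(unique(gr)) %>%   # keep only that row in each group
  select(-gr)             # drop the helper column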
# A tibble: 3 x 3
# Groups:   group [3]
  group status health
  <dbl>  <dbl>  <dbl>
1     1     NA      0
2     2     NA      0
3     3     NA      1

or

df %>% 
  group_by(group) %>% 
  filter(row_number() == which(status==1)[1]-1)

or

df %>% 
  group_by(group) %>% 
  slice(which(lead(status==1))[1])
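
For comparison, here is a minimal base R sketch of the same idea. It is not part of the original answer. The helper name row_before_first_one is hypothetical, and the sketch assumes that a group should contribute no rows when it has no status of 1 or when the first 1 is already in the group's first row.

```r
# Same example data as in the question.
df <- data.frame(group  = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                 status = c(NA, 1, 1, NA, NA, 1, NA, 1, NA),
                 health = c(0, 1, 1, 1, 0, 1, 1, 0, 0))

# Hypothetical helper (not from any package): return the row just before
# the first status == 1 in one group's rows, or zero rows if none exists.
row_before_first_one <- function(d) {
  first_one <- which(d$status == 1)[1]   # NA when the group has no 1
  if (is.na(first_one) || first_one == 1) {
    return(d[0, , drop = FALSE])         # nothing precedes it: keep no rows
  }
  d[first_one - 1, , drop = FALSE]
}

# Split by group, apply the helper, and bind the pieces back together.
res <- do.call(rbind, lapply(split(df, df$group), row_before_first_one))
rownames(res) <- NULL                    # reset row names to 1..n
res
```

On the example data this should return the same three rows as the expected output in the question. Handling the two edge cases explicitly is a design choice in this sketch, so adjust it if such groups should be treated differently.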
