mutate inside group_by, based on particular row inside group

Question

Let's say I have a dataframe of the start and end dates of people's procedures in person X procedure "long" format:

df <- data.frame(person.id = c(1,1,2,2,3,3),
             start.date = c("2015-01-01", "2015-01-05", "2016-05-06", "2015-04-01", "2015-07-01", "2015-01-06"),
             end.date = c("2015-01-30", "2015-02-05", "2016-06-23", "2015-05-30", "2015-08-10", "2015-02-05"),
             procedure = c("alpha", "beta", "alpha", "beta", "alpha", "beta"))

How would I create a variable at the person level, i.e. under a group_by(person.id), which represents the start date of their "alpha" procedure? I can think of some longer workarounds for this, but I'm wondering if there's an elegant way to do it inside a group_by and a mutate, like:

df %<>%
  group_by(person.id) %>%
  mutate(alpha.start.date = #??)

Thanks!

akrun · Accepted Answer

We can create the variable with mutate by taking the 'start.date' from the row where 'procedure' is "alpha". Within each group, the logical index procedure == "alpha" selects a single value, which mutate then recycles across every row of that group.

library(dplyr)
df %>%
  group_by(person.id) %>%
  mutate(alpha.start.date = start.date[procedure == "alpha"])
# A tibble: 6 x 5
# Groups:   person.id [3]
#  person.id start.date end.date   procedure alpha.start.date
#1         1 2015-01-01 2015-01-30 alpha     2015-01-01
#2         1 2015-01-05 2015-02-05 beta      2015-01-01
#3         2 2016-05-06 2016-06-23 alpha     2016-05-06
#4         2 2015-04-01 2015-05-30 beta      2016-05-06
#5         3 2015-07-01 2015-08-10 alpha     2015-07-01
#6         3 2015-01-06 2015-02-05 beta      2015-07-01
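
The indexing approach relies on each person having exactly one "alpha" row. If a group has no "alpha" row, or more than one, the subset is not length 1 and mutate will typically complain about the size. A minimal sketch of one way to guard against that, assuming you want the first match and NA when there is none (df2 and res are hypothetical names used only for illustration):

```r
library(dplyr)

# Hypothetical data: person 2 has no "alpha" row, person 3 has two.
df2 <- data.frame(
  person.id  = c(1, 1, 2, 3, 3),
  start.date = c("2015-01-01", "2015-01-05", "2015-04-01",
                 "2015-07-01", "2015-09-01"),
  procedure  = c("alpha", "beta", "beta", "alpha", "alpha"),
  stringsAsFactors = FALSE
)

res <- df2 %>%
  group_by(person.id) %>%
  # [1] keeps only the first match; indexing an empty vector with [1] yields NA
  mutate(alpha.start.date = start.date[procedure == "alpha"][1]) %>%
  ungroup()

print(res)
```

Whether "first match" is the right rule depends on your data. If the rows are not already ordered by date, you may want to sort them (or take the minimum) before picking one.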
