Calculate lag string in group

Question

I have a toy electoral database and need to calculate incumbency, but I cannot work out how to do it using grouped values and dplyr::lag.

race <- data.frame(city=rep(1,6),
                   date=c(3,3,2,2,1,1),
                   candidate=c("A","B","A","C","D","E"),
                   winner=rep(c(1,0),3))

I made a convoluted attempt that is not ideal (as I have to merge the non-winners back in):

race %>%
  group_by(city,date) %>% 
  mutate(win_candidate=candidate[winner==1]) %>% 
  filter(winner==1) %>% 
  ungroup() %>%
  group_by(city) %>% 
  mutate(incumbent=lead(win_candidate, n=1, default = NA_character_),
         incumbent=ifelse(candidate==incumbent,1,0)) %>%
  select(-win_candidate)

DaveArmstrong · Accepted Answer

How about this:

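library(dplyr)

r <- race %>%
  group_by(city, date) %>%
  # one row per election, holding the name of its winner
  summarise(win_candidate = candidate[which(winner == 1)]) %>%
  ungroup() %>%
  group_by(city) %>%
  arrange(date) %>%
  # within each city, the winner of the previous election
  mutate(prev_win_candidate = lag(win_candidate)) %>%
  # attach the previous winner to every candidate row
  left_join(race, ., by = c("city", "date")) %>%
  mutate(incumbent = as.numeric(candidate == prev_win_candidate),
         # no previous election means no incumbent
         incumbent = case_when(
           is.na(incumbent) ~ 0,
           TRUE ~ incumbent)) %>%
  select(-c(win_candidate, prev_win_candidate))
r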
  
#   city date candidate winner incumbent
# 1    1    3         A      1         1
# 2    1    3         B      0         0
# 3    1    2         A      1         0
# 4    1    2         C      0         0
# 5    1    1         D      1         0
# 6    1    1         E      0         0
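
For comparison, here is a slightly shorter sketch of the same idea, not a drop-in replacement. It assumes there is exactly one winner per city and date, and the names `winners`, `prev_winner` and `result` are made up for illustration. It filters to the winners first, lags the winner's name within each city, and joins that back onto every candidate row:

```r
library(dplyr)

race <- data.frame(city = rep(1, 6),
                   date = c(3, 3, 2, 2, 1, 1),
                   candidate = c("A", "B", "A", "C", "D", "E"),
                   winner = rep(c(1, 0), 3))

# One row per election: the winner of the previous election in the same city
# (assumes exactly one winner per city and date)
winners <- race %>%
  filter(winner == 1) %>%
  group_by(city) %>%
  arrange(date, .by_group = TRUE) %>%
  mutate(prev_winner = lag(candidate)) %>%
  ungroup() %>%
  select(city, date, prev_winner)

# Attach the previous winner to every candidate and flag matches;
# the first election in a city has no previous winner, so it gets 0
result <- race %>%
  left_join(winners, by = c("city", "date")) %>%
  mutate(incumbent = as.numeric(!is.na(prev_winner) & candidate == prev_winner)) %>%
  select(-prev_winner)

result
```

Testing `!is.na(prev_winner)` inside the comparison avoids the separate `case_when` step, because `FALSE & NA` evaluates to `FALSE` in R.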
