Kreitz Gigs
Kreitz Gigs

Reputation: 379

Code change in status over time in R dplyr

structure(list(ID = c(16537L, 16537L, 16537L, 16537L, 16537L, 
1248568L, 1248568L, 1472316L, 1472316L, 1472316L, 1472316L, 1530500L, 
1530500L, 1530500L, 1804386L, 1804386L), DATE = structure(c(18053, 
18237, 18308, 18470, 18686, 18585, 18683, 18586, 18619, 18647, 
18701, 18611, 18702, 18737, 18709, 18676), class = "Date"), status = structure(c(1L, 
1L, 1L, 1L, 3L, 1L, 3L, 1L, 1L, 1L, 3L, 1L, 3L, 1L, 1L, 3L), .Label = c("N", 
"OH", "SHr", "SHo"), class = "factor"), count = c(5L, 
5L, 5L, 5L, 5L, 2L, 2L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 2L, 2L)), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -16L), groups = structure(list(
    ID = c(16537L, 1248568L, 1472316L, 1530500L, 1804386L), .rows = structure(list(
        1:5, 6:7, 8:11, 12:14, 15:16), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -5L), .drop = TRUE))

I am trying to create a categorical variable indicating whether an individual changes status over time. I have used mutate with any() but this doesn't capture the process of change. For example, I would like the new column ("status_change") to include the category "N to SHr" for ID == "16537", "N to SHr to N" for ID == "1530500", and "SHr to N" for ID == "1804386".

Upvotes: 0

Views: 57

Answers (1)

Jon Spring
Jon Spring

Reputation: 66775

I'd approach this by first summarizing each ID, limiting it to the changing rows and pasting the various values together, then join to the original full data.

library(dplyr)
df %>%   # note, data in OP already grouped by ID
  left_join(
    df %>%
      #group_by(ID) %>%  # need this if data not grouped by ID already
      filter(status != lag(status, default = "")) %>%
      summarize(status_change = paste(status, collapse = " to "))
  )

Result

Joining, by = "ID"
# A tibble: 16 x 5
# Groups:   ID [5]
        ID DATE       status count status_change
     <int> <date>     <fct>  <int> <chr>        
 1   16537 2019-06-06 N          5 N to SHr     
 2   16537 2019-12-07 N          5 N to SHr     
 3   16537 2020-02-16 N          5 N to SHr     
 4   16537 2020-07-27 N          5 N to SHr     
 5   16537 2021-02-28 SHr        5 N to SHr     
 6 1248568 2020-11-19 N          2 N to SHr     
 7 1248568 2021-02-25 SHr        2 N to SHr     
 8 1472316 2020-11-20 N          4 N to SHr     
 9 1472316 2020-12-23 N          4 N to SHr     
10 1472316 2021-01-20 N          4 N to SHr     
11 1472316 2021-03-15 SHr        4 N to SHr     
12 1530500 2020-12-15 N          3 N to SHr to N
13 1530500 2021-03-16 SHr        3 N to SHr to N
14 1530500 2021-04-20 N          3 N to SHr to N
15 1804386 2021-03-23 N          2 N to SHr     
16 1804386 2021-02-18 SHr        2 N to SHr     

Upvotes: 1

Related Questions