Anneke
Reputation: 73

R - Creating a new variable based on multiple observations

My dataset represents patients who have been treated multiple times. The dataset is in long format, with one row per treatment. Each patient gets treatment A, C, or S, or a combination of these. A and C are never combined.

Simply put, the data looks something like this:

df <- tibble(PatientID = c(1,1,1,2,2,3,3,3,3,4,4,5,5,5,6,6),
             treatment = c("A", "A", "S", "C", "S", "S", "C", "C", NA, "C", NA, NA, "S", "A", "S", NA))

I would like to create a new variable based on whether a patient ever had treatment A, ever had treatment C, or had neither, so that the end result looks something like this:

df <- tibble(PatientID = c(1,1,1,2,2,3,3,3,3,4,4,5,5,5,6,6),
             treatment = c("A", "A", "S", "C", "S", "S", "C", "C", NA, "C", NA, NA, "S", "A", "S", NA),
             group = c("A", "A", "A", "C", "C", "C", "C", "C", "C", "C", "C", "A", "A", "A", "S", "S"))

How can I best approach this? I'm struggling with how to deal with multiple observations per ID.

Thank you!

Upvotes: 1

Views: 66

Answers (1)

Noah
Reputation: 440

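You can use group_by() together with mutate() and case_when() to achieve this. Within each patient group, "A" %in% treatment checks whether any of that patient's rows is "A", and the single result is recycled to every row of the group: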

library(tidyverse)

df <- tibble(PatientID = c(1,1,1,2,2,3,3,3,3,4,4,5,5,5,6,6),
             treatment = c("A", "A", "S", "C", "S", "S", "C", "C", NA, "C", NA, NA, "S", "A", "S", NA))

df %>%
  group_by(PatientID) %>%
  mutate(groups = case_when("A" %in% treatment ~ "A",
                            "C" %in% treatment ~ "C",
                            TRUE ~ "S"))
#> # A tibble: 16 × 3
#> # Groups:   PatientID [6]
#>    PatientID treatment groups
#>        <dbl> <chr>     <chr> 
#>  1         1 A         A     
#>  2         1 A         A     
#>  3         1 S         A     
#>  4         2 C         C     
#>  5         2 S         C     
#>  6         3 S         C     
#>  7         3 C         C     
#>  8         3 C         C     
#>  9         3 <NA>      C     
#> 10         4 C         C     
#> 11         4 <NA>      C     
#> 12         5 <NA>      A     
#> 13         5 S         A     
#> 14         5 A         A     
#> 15         6 S         S     
#> 16         6 <NA>      S

Created on 2022-08-18 with reprex v2.0.2
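
One thing to be aware of: the TRUE ~ "S" fallback also labels a patient whose treatments are all NA as "S", and the result stays grouped by PatientID. If that matters for your data, a variation along these lines might help. It is only a sketch under assumptions: patient 7 (all NA) is a hypothetical row I added for illustration, and returning NA for such patients is my guess at what you would want. Note that %in% never returns NA, whereas any(treatment == "A") can, which is why %in% is the safer test here.

```r
library(dplyr)

# Same data as above, plus a hypothetical patient 7 whose only record is NA
df <- tibble(PatientID = c(1,1,1,2,2,3,3,3,3,4,4,5,5,5,6,6,7),
             treatment = c("A", "A", "S", "C", "S", "S", "C", "C", NA,
                           "C", NA, NA, "S", "A", "S", NA, NA))

result <- df %>%
  group_by(PatientID) %>%
  # Each condition is a single TRUE/FALSE per patient; mutate() recycles
  # the chosen label to all rows of that patient
  mutate(groups = case_when("A" %in% treatment ~ "A",
                            "C" %in% treatment ~ "C",
                            "S" %in% treatment ~ "S",
                            TRUE ~ NA_character_)) %>%  # no recorded treatment at all
  ungroup()  # drop the grouping so later steps operate on the whole table
```

Patients 1 to 6 get the same labels as before; only the hypothetical patient 7 ends up with NA instead of "S".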

Upvotes: 0
