CodeMonkey

Reputation: 1

How to mutate new columns in R based on earliest and latest dates for other variables

In a dataset where each patient had multiple test administrations, with a score on each test date, I need to identify the earliest and latest test dates, then compute the difference between the scores on those two dates. I think I've identified the first and last dates with dplyr, creating new columns for them:

SplitDates <- SortedDates %>% 
  group_by(PatientID) %>% 
  mutate(EarliestTestDate = min(AdministrationDate), 
         LatestTestDate = max(AdministrationDate)) %>% 
  arrange(desc(PatientID))

The score column is TotalScore.

Now, how do I extract the scores from these two dates (for each patient) to create new columns for the earliest and latest scores? I haven't been able to figure out a mutate with case_when or if_else that pulls the score from the record with a given date.

Upvotes: 0

Views: 657

Answers (1)

GregOliveira

Reputation: 151

Have you tried using a join verb, such as left_join?

library(dplyr)

SplitDates <- SortedDates %>%
    group_by(PatientID) %>%
    mutate(EarliestTestDate = min(AdministrationDate),
        LatestTestDate = max(AdministrationDate)) %>%
    ungroup() %>%
    # pick up the score recorded on EarliestTestDate
    left_join(select(SortedDates, PatientID, AdministrationDate, EarliestScore = TotalScore),
        by = c("PatientID", "EarliestTestDate" = "AdministrationDate")) %>%
    # pick up the score recorded on LatestTestDate
    left_join(select(SortedDates, PatientID, AdministrationDate, LatestScore = TotalScore),
        by = c("PatientID", "LatestTestDate" = "AdministrationDate")) %>%
    arrange(desc(PatientID)) # now you can mutate the score difference you need

I suggest you take a look at the dplyr cheatsheet.
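If you would rather avoid the joins, here is a minimal sketch of another way to get the same columns: index TotalScore by the position of the earliest and latest date inside the grouped mutate. The toy data and the new column names (EarliestScore, LatestScore, ScoreDifference) are made up for illustration, and I'm assuming each patient has a single score per date; if a date repeats, which.min()/which.max() just take the first matching row.

```r
library(dplyr)

# Hypothetical toy data using the column names from the question
SortedDates <- tibble(
  PatientID = c(1, 1, 1, 2, 2),
  AdministrationDate = as.Date(c("2020-01-10", "2020-03-05", "2020-02-01",
                                 "2021-06-01", "2021-07-15")),
  TotalScore = c(10, 18, 14, 30, 25)
)

ScoreChange <- SortedDates %>%
  group_by(PatientID) %>%
  mutate(
    EarliestTestDate = min(AdministrationDate),
    LatestTestDate   = max(AdministrationDate),
    # which.min()/which.max() give the row position of the earliest/latest date
    EarliestScore    = TotalScore[which.min(AdministrationDate)],
    LatestScore      = TotalScore[which.max(AdministrationDate)],
    # latest minus earliest; flip the order if you want the opposite sign
    ScoreDifference  = LatestScore - EarliestScore
  ) %>%
  ungroup() %>%
  arrange(desc(PatientID))
```

This keeps one row per test administration, with the per-patient values repeated on each row. If you only want one row per patient, swapping mutate for summarise should work the same way.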

Upvotes: 0
