Demetria Flowers
Demetria Flowers

Reputation: 3

How to calculate time difference between different rows based off criteria/conditions in R

I have the following data frame and I am looking to create a variable that calculates the amount of days between the current ADMITDATE from the previous DISCHARGE DATE only for duplicates or when the IDs are equal

ID  ADMITDATE   DISCHARGE DATE  
1   5/24/2020   5/27/2020   
1   5/30/2020   6/15/2020   
2   3/9/2010    4/5/2020    
3   2/7/2020    3/7/2020    
4   5/15/2020   5/16/2020   
4   5/20/2020   5/25/2020   
4   5/30/2020   6/15/2020   
5   7/8/2020    7/20/2020   
6   1/20/2020   1/22/2020   
6   1/28/2020   2/5/2020    

So I want the output to look like below.

ID    ADMITDATE     DISCHARGE DATE  DAYDIFF

1   5/24/2020   5/27/2020   NULL

1   5/30/2020   6/15/2020   3

2   3/9/2010    4/5/2020    NULL

3   2/7/2020    3/7/2020    NULL

4   5/15/2020   5/16/2020   NULL

4   5/20/2020   5/25/2020   4

4   5/30/2020   6/15/2020   5

5   7/8/2020    7/20/2020   NULL

6   1/20/2020   1/22/2020   NULL

6   1/28/2020   2/5/2020    6

Any tips on how to write the code for this in R?

The code i currently have is

df$DAYDIFF <- [df$ID==lag(ID)] <- as.Date(as.character(test$ADMITDATE), format="%m/%d/%Y")-
  lag(as.Date(as.character(test$DISCHARGEDATE), format="%m/%d/%Y"))

Upvotes: 0

Views: 160

Answers (1)

stlba
stlba

Reputation: 767

Assuming your data is stored in a dataframe called df:

library(tidyverse)
library(lubridate)

df %>% 
     mutate(ADMITDATE = mdy(ADMITDATE),
            DISCHARGEDATE = mdy(DISCHARGEDATE)) %>% 
     group_by(ID) %>% 
     mutate(DAYDIFF = ADMITDATE - lag(DISCHARGEDATE, 1))

The mutate call converts your columns to date formats using lubridate to specify the format as month-day-year. We then group_by the ID values, before calculating the difference between the current ADMITDATEand the previous DISCHARGEDATE (using lag())

Upvotes: 1

Related Questions