Aryh

Reputation: 585

How to combine two DATES variables in R data frame

I have a data frame as follows:

Sample_ID<-c("a1","a2","a3","a4","a5","a6")
Heart_attack<-c("2010/04/13", "2008/07/30", "2009/03/06", "2008/08/22", "2009/06/24", "2008/08/26")
Stroke<-c("2007/05/17", "2009/05/16", "2007/05/16", "2007/05/16","2007/05/16", "2010/05/16")
DF<-data.frame(Sample_ID,Heart_attack,Stroke)

I need to create two columns in this data frame. The first, CVD_date, should hold whichever of the Heart_attack and Stroke dates occurred earlier. The second, CVD, should be 1 if the date in CVD_date is the Heart_attack date and 2 otherwise. For example, I am looking for the following output:

Sample_ID  Heart_attack  Stroke        CVD_date         CVD
 a1         2010/04/13   2007/05/17    2007/05/17        2
 a2         2008/07/30   2009/05/16    2008/07/30        1
 a3         2009/03/06   2007/05/16    2007/05/16        2

How can I do this in R?

Upvotes: 0

Views: 319

Answers (1)

Ronak Shah

Reputation: 388817

You can use pmin to get the earlier of the Heart_attack and Stroke dates for each row. For CVD, we compare the two dates, convert the logical result to an integer, and add 1. This gives 1 if the Heart_attack date is not later than the Stroke date and 2 otherwise.

library(dplyr)

DF %>%
  mutate(across(-1, lubridate::ymd), 
         CVD_date = pmin(Heart_attack, Stroke), 
         CVD = as.integer(Heart_attack > Stroke) + 1)

#  Sample_ID Heart_attack     Stroke   CVD_date CVD
#1        a1   2010-04-13 2007-05-17 2007-05-17   2
#2        a2   2008-07-30 2009-05-16 2008-07-30   1
#3        a3   2009-03-06 2007-05-16 2007-05-16   2
#4        a4   2008-08-22 2007-05-16 2007-05-16   2
#5        a5   2009-06-24 2007-05-16 2007-05-16   2
#6        a6   2008-08-26 2010-05-16 2008-08-26   1
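
To see why the CVD expression yields 1 or 2, here is a minimal sketch that applies the same steps to two pairs of dates (rows a1 and a2 from the question). The variable names heart, stroke, earliest, later and code are made up for illustration, and base R's as.Date is used in place of lubridate::ymd, which is an assumption about what is convenient rather than part of the answer above.

```r
# Two illustrative pairs of dates, taken from rows a1 and a2 of the question
heart  <- as.Date(c("2010/04/13", "2008/07/30"), format = "%Y/%m/%d")
stroke <- as.Date(c("2007/05/17", "2009/05/16"), format = "%Y/%m/%d")

# pmin() compares element by element and keeps the earlier date of each pair
earliest <- pmin(heart, stroke)

# TRUE where the heart attack happened after the stroke
later <- heart > stroke

# as.integer() maps FALSE/TRUE to 0/1, so adding 1 turns them into codes 1/2
code <- as.integer(later) + 1

print(earliest)
print(code)
```

Note that when both dates are equal the comparison is FALSE, so the code comes out as 1.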

In older versions of dplyr (before 1.0.0, which introduced across), you can use mutate_at instead:

DF %>%
  mutate_at(-1, lubridate::ymd) %>%
  mutate(CVD_date = pmin(Heart_attack, Stroke), 
         CVD = as.integer(Heart_attack > Stroke) + 1)
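
If neither dplyr nor lubridate is available, the same idea can probably be expressed in base R. The following is only a sketch under that assumption: it rebuilds the data frame from the question, parses the dates with as.Date, and uses ifelse for the 1/2 coding instead of the integer arithmetic shown above.

```r
# Rebuild the example data from the question
DF <- data.frame(
  Sample_ID    = c("a1", "a2", "a3", "a4", "a5", "a6"),
  Heart_attack = c("2010/04/13", "2008/07/30", "2009/03/06",
                   "2008/08/22", "2009/06/24", "2008/08/26"),
  Stroke       = c("2007/05/17", "2009/05/16", "2007/05/16",
                   "2007/05/16", "2007/05/16", "2010/05/16"),
  stringsAsFactors = FALSE
)

# Parse the character columns into Date objects
DF$Heart_attack <- as.Date(DF$Heart_attack, format = "%Y/%m/%d")
DF$Stroke       <- as.Date(DF$Stroke, format = "%Y/%m/%d")

# Row-wise earlier date of the two events
DF$CVD_date <- pmin(DF$Heart_attack, DF$Stroke)

# 1 if the heart attack is the earlier (or same-day) event, 2 otherwise
DF$CVD <- ifelse(DF$Heart_attack <= DF$Stroke, 1L, 2L)

print(DF)
```

As with the dplyr version, rows where either date is missing would yield NA in both new columns unless that case is handled separately.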

Upvotes: 1
