Reputation: 5
I have some hospital data that looks like this:
patient_id | treatment_1 | treatment_2 | date_dummy |
---|---|---|---|
3 | 2012-01-04 | 2012-03-27 | 0 |
3 | 2012-07-11 | 2012-10-20 | 0 |
3 | 2013-04-04 | 2013-06-22 | 0 |
12 | 2012-12-09 | 2013-11-09 | 0 |
18 | 2012-02-25 | 2012-03-26 | 0 |
25 | 2012-10-06 | 2013-12-29 | 1 |
25 | 2013-04-06 | 2013-07-07 | 0 |
I need to re-create the date_dummy variable, which equals 1 if the patient was treated again between the two treatment dates and 0 otherwise. Patient 25 is the best example of this.
If anyone knows a command to do this using the dplyr package in R, that would be awesome. Thanks for any help.
Upvotes: 0
Views: 1302
Reputation: 388862
Building upon @Rex Parsons's answer, you can do:
library(dplyr)
library(lubridate)
library(purrr)
df %>%
  mutate(across(starts_with('treatment'), as.Date),
         interval = interval(treatment_1, treatment_2)) %>%
  group_by(patient_id) %>%
  mutate(date_dummy = map_int(row_number(),
           ~as.integer(any(interval[-.x] %within% interval[.x])))) %>%
  ungroup()
# patient_id treatment_1 treatment_2 date_dummy interval
# <int> <date> <date> <int> <Interval>
#1 3 2012-01-04 2012-03-27 0 2012-01-04 UTC--2012-03-27 UTC
#2 3 2012-07-11 2012-10-20 0 2012-07-11 UTC--2012-10-20 UTC
#3 3 2013-04-04 2013-06-22 0 2013-04-04 UTC--2013-06-22 UTC
#4 12 2012-12-09 2013-11-09 0 2012-12-09 UTC--2013-11-09 UTC
#5 18 2012-02-25 2012-03-26 0 2012-02-25 UTC--2012-03-26 UTC
#6 25 2012-10-06 2013-12-29 1 2012-10-06 UTC--2013-12-29 UTC
#7 25 2013-04-06 2013-07-07 0 2013-04-06 UTC--2013-07-07 UTC
You may want to remove the interval column from the final output if you don't need it.
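If you would rather not carry an interval column at all, the same check can be sketched with plain date comparisons. This is only an illustrative variant, not the code that produced the output above: the `df` below is rebuilt by hand from the table in the question (using 2012-07-11 for the second row, as in the output above), and it assumes treatment_1 is never later than treatment_2.

```r
library(dplyr)
library(purrr)

# Sample data re-typed from the question (assumption: row 2 starts on 2012-07-11)
df <- tibble(
  patient_id  = c(3L, 3L, 3L, 12L, 18L, 25L, 25L),
  treatment_1 = c("2012-01-04", "2012-07-11", "2013-04-04", "2012-12-09",
                  "2012-02-25", "2012-10-06", "2013-04-06"),
  treatment_2 = c("2012-03-27", "2012-10-20", "2013-06-22", "2013-11-09",
                  "2012-03-26", "2013-12-29", "2013-07-07")
)

result <- df %>%
  mutate(across(starts_with("treatment"), as.Date)) %>%
  group_by(patient_id) %>%
  # For row i, look at the patient's other rows and flag it when any of them
  # starts on or after treatment_1[i] and ends on or before treatment_2[i]
  mutate(date_dummy = map_int(row_number(), function(i) {
    as.integer(any(treatment_1[-i] >= treatment_1[i] &
                   treatment_2[-i] <= treatment_2[i]))
  })) %>%
  ungroup()

result
```

Patients with a single row get 0, because there are no other rows to compare against and `any()` of an empty vector is FALSE.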
Upvotes: 0
Reputation: 339
To check whether a date is within the range of two other dates, you can use:
library(lubridate)
x %within% interval(ymd(20161001), ymd(20170930))
This checks whether x is between October 1st, 2016 and September 30th, 2017.
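As a quick illustration, here is a small sketch with a few made-up values for x (the dates in x are hypothetical; only the interval endpoints come from the line above). It also shows that both endpoints count as inside the interval.

```r
library(lubridate)

# Hypothetical test dates: one day before the start, the start itself,
# a date in the middle, and one day after the end
x <- ymd(c("2016-09-30", "2016-10-01", "2017-06-15", "2017-10-01"))

res <- x %within% interval(ymd(20161001), ymd(20170930))
res
# FALSE  TRUE  TRUE FALSE
```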
I'm not sure what the 'treated again' date column is called in your data, but something like this may work:
library(dplyr)
data %>%
  mutate(date_dummy = ifelse(treated_again_date %within% interval(treatment_1, treatment_2), 1, 0))
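For a runnable version of that idea, here is a minimal sketch on a toy data frame. The `treated_again_date` column and its values are invented for illustration, since the question does not show such a column.

```r
library(dplyr)
library(lubridate)

# Toy data: treated_again_date is a hypothetical column, not part of the question's data
data <- tibble(
  treatment_1        = as.Date(c("2012-10-06", "2013-04-06")),
  treatment_2        = as.Date(c("2013-12-29", "2013-07-07")),
  treated_again_date = as.Date(c("2013-04-06", "2012-10-06"))
)

out <- data %>%
  # %within% is vectorised, so each date is tested against its own row's interval
  mutate(date_dummy = ifelse(
    treated_again_date %within% interval(treatment_1, treatment_2), 1, 0))

out$date_dummy
# 1 0
```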
Upvotes: 2