Reputation: 375
I am trying to add a binary column indicator when one columns (dt_3) date is between the dates in two other columns (dt_1 and dt_2). I have this working on a small sample of my data however on my larger set the date column (dt_3) I am looking to compare to the others has a lot of NA's. Which is throwing the error Error: Expecting a single value:
. What is the best way to only check if non-NA values are between the two columns.
Here is an example of my data:
dt_1 dt_2 dt_3
2019-7-10 2019-8-21 2020-2-01
2019-8-22 2019-10-11 2019-9-01
2019-2-09 2019-3-02 NA
My current code:
dates %>%
mutate(between = ifelse(between(dt_3, dt_1, dt_2), 1, 0))
Expected Output:
dt_1 dt_2 dt_3 between
2019-7-10 2019-8-21 2020-2-01 0
2019-8-22 2019-10-11 2019-9-01 1
2019-2-09 2019-3-02 NA 0
Upvotes: 1
Views: 428
Reputation: 887153
Instead of the the between
an option is the comparison operators (>=
, <=
) and then replace the NA
with 0
library(dplyr)
library(lubridate)
library(tidyr)
dates %>%
mutate(across(everything(), ymd)) %>%
mutate(between = mutate(between = replace_na(dt_3 >= dt_1 & dt_3 <= dt_2, 0))
With between
, the left
and right
are not vectorized i.e it takes a single value. One option is rowwise
dates %>%
mutate(across(everything(), ymd)) %>%
rowwise %>%
mutate(between = replace_na(between(dt_3, dt_1, dt_2), 0))
# A tibble: 3 x 4
# Rowwise:
# dt_1 dt_2 dt_3 between
# <date> <date> <date> <dbl>
#1 2019-07-10 2019-08-21 2020-02-01 0
#2 2019-08-22 2019-10-11 2019-09-01 1
#3 2019-02-09 2019-03-02 NA 0
dates <- structure(list(dt_1 = c("2019-7-10", "2019-8-22", "2019-2-09"
), dt_2 = c("2019-8-21", "2019-10-11", "2019-3-02"), dt_3 = c("2020-2-01",
"2019-9-01", NA)), class = "data.frame", row.names = c(NA, -3L
))
Upvotes: 1