Reputation: 279
I have a df with two columns of dates. I want to create a numeric third column based on values in the two date columns. My actual dataframe has 100,000 rows.
Here are my conditions:
1) A can be any date, and B is any day in September 2019. Value in D is 1.
2) A is NA. B is NA. D is NA.
3) A is any day in September 2019. B is NA. D is 0.
4) A is any day in September 2019. B is any day in October 2019. D is 0.
5) Anything that does not fit in conditions 1-4, D = 2.
#What I have
A <- as.Date(c("2019-01-01", NA, "2019-09-01", "2019-09-02"))
B <- as.Date(c("2019-09-01", NA, NA, "2019-10-25"))
df <- data.frame(A,B)
#What I want
A <- as.Date(c("2019-01-01", NA, "2019-09-01", "2019-09-02"))
B <- as.Date(c("2019-09-01", NA, NA, "2019-10-25"))
D <- as.numeric(c(1, NA, 0, 0))
df2 <- data.frame(A,B,D)
Upvotes: 0
Views: 126
Reputation: 4284
You can use case_when
to avoid nested ifelse
statements and month
and year
functions from the lubridate
package
library(tidyverse)
library(lubridate)
df %>% mutate(D= case_when(!is.na(A) & month(B)==9 & year(B)==2019 ~ 1, # cond1
is.na(A) & is.na(B) ~ NA_real_, # cond 2
is.na(B) & month(A)==9 & year(A)==2019 ~ 0, # cond 3
month(A)==9 & year(A)==2019 & month(B)==10 & year(B)==2019 ~ 0, # cond 4
TRUE ~ 2 )) # all other cases
Upvotes: 3