OTA
OTA

Reputation: 279

In a dataframe, create a numeric column based on two columns of dates

I have a df with two columns of dates. I want to create a numeric third column based on values in the two date columns. My actual dataframe has 100,000 rows.

Here are my conditions:
1) A can be any date, and B is any day in September 2019. Value in D is 1.
2) A is NA. B is NA. D is NA.
3) A is any day in September 2019. B is NA. D is 0.
4) A is any day in September 2019. B is any day in October 2019. D is 0.
5) Anything that does not fit in conditions 1-4, D = 2.
#What I have
A <- as.Date(c("2019-01-01", NA, "2019-09-01", "2019-09-02"))
B <- as.Date(c("2019-09-01", NA, NA, "2019-10-25"))
df <- data.frame(A,B)

#What I want
A <- as.Date(c("2019-01-01", NA, "2019-09-01", "2019-09-02"))
B <- as.Date(c("2019-09-01", NA, NA, "2019-10-25"))
D <- as.numeric(c(1, NA, 0, 0))
df2 <- data.frame(A,B,D)

Upvotes: 0

Views: 126

Answers (1)

fmarm
fmarm

Reputation: 4284

You can use case_when to avoid nested ifelsestatements and month and yearfunctions from the lubridatepackage

library(tidyverse)
library(lubridate)
df %>% mutate(D= case_when(!is.na(A) & month(B)==9 & year(B)==2019 ~ 1, # cond1
                           is.na(A) & is.na(B) ~ NA_real_, # cond 2
                           is.na(B) & month(A)==9 & year(A)==2019 ~ 0, # cond 3
                           month(A)==9 & year(A)==2019 & month(B)==10 & year(B)==2019 ~ 0, # cond 4
                           TRUE ~ 2 )) # all other cases 

Upvotes: 3

Related Questions