Andrii

Reputation: 3043

How to check that a date exists without breaking a data pipeline in R?

I have a data pipeline in R and I am stuck on an error raised when converting a string into a date that does not exist:

s_date <- "2021-02-29"    
as.Date(s_date, origin = "1970-01-01")

This code fails with the error "Error in charToDate(x) : character string is not in a standard unambiguous format".

Here is more detailed code from the pipeline:

# Fixed month and day
date_month <- "2"
date_day <- "29"

# Loop by years
for(date_year in c("2020", "2021")) {

   s_date <- paste0(date_year, "-", date_month, "-", date_day)

   # BUG: fails for 2021-2-29, which does not exist
   date_selected <- as.Date(s_date, origin = "1970-01-01")

}

How can I handle this error without breaking the pipeline, for example by substituting the next valid date, "2021-03-01"?

Thanks!

Upvotes: 0

Views: 66

Answers (2)

jay.sf

Reputation: 73397

You may want to try ISOdate, which yields NA rather than raising an error for a date that does not exist.

as.Date(do.call(ISOdate, as.list(el(strsplit("2020-02-29" , "-")))))
# [1] "2020-02-29"
as.Date(do.call(ISOdate, as.list(el(strsplit("2021-02-29" , "-")))))
# [1] NA

Your loop could look something like this:

date_month <- "2"
date_day <- "29"
date_year <- c("2020", "2021")
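r <- as.Date(character(0))  # empty Date vector to collect the results

for (i in seq_along(date_year)) {
  # ISOdate() returns NA for an impossible date such as 2021-02-29
  d <- as.Date(ISOdate(date_year[i], date_month, date_day))
  if (is.na(d)) {
    # Fall back to the first day of the next month
    # (note: this simple increment does not handle December)
    d <- as.Date(ISOdate(date_year[i], as.numeric(date_month) + 1, 1))
  }
  r[i] <- d
}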
r
# [1] "2020-02-29" "2021-03-01"
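
The loop above can also be written without explicit iteration. The following is only a sketch of that idea, not part of the original answer: `safe_date` is a hypothetical helper name, and it assumes the desired fallback is always the first day of the following month. Unlike the simple `+ 1` increment, it rolls December over into January of the next year.

```r
# Hypothetical helper (an assumption, not from the answer above): builds dates
# from year/month/day parts and replaces impossible dates with the first day
# of the following month.
safe_date <- function(year, month, day) {
  # Recycle all inputs to a common length so indexing below stays aligned
  n     <- max(length(year), length(month), length(day))
  year  <- rep_len(as.integer(year), n)
  month <- rep_len(as.integer(month), n)
  day   <- rep_len(as.integer(day), n)

  # ISOdate() returns NA instead of raising an error for impossible dates
  d <- as.Date(ISOdate(year, month, day))

  bad <- is.na(d)
  if (any(bad)) {
    # First day of the next month, rolling December into January
    next_year  <- ifelse(month == 12L, year + 1L, year)
    next_month <- ifelse(month == 12L, 1L, month + 1L)
    d[bad] <- as.Date(ISOdate(next_year[bad], next_month[bad], 1L))
  }
  d
}

safe_date(c("2020", "2021"), "2", "29")
# [1] "2020-02-29" "2021-03-01"
```

Inputs that are invalid for other reasons (for example, a month of 13) stay NA, so the pipeline can still detect them afterwards with is.na().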

Upvotes: 2

Rui Barradas

Reputation: 76615

This solution uses the lubridate package. Inside the for loop, the code first checks whether the date is valid and, if it is not, calls the auxiliary function nextDate to get the next valid date.

nextDate <- function(y, m, d){
  y <- as.integer(y)
  m <- as.integer(m)
  d <- as.integer(d)
  days <- lubridate::days_in_month(paste(y, m, 1, sep = "-"))
  as.Date(paste(y, m, min(days, d), sep = "-")) + 1L
}

date_month <- "2"
date_day <- "29"

# Loop by years
for(date_year in c("2020", "2021")) {
  s_date <- paste0(date_year, "-", date_month, "-", date_day)
  # Retry until s_date converts to a valid date
  repeat{
    date_selected <- tryCatch(as.Date(s_date, origin = "1970-01-01"),
                              error = function(e) e
    )
    if(inherits(date_selected, "error")){
      s_date <- nextDate(date_year, date_month, date_day)
    }else break
  }
}

See the result.

s_date
#[1] "2021-03-01"
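
For comparison, the same check-then-fall-back idea can be sketched in base R without lubridate or tryCatch. This is a sketch under stated assumptions rather than a replacement for the answer above: it relies on as.Date() returning NA (instead of raising an error) when a `format` is supplied explicitly, and `first_of_next_month` is a hypothetical helper name introduced here for illustration.

```r
# Hypothetical helper (an assumption, not from the answer above): returns the
# first day of the month following the given year and month.
first_of_next_month <- function(year, month) {
  first <- as.Date(paste(year, month, 1, sep = "-"), format = "%Y-%m-%d")
  # seq() with by = "month" also handles the December-to-January rollover
  seq(first, by = "month", length.out = 2)[2]
}

date_month <- "2"
date_day   <- "29"
years      <- c("2020", "2021")

# Preallocate a Date vector for the results
results <- as.Date(rep(NA, length(years)))

for (i in seq_along(years)) {
  s_date <- paste(years[i], date_month, date_day, sep = "-")

  # With an explicit format, an impossible date yields NA rather than an error
  d <- as.Date(s_date, format = "%Y-%m-%d")

  if (is.na(d)) {
    d <- first_of_next_month(years[i], date_month)
  }
  results[i] <- d
}

format(results)
# [1] "2020-02-29" "2021-03-01"
```

Collecting the dates in a vector keeps the result for every year, whereas checking only s_date after the loop shows the last iteration.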

Upvotes: 2
