Reputation: 945
Assume a date is specified as three integers: year, month, day.
The year is a 4 digit integer (such as 2020), the month ranges over 1-12, the day over 1-31.
I'm looking for a simple function (call it checkdate) that can check whether a date is valid, returning TRUE if valid and FALSE if not valid.
For example, checkdate(2008, 2, 29) would return TRUE because 2008 was a leap year.
On the other hand checkdate(2009, 2, 29) would return FALSE because 2009 was not a leap year.
checkdate(2009, 6, 31) would return FALSE because June has only 30 days.
Etc.
UPDATE
Based on Dirk's answer, below, here is a function that does what I asked:
checkdate <- function(y, m, d) {
  # y: A year, not abbreviated to 2 digits.
  # m: An integer in the range 1-12.
  # d: An integer in the range 1-31.
  # Convert to an R Date object.
  # If the date is not valid, NA is returned.
  dt <- as.Date(paste(y, m, d, sep = '-'), optional = TRUE)
  # TRUE where the conversion succeeded, FALSE where it gave NA.
  !is.na(dt)
}
Upvotes: 0
Views: 1354
Reputation: 150
Here is a safer function:
checkdate <- function(y, m, d, min.year = NA, max.year = NA, recycle = TRUE) {
  if (!recycle) {
    y_length <- length(y)
    m_length <- length(m)
    d_length <- length(d)
    # `||` is the scalar operator expected inside if()
    if (y_length != m_length || d_length != m_length) {
      stop("The y, m and d vectors provided do not have the same length.")
    }
  }
  dates <- paste(y, m, d, sep = "-")
  # Accepts numbers and characters, but explicitly check conversion
  !is.na(as.numeric(y)) &
    # These 2 lines of code are optional but useful (with min.year=1900, reject "23" as we think "2023" is meant)
    (is.na(min.year) | as.numeric(y) >= min.year) &
    (is.na(max.year) | as.numeric(y) <= max.year) &
    !is.na(as.numeric(m)) & as.numeric(m) > 0 & as.numeric(m) < 13 &
    !is.na(as.numeric(d)) & as.numeric(d) > 0 & as.numeric(d) < 32 &
    !is.na(as.Date(dates,
                   format = "%Y-%m-%d",
                   optional = TRUE # indicating to return NA (instead of signalling an error)
    ))
}
# Possible uses:
checkdate(0:40, 0:40, 0:40)
checkdate(0:40, 0:40, 0:40, min.year = 2000)
checkdate("2023", 0:40, 0:40, min.year = 2000)
checkdate("2023", 0:40, 0:40, recycle = FALSE) # stops with an error: y, m and d differ in length
This is much safer than the other answers. It works with three vectors y, m, d (unlike Ronak's answer) and recycles them if needed; pass recycle = FALSE to require equal lengths instead. It accepts strings and numbers. It will not accept "10-11-12" or "10-11-123456789". Note that, a bit surprisingly:
as.Date(paste(10, 11, 2023, sep = '-'), format = "%Y-%m-%d")
[1] "10-11-20" # is "a valid date" (!)
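But this is not too surprising: as.Date() was not designed as a validation function but as a function for converting valid inputs. Here "10" is read as the year, "11" as the month, and %d takes only the first two digits of "2023" as the day; the remaining "23" is ignored. Validation needs more care.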
The min.year, max.year options and related lines of code are optional but are, in my view, useful in some contexts; they define a valid range for the year. With min.year=1900, we reject "23" as we think "2023" was meant.
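Since as.Date() is a converter rather than a validator, one alternative is to check the calendar rules directly, with no parsing at all. The following is only a minimal sketch of that idea, assuming the Gregorian leap-year rule applies to every year passed in; the names is_leap_year and checkdate_arith are made up for this illustration and do not come from any answer here.

```r
# Hypothetical helper: Gregorian leap-year rule.
is_leap_year <- function(y) (y %% 4 == 0 & y %% 100 != 0) | (y %% 400 == 0)

# Hypothetical validator that never calls as.Date().
checkdate_arith <- function(y, m, d) {
  # Recycle the three inputs to a common length.
  n <- max(length(y), length(m), length(d))
  y <- rep_len(as.numeric(y), n)
  m <- rep_len(as.numeric(m), n)
  d <- rep_len(as.numeric(d), n)

  # A usable month is a whole number from 1 to 12.
  month_ok <- !is.na(m) & m %in% 1:12

  # Days per month, looked up only where the month is usable.
  days <- rep(NA_real_, n)
  days[month_ok] <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)[m[month_ok]]

  # February gets 29 days in leap years.
  feb_leap <- month_ok & m == 2 & !is.na(y) & is_leap_year(y)
  days[feb_leap] <- 29

  !is.na(y) & y == floor(y) &
    month_ok &
    !is.na(d) & d == floor(d) & d >= 1 &
    !is.na(days) & d <= days
}

checkdate_arith(2008, 2, 29)    # leap year
checkdate_arith(2009, 6, 31)    # June has 30 days
checkdate_arith(10, 11, 2023)   # day out of range, no silent truncation
```

A range check on the year, like min.year and max.year above, could be added to the final expression in the same way.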
Upvotes: 0
Reputation: 5798
Base R using @Ronak Shah's logic:
checkdate <- function(y, m, d) {
  tryCatch(inherits(as.Date(paste(y, m, d, sep = '-')), "Date"),
           error = function(e) return(FALSE))
}
checkdate(2015, 12, 31)
Upvotes: 0
Reputation: 389175
Try to convert the inputs to a date; if the conversion fails, return FALSE.
checkdate <- function(y, m, d) {
  tryCatch(lubridate::is.Date(as.Date(paste(y, m, d, sep = '-'))),
           error = function(e) return(FALSE))
}
checkdate(2009, 6, 31)
#[1] FALSE
checkdate(2009, 2, 29)
#[1] FALSE
checkdate(2008, 2, 29)
#[1] TRUE
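The function above checks one date per call. To check several dates at once, one option is to apply the same scalar check to each (y, m, d) triple with base R's mapply(). This is a rough sketch only: checkdate_one and checkdate_each are hypothetical names, and inherits() stands in for lubridate::is.Date so that the block runs without extra packages.

```r
# Scalar check following the tryCatch() idea above (hypothetical name).
checkdate_one <- function(y, m, d) {
  tryCatch(
    inherits(as.Date(paste(y, m, d, sep = "-")), "Date"),
    error = function(e) FALSE
  )
}

# Apply the scalar check to each (y, m, d) triple; shorter inputs are recycled.
checkdate_each <- function(y, m, d) {
  mapply(checkdate_one, y, m, d, USE.NAMES = FALSE)
}

checkdate_each(c(2008, 2009, 2009), c(2, 2, 6), c(29, 29, 31))
#[1]  TRUE FALSE FALSE
```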
Upvotes: 3
Reputation: 368439
Sure. Just try to parse it:
R> days <- 28:31
R> dates <- paste0("2020-02-", days)
R> as.Date(dates)
[1] "2020-02-28" "2020-02-29" NA NA
R>
This shows that in 2020, Feb 28 and 29 existed (leap year) but not 30 and 31.
From your three vectors you could use sprintf("%4d-%02d-%02d", y, m, d)
to create a vector of text inputs to parse.
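Put together, the approach might look like the sketch below. The sample vectors are invented for illustration, the day is formatted with a complete %02d conversion, and the format is passed to as.Date() explicitly so that each element is parsed the same way.

```r
# Example inputs (made up for this illustration).
y <- c(2020L, 2020L, 2021L)
m <- c(2L, 2L, 2L)
d <- c(29L, 30L, 29L)

# Build zero-padded "YYYY-MM-DD" strings from the three integer vectors.
txt <- sprintf("%04d-%02d-%02d", y, m, d)

# Parse with an explicit format; invalid dates become NA.
valid <- !is.na(as.Date(txt, format = "%Y-%m-%d"))
valid
# [1]  TRUE FALSE FALSE
```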
Upvotes: 0