Reputation: 1961
I have some doubts about leap years. How can I be sure that, by using a function like this,
add.years= function(x,y){
if(!isTRUE(all.equal(y,round(y)))) stop("Argument \"y\" must be an integer.\n")
x <- as.POSIXlt(x)
x$year <- x$year+y
as.Date(x)
}
it will take leap years into account when adding, for example, 100 years to my observation dataset? How can I control this?
I have a time series dataset with 50 years of observations:
date obs
1995-01-01 1.0
1995-01-02 2.0
1995-01-03 2.5
...
2045-12-30 0.2
2045-12-31 0.1
The dataset shifted by 100 years:
date obs
2095-01-01 1.0
2095-01-02 2.0
2095-01-03 2.5
...
2145-12-30 0.2
2145-12-31 0.1
After a basic check, I noticed that the number of rows is the same for the original dataset and the one shifted by 100 years. I am not sure whether an observation that fell on 29 February in a leap year now becomes the obs value for 1 March in a non-leap year, and so on.
I can check leap years with the leap.year function from the chron library. However, I would like to know if there is a simpler way to make sure that rows for past 29 February dates that do not exist 100 years later are deleted, and that new 29 February dates are added with NA values.
Upvotes: 20
Views: 14450
Reputation: 219
I tried these three approaches on webR and got the following results:
(as.Date("2020-02-29") + lubridate::years(1)) # NA
stats::update(as.Date("2020-02-29"), year = lubridate::year(as.Date("2020-02-29")) + 1) # "2021-03-01"
lubridate::`%m+%`(as.Date("2020-02-29"), lubridate::years(1)) # "2021-02-28"
Pick whichever behavior suits you.
For the %m+% and %m-% operators, see the lubridate documentation or ops-m+.r in the source code.
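The roll-back result shown above for %m+% can also be approximated in base R. This is only a minimal sketch: the helper name add_years_rollback is made up for illustration, it is not how lubridate implements the operator, and it does not handle NA input.

```r
# Hypothetical helper: add n years and roll back to the last valid day of the
# month when the resulting date does not exist (e.g. 29 February).
add_years_rollback <- function(x, n) {
  lt <- as.POSIXlt(as.Date(x))
  day <- lt$mday                      # remember the original day of month
  lt$year <- lt$year + n
  out <- as.Date(lt)                  # an impossible date spills into the next month
  # A changed day of month means the date spilled over.
  spilled <- as.POSIXlt(out)$mday != day
  # Subtracting the new day of month lands on the last day of the intended month.
  out[spilled] <- out[spilled] - as.POSIXlt(out[spilled])$mday
  out
}

add_years_rollback("2000-02-29", 1)
add_years_rollback("2000-02-29", 4)
```

The first call should land on 28 February 2001 and the second should keep 29 February, since 2004 is a leap year.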
Upvotes: 0
Reputation: 6363
Following the suggestion of DarkDust and Dirk Eddelbuettel, you can easily roll your own leap_year function:
leap_year <- function(year) {
# the comparison already yields a logical vector, so no ifelse() is needed
(year %% 4 == 0 & year %% 100 != 0) | year %% 400 == 0
}
and apply it to vector data:
years = 2000:2050
years[leap_year(years)]
[1] 2000 2004 2008 2012 2016 2020 2024 2028 2032 2036 2040 2044 2048
Upvotes: 5
Reputation: 66844
Your suspicions are indeed correct:
x <- as.POSIXlt("2000-02-29")
y <- x
y$year <- y$year+100
y
#[1] "2100-03-01"
The strange thing is that the other components of y remain unchanged, so you can't use them for comparison:
y$mday
#[1] 29
y$mon
#[1] 1
But you can use strftime:
strftime(x,"%d")
#[1] "29"
strftime(y,"%d")
#[1] "01"
So how about:
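add.years <- function(x,y){
if(!isTRUE(all.equal(y,round(y)))) stop("Argument \"y\" must be an integer.\n")
x.out <- as.POSIXlt(x)
x.out$year <- x.out$year+y
# ifelse() would drop the Date class and return plain numbers,
# so convert first and then blank out the dates whose day changed
out <- as.Date(x.out)
out[strftime(x,"%d")!=strftime(out,"%d")] <- NA
out
}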
You can then subset your data using [ and is.na to get rid of the otherwise duplicate 1st March dates. Though as these dates seem to be consecutive, you might want to consider a solution that uses seq.Date and avoids dropping data.
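A minimal sketch of the seq.Date idea, assuming a data frame with the columns date and obs from the question. The helper name shift_years and the choice to match observations by month and day are assumptions for illustration only. It relies on as.Date() with an explicit format returning NA for impossible dates such as "2100-02-29".

```r
# Hypothetical helper: move a daily series by n years onto a complete target
# calendar, matching observations by month and day.
shift_years <- function(df, n) {
  d <- as.Date(df$date)
  # Write the shifted dates as text; some (e.g. "2100-02-29") are not real dates.
  key <- sprintf("%04d-%s", as.integer(as.integer(format(d, "%Y")) + n),
                 format(d, "%m-%d"))
  shifted <- as.Date(key, format = "%Y-%m-%d")  # NA for impossible dates
  ok <- !is.na(shifted)                         # drops old 29 Februaries
  # Full run of target days, so any new 29 February appears exactly once.
  target <- seq(min(shifted[ok]), max(shifted[ok]), by = "day")
  # Target days without a source observation get NA.
  data.frame(date = target, obs = df$obs[ok][match(target, shifted[ok])])
}

df <- data.frame(date = as.Date("2000-02-27") + 0:4, obs = 1:5)
shift_years(df, 100)
```

In this toy example the 29 February 2000 row has no counterpart in 2100 and is dropped. Shifting in the other direction (from 2100 back to 2000) would instead add a 29 February row with NA.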
Upvotes: 1
Reputation: 121127
You can check if a year is a leap year with leap_year from lubridate.
library(lubridate)
years <- 1895:2005
years[leap_year(years)]
The package also avoids generating impossible 29 February dates.
ymd("2000-2-29") + years(1) # NA
ymd("2000-2-29") %m+% years(1) # "2001-02-28"
The %m+% "add months" operator, as mentioned by @VitoshKa, rolls the date back to the end of the previous month if the actual day doesn't exist.
Upvotes: 19
Reputation: 92384
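A year is a leap year if it is divisible by 4, unless it is also divisible by 100, in which case it must be divisible by 400 as well.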
That is why 2000 was a leap year (although it's divisible by 100, it's also divisible by 400).
But generally, if you have a library that can take care of date/time calculations, then use it. These calculations are complicated and easy to get wrong, especially when ancient dates (calendar reforms) and time zones are involved.
Upvotes: 4