Gago-Silva

Reputation: 1961

How to account for leap years?

I have some doubts about leap years. How can I be sure that, by using a function like this,

add.years <- function(x, y) {
  if (!isTRUE(all.equal(y, round(y)))) stop("Argument \"y\" must be an integer.\n")
  x <- as.POSIXlt(x)
  x$year <- x$year + y
  as.Date(x)
}

it will take leap years into account when adding, for example, 100 years to my observation dataset? How can I control this?

I have a time series dataset with 50 years of observations:

   date    obs
1995-01-01 1.0
1995-01-02 2.0
1995-01-03 2.5
...
2045-12-30 0.2
2045-12-31 0.1

dataset+100 years

   date    obs
2095-01-01 1.0
2095-01-02 2.0
2095-01-03 2.5
...
2145-12-30 0.2
2145-12-31 0.1

After a basic check, I noticed that the number of rows is the same in both the original dataset and the one shifted by 100 years. I am not sure whether a value that was previously recorded on 29 February in a leap year will now be the obs value for 1 March in a non-leap year, etc.

I can check for leap years with the leap.year function from the chron library. However, I would like to know if there is a simpler way to make sure that rows for past 29 February dates that do not exist 100 years later are deleted, and that new 29 February dates are added with NA values.

Upvotes: 20

Views: 14450

Answers (5)

ypa y yhm

Reputation: 219

I tried these three approaches on webR and got the following results:

(as.Date("2020-02-29") + lubridate::years(1)) # NA
stats::update(as.Date("2020-02-29"), year = lubridate::year(as.Date("2020-02-29")) + 1) # "2021-03-01"
lubridate::`%m+%`(as.Date("2020-02-29"), lubridate::years(1)) # "2021-02-28"

You can choose whichever one you like.

For details on the %m+% and %m-% operators, see the documentation or ops-m+.r in the source code.

Upvotes: 0

Adam Erickson

Reputation: 6363

Following the suggestion of DarkDust and Dirk Eddelbuettel, you can easily roll your own leap_year function:

leap_year <- function(year) {
  # The comparison is already a logical vector, so no ifelse() is needed
  (year %% 4 == 0 & year %% 100 != 0) | year %% 400 == 0
}

and apply it to vector data:

years = 2000:2050
years[leap_year(years)]

[1] 2000 2004 2008 2012 2016 2020 2024 2028 2032 2036 2040 2044 2048
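
To connect this to the question, here is a minimal sketch that uses the same rule to find which years change leap status when shifted. The names `shift` and `changed` are only illustrative, and the year range is taken from the question's dataset.

```r
# Same divisibility rule as above, repeated so the snippet runs on its own
leap_year <- function(year) {
  (year %% 4 == 0 & year %% 100 != 0) | year %% 400 == 0
}

years <- 1995:2045   # span of the question's observations
shift <- 100         # number of years to add

# Years whose leap status differs between the original and shifted calendar
changed <- years[leap_year(years) != leap_year(years + shift)]
changed

# Number of 29 February dates before and after the shift
sum(leap_year(years))
sum(leap_year(years + shift))
```

Any year listed in `changed` marks a place where a 29 February row has to be dropped or added, so a shifted daily series should not have the same number of rows as the original.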

Upvotes: 5

James

Reputation: 66844

Your suspicions are indeed correct:

x <- as.POSIXlt("2000-02-29")
y <- x
y$year <- y$year+100
y
#[1] "2100-03-01"

The strange thing is that other parts of y remain unchanged so you can't use these for comparison:

y$mday
#[1] 29
y$mon
#[1] 1

But you can use strftime:

strftime(x,"%d")
#[1] "29"
strftime(y,"%d")
#[1] "01"

So how about:

add.years <- function(x,y){
   if(!isTRUE(all.equal(y,round(y)))) stop("Argument \"y\" must be an integer.\n")
   x.out <- as.POSIXlt(x)
   x.out$year <- x.out$year+y
   out <- as.Date(x.out)
   # ifelse() would drop the Date class, so set NA by index instead
   out[which(strftime(x,"%d")!=format(out,"%d"))] <- NA
   out
   }

You can then subset your data using [ and is.na to get rid of the otherwise duplicate 1st March dates. Though as these dates seem to be consecutive, you might want to consider a solution that uses seq.Date and avoid dropping data.
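
One way the seq.Date idea could look, as a rough base R sketch rather than a definitive implementation. The helpers `shift_years` and `shift_dataset` are hypothetical names, and the data frame is assumed to have the `date` and `obs` columns from the question.

```r
# Shift Date values by whole years; dates that do not exist in the
# target year (29 February) become NA instead of rolling to 1 March.
shift_years <- function(d, n) {
  lt <- as.POSIXlt(d)
  day <- lt$mday                  # original day of month
  lt$year <- lt$year + n
  out <- as.Date(lt)              # an impossible 29 Feb is normalised to 1 Mar
  out[which(as.POSIXlt(out)$mday != day)] <- NA
  out
}

# Hypothetical helper: shift a data frame with columns `date` and `obs`,
# then rebuild a gap-free daily grid so new 29 Februaries appear as NA.
shift_dataset <- function(df, n) {
  shifted <- data.frame(date = shift_years(df$date, n), obs = df$obs)
  shifted <- shifted[!is.na(shifted$date), ]   # drop vanished 29 Feb rows
  grid <- data.frame(
    date = seq(min(shifted$date), max(shifted$date), by = "day")
  )
  merge(grid, shifted, by = "date", all.x = TRUE)
}

# 1900 was not a leap year but 2000 was, so a 29 February row is added
old <- data.frame(
  date = seq(as.Date("1900-02-27"), as.Date("1900-03-02"), by = "day"),
  obs  = c(1.0, 2.0, 2.5, 0.2)
)
shift_dataset(old, 100)
```

The left join against the full daily grid is what inserts the new 29 February with an NA observation, while the NA filter removes days that no longer exist in the target year.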

Upvotes: 1

Richie Cotton

Reputation: 121127

You can check if a year is a leap year with leap_year from lubridate.

library(lubridate)

years <- 1895:2005
years[leap_year(years)]

The package also avoids generating impossible 29 February dates.

ymd("2000-2-29") + years(1)    # NA
ymd("2000-2-29") %m+% years(1) # "2001-02-28"

The %m+% "add months" operator, as mentioned by @VitoshKa, rolls the date back to the end of the previous month if the actual day doesn't exist.
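
Applied to a vector of dates, the two behaviours can be compared side by side. This is a small sketch assuming lubridate is installed; the column names `strict`, `rolled` and `dropped` are illustrative only.

```r
library(lubridate)

dates <- ymd(c("2000-02-28", "2000-02-29", "2000-03-01"))

strict <- dates + years(100)     # NA where the target date does not exist
rolled <- dates %m+% years(100)  # rolls back to the last day of February

# Rows where strict is NA are the ones to drop if no duplicate
# 28 February should appear in the shifted series
data.frame(original = dates, strict = strict, rolled = rolled,
           dropped = is.na(strict))
```

Which variant fits depends on the goal: `+ years()` makes the vanished dates easy to filter out with is.na, while `%m+%` keeps every row at the cost of a duplicated 28 February.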

Upvotes: 19

DarkDust

Reputation: 92384

A year is a leap year if:

  • It is divisible by 4.
  • But not if it is also divisible by 100.
  • Unless it is also divisible by 400, in which case it is a leap year after all.

That is why 2000 was a leap year (although it's divisible by 100, it's also divisible by 400).

But generally, if you have a library that can take care of date/time calculations, then use it. These calculations are very complicated and easy to get wrong, especially when ancient dates (calendar reforms) and time zones are involved.

Upvotes: 4
