Reputation: 33
I have a data frame with one column for the year and another for the Julian day (day of year: 1-366 or 1-365, depending on the year). How can I convert the DOY_start column to a Date based on the year, so that leap years are accounted for?
I tried as.Date(), as.POSIXct(), and lubridate::as_date(), but every attempt failed. Below is example code with generated data that closely resembles my original data.
Thank you so much for any advice.
library(tibble)
Year <- 1980:2020
DOY_start <- as.integer(rnorm(length(Year), mean=91.1, sd=9.65))
var <- cbind(Year, DOY_start)
var <- as_tibble(var)
head(var)
#> # A tibble: 6 x 2
#> Year DOY_start
#> <int> <int>
#> 1 1980 98
#> 2 1981 89
#> 3 1982 79
#> 4 1983 97
#> 5 1984 81
#> 6 1985 80
var$DOY_start_date <- as.POSIXct(strptime(var$DOY_start, "%j"))
head(var)
#> # A tibble: 6 x 3
#> Year DOY_start DOY_start_date
#> <int> <int> <dttm>
#> 1 1980 98 2020-04-07 00:00:00
#> 2 1981 89 2020-03-29 00:00:00
#> 3 1982 79 2020-03-19 00:00:00
#> 4 1983 97 2020-04-06 00:00:00
#> 5 1984 81 2020-03-21 00:00:00
#> 6 1985 80 2020-03-20 00:00:00
Created on 2020-09-18 by the reprex package (v0.3.0)
Upvotes: 2
Views: 270
Reputation: 368261
That is an interesting puzzle. We know that as.POSIXlt contains the day-of-year number and that some date libraries convert to it, but I could not immediately find a parser that dealt with it.
Then again, date arithmetic is all we need. We can always get the date of January 1 for a given year, and the desired date is then simply January 1 plus the day-of-year number minus 1.
Code
yearyearday <- function(yr, yd) {
    base <- as.Date(paste0(yr, "-01-01")) # take Jan 1 of year
    day <- base + yd - 1
}
set.seed(42) # make it reproducible
sample <- data.frame(year=1980:2020, doy=as.integer(rnorm(41,mean=91.1,sd=9.65)))
sample$date <- yearyearday(sample$year, sample$doy)
head(sample)
Output
R> yearyearday <- function(yr, yd) {
+ base <- as.Date(paste0(yr, "-01-01")) # take Jan 1 of year
+ day <- base + yd - 1
+ }
R>
R> set.seed(42) # make it reproducible
R> sample <- data.frame(year=1980:2020,
+ doy=as.integer(rnorm(41, mean=91.1, sd=9.65)))
R>
R> sample$date <- yearyearday(sample$year, sample$doy)
R>
R> head(sample)
year doy date
1 1980 104 1980-04-13
2 1981 85 1981-03-26
3 1982 94 1982-04-04
4 1983 97 1983-04-07
5 1984 95 1984-04-04
6 1985 90 1985-03-31
R>
As so often with date calculations, nothing besides base R is needed.
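As a cross-check on the leap-year handling, here is a minimal sketch that compares the arithmetic above with a parsing approach. It assumes that base R's strptime-style `%j` (day of year) format code can be combined with `%Y` on input. The helper name `doy_to_date` is hypothetical and not part of the original answer. When `%j` is given without a year, the current year is filled in, which appears to be why the attempt in the question returned 2020 dates for every row.

```r
# Hypothetical helper: parse "year day-of-year" strings using %Y and %j.
doy_to_date <- function(yr, yd) {
    as.Date(paste(yr, yd), format = "%Y %j")
}

# Same arithmetic as the answer's function: Jan 1 plus (day-of-year - 1).
yearyearday <- function(yr, yd) {
    as.Date(paste0(yr, "-01-01")) + yd - 1
}

# Day 60 and the last day of a leap year (1980) and a non-leap year (1981).
yr <- c(1980, 1980, 1981, 1981)
yd <- c(60, 366, 60, 365)

parsed   <- doy_to_date(yr, yd)
computed <- yearyearday(yr, yd)

format(parsed)
# "1980-02-29" "1980-12-31" "1981-03-01" "1981-12-31"

all(parsed == computed)  # both approaches should agree
```

Day 60 lands on February 29 in 1980 but on March 1 in 1981, so both versions respect leap years. Which one to use is mostly a matter of taste; the arithmetic version avoids any dependence on format-string parsing.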
Upvotes: 3