Reputation: 381
I am starting from the following question and trying to generalize it: "if statement with dates in R".
df <- data.frame(date = as.Date(c("16.04.2015", "04.08.2014", "11.09.2013",
"20.11.2015", "04.04.2014"), '%d.%m.%Y'))
I want to identify all the dates between 07-15 (%m-%d) and 12-31, e.g.
> date value
> 16.04.2015 0
> 04.08.2014 1
> 11.09.2013 1
> 20.11.2015 1
> 04.04.2014 0
My solution is based on a solution by @rawr posted in the above question:
Function for the interval:
`%between%` <- function(x, interval) x >= interval[1] & x <= interval[2]
Vector with all the possible start dates:
begi <- as.Date(sprintf('%s-07-15',1993:2018))
# Vector with all the possible intervals
dates <- as.Date(c(sprintf('%s-07-15',1993:2018), sprintf('%s-12-31',1993:2018)))
Loop using the function
df$value <- NA
for (i in length(begi)) {
ind<-which(format(df$date,"%Y") == format(begi[i], "%Y"))
df$value[ind] <- 1*(df$date[ind] %between% as.Date(c(begi[i],
dates[i+length(begi)])))
}
If I run each i one by one, I obtain the wanted result. However, if I run the loop, the last i overwrites the entire column instead of using only the positions indicated by ind. Why?
Upvotes: 1
Views: 269
Reputation: 28675
You can format your dates as %m-%d and use string comparison:
df$value <- as.numeric(format(df$date, '%m-%d') %between% c('07-15', '12-31'))
df
# date value
# 1 2015-04-16 0
# 2 2014-08-04 1
# 3 2013-09-11 1
# 4 2015-11-20 1
# 5 2014-04-04 0
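As a side note on the loop in the question: for (i in length(begi)) visits only the single value length(begi), whereas seq_along(begi) visits every index. Below is a minimal sketch, assuming the question's data and using plain >= and <= in place of the %between% helper. It runs the loop over all indices and checks the result against the string comparison. The names md, ends, value_str and value_loop are illustrative only.

```r
df <- data.frame(date = as.Date(c("16.04.2015", "04.08.2014", "11.09.2013",
                                  "20.11.2015", "04.04.2014"),
                                format = "%d.%m.%Y"))

# Zero-padded "%m-%d" strings sort in calendar order, so plain string
# comparison is enough to test the 07-15 .. 12-31 window.
md <- format(df$date, "%m-%d")
value_str <- as.numeric(md >= "07-15" & md <= "12-31")

# The question's loop, iterating over every index with seq_along().
begi <- as.Date(sprintf("%s-07-15", 1993:2018))
ends <- as.Date(sprintf("%s-12-31", 1993:2018))
value_loop <- rep(NA_real_, nrow(df))
for (i in seq_along(begi)) {
  ind <- which(format(df$date, "%Y") == format(begi[i], "%Y"))
  value_loop[ind] <- 1 * (df$date[ind] >= begi[i] & df$date[ind] <= ends[i])
}

# Both approaches should flag the same rows.
stopifnot(identical(value_str, value_loop))
```

The string comparison needs no table of years, which is why it is the shorter of the two.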
Upvotes: 1
Reputation: 1015
yr <- as.integer(format(df$date, "%Y"))
# Leap year: divisible by 4, except century years not divisible by 400
df$leapyear <- as.integer((yr %% 4 == 0 & yr %% 100 != 0) | yr %% 400 == 0)
# July 15 is day 196 of a common year and day 197 of a leap year
df[as.integer(format(df$date, "%j")) >= ifelse(df$leapyear == 1, 197, 196), ]
Thanks to @RyanD for pointing out that subsetting based on day of year fails to take leap years into account.
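One way to sidestep the leap-year bookkeeping altogether might be to compare each date with July 15 of its own year, since as.Date already knows the calendar. This is a minimal sketch under that assumption. The sample dates (including the leap year 2016) and the name start are made up for illustration and are not the question's data.

```r
# Hypothetical sample data, chosen to straddle July 15 in a leap year (2016)
# and in common years.
df <- data.frame(date = as.Date(c("2015-04-16", "2016-07-14", "2016-07-15",
                                  "2015-07-15", "2013-12-31")))

# July 15 of each row's own year; no day-of-year arithmetic is needed.
start <- as.Date(paste0(format(df$date, "%Y"), "-07-15"))

# Every date is on or before December 31 of its own year, so only the lower
# bound has to be checked.
df$value <- as.integer(df$date >= start)
```

A different end date than 12-31 would need a second comparison built the same way.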
Upvotes: 0