Reputation: 204
I've run into a real head-scratcher and am not sure how to resolve it. I'm really hoping some of you can help. Also, this is the first time I've ever contributed to StackOverflow... yay!
library(tidyverse)
library(lubridate)
start_date <- ymd("2014-06-28")
end_date <- ymd("2019-06-30")
PayPeriod_EndDate <- seq(start_date, end_date, by = '2 week')
PayPeriod_Interval <- int_diff(PayPeriod_EndDate)
This creates a vector of intervals, with each interval representing a two-week pay period. This is part one, and it is relatively easy (though it still took a while to figure out, ha).
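As a small illustration of what int_diff produces, here is a minimal sketch on three made-up break dates (the names breaks, periods and hits exist only for this example). It assumes lubridate is installed and that %within% treats both endpoints of an interval as inclusive, which is my understanding of how it behaves.

```r
library(lubridate)

# Three consecutive break dates, two weeks apart (illustrative values)
breaks <- ymd(c("2014-06-28", "2014-07-12", "2014-07-26"))

# int_diff() builds one interval per pair of neighbouring dates,
# so n dates give n - 1 intervals
periods <- int_diff(breaks)
length(periods)

# A single date is tested against every interval at once
hits <- ymd("2014-07-08") %within% periods
periods[hits]

# A date that sits exactly on a break matches both neighbouring intervals
sum(ymd("2014-07-12") %within% periods)
```

The last line matters for any approach that expects exactly one matching interval per date: a date that falls on a break would match two.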
Part two contains a vector of dates.
Dates <- c("2014-07-08", "2018-10-20", "2018-12-13", "2018-12-13", "2018-12-06", "2018-11-30", "2019-01-16", "2019-01-23", "2019-03-15", "2018-10-02")
I want to identify which of the Intervals each of the Dates is %within%, with the output being the interval that contains each date. So the date "2014-07-08" will be assigned 2014-06-28 UTC--2014-07-12 UTC, since this date is within this interval.
A very similar problem seems to have been explored here: https://github.com/tidyverse/lubridate/issues/658
I have attempted the following
ymd(Dates) %within% PayPeriod_Interval
However, the result is only calculated for the first element of the Dates vector. I have since tried various combinations of for loops, mutating into factors, etc., with little progress. This is work-related, so I am really short on time and will be monitoring this post throughout the day and into the weekend.
Best and thank you! James
Upvotes: 1
Views: 1259
Reputation: 24069
The tidyverse is very useful, but sometimes base R is all you need. In this case, the cut function does the job.
library(lubridate)
start_date <- ymd("2014-06-28")
end_date <- ymd("2019-06-30")
PayPeriod_EndDate <- seq(start_date, end_date, by = '2 week')
Dates <- c("2014-07-08", "2018-10-20", "2018-12-13", "2018-12-13", "2018-12-06", "2018-11-30", "2019-01-16", "2019-01-23", "2019-03-15", "2018-10-02")
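# cut() labels each date with the start date of its pay period (returned as a factor)
startperiod <- cut(as.Date(Dates), breaks = PayPeriod_EndDate)
# Each period covers 14 days, so its last day is the start date plus 13
endperiod <- as.Date(startperiod) + 13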
The output of the cut function is a factor whose labels are the start dates of the pay periods in which the elements of Dates fall.
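For comparison, here is a minimal sketch of a different base R route using findInterval instead of cut, which returns positions rather than factor labels. This is not the code from the answer above; the names breaks, idx and pay_periods are made up for the example, and only three of the dates are used.

```r
# Base R only: no packages needed
breaks <- seq(as.Date("2014-06-28"), as.Date("2019-06-30"), by = "2 weeks")
Dates <- as.Date(c("2014-07-08", "2018-10-20", "2018-12-13"))

# For each date, findInterval() gives the position of the last break
# that is less than or equal to that date
idx <- findInterval(as.numeric(Dates), as.numeric(breaks))

# Dates before the first break or on/after the last break have no period
idx[idx < 1 | idx >= length(breaks)] <- NA

pay_periods <- data.frame(
  date         = Dates,
  period_start = breaks[idx],
  period_end   = breaks[idx] + 13  # last day of a 14-day period
)
pay_periods
```

Like cut, this treats each period as closed on the left and open on the right, so a date that falls exactly on a break belongs to the period that starts on that day.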
Upvotes: 4
Reputation: 7724
This is what a map-based solution could look like:
map(ymd(Dates), ~ PayPeriod_Interval[.x %within% PayPeriod_Interval])
# [[1]]
# [1] 2014-06-28 UTC--2014-07-12 UTC
#
# [[2]]
# [1] 2018-10-13 UTC--2018-10-27 UTC
#
# ...
To get the result as an interval vector (and not a list), you can use:
PayPeriod_Interval[map_int(ymd(Dates), ~ which(.x %within% PayPeriod_Interval))]
# [1] 2014-06-28 UTC--2014-07-12 UTC 2018-10-13 UTC--2018-10-27 UTC 2018-12-08 UTC--2018-12-22 UTC 2018-12-08 UTC--2018-12-22 UTC 2018-11-24 UTC--2018-12-08 UTC
# [6] 2018-11-24 UTC--2018-12-08 UTC 2019-01-05 UTC--2019-01-19 UTC 2019-01-19 UTC--2019-02-02 UTC 2019-03-02 UTC--2019-03-16 UTC 2018-09-29 UTC--2018-10-13 UTC
If you are only interested in the end date of the interval, one option is:
PayPeriod_EndDate[map_int(ymd(Dates), ~ which.min(.x > PayPeriod_EndDate))]
# [1] "2014-07-12" "2018-10-27" "2018-12-22" "2018-12-22" "2018-12-08" "2018-12-08" "2019-01-19" "2019-02-02" "2019-03-16" "2018-10-13"
which.min returns the index of the first entry of PayPeriod_EndDate that is not smaller than the given date in the Dates vector, that is, the date at the end of that pay period.
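To make the which.min step concrete, here is a small base R sketch on three made-up end dates (ends, d and late are names used only for this example). It also shows one edge case to watch for, under the assumption that a date may fall after the last end date.

```r
ends <- as.Date(c("2014-06-28", "2014-07-12", "2014-07-26"))
d <- as.Date("2014-07-08")

# TRUE for every end date the given date has already passed
passed <- d > ends
passed

# which.min() treats FALSE as 0 and returns the position of the first FALSE,
# i.e. the first end date that is not smaller than d
pos <- which.min(passed)
ends[pos]

# Edge case: a date after the last end date gives all TRUE,
# and which.min() then returns 1 instead of signalling "no match"
late <- as.Date("2014-08-01")
which.min(late > ends)
```

If dates beyond the last pay period can occur, it is probably worth checking for them before indexing, since the result would otherwise silently point at the first end date.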
Upvotes: 1