Reputation: 25
I've searched many questions to see if I could find an answer, but nothing quite fits what I am trying to do. So I have data that looks like this:
ID Area Start End
1 9 2016-06-30 2016-07-07
2 9 2016-07-01 2016-07-04
3 8 2016-06-21 2016-06-28
4 8 2016-06-23 2016-06-25
5 8 2016-06-25 2016-06-30
What I have done is created a data frame that has the entire date range for each area with a blank column for the aggregate number of days. Maybe there is an easier way to do this, or a package that can handle this scenario, but I am hoping to come up with something that looks like this:
Area Date Count
... 0
9 2016-06-30 1
9 2016-07-01 2
9 2016-07-02 2
9 2016-07-03 2
9 2016-07-04 2
9 2016-07-05 1
9 2016-07-06 1
9 2016-07-07 1
9 2016-07-08 0
... 0
8 2016-06-21 1
8 2016-06-22 1
8 2016-06-23 2
8 2016-06-24 2
8 2016-06-25 3
8 2016-06-26 2
8 2016-06-27 2
8 2016-06-28 2
8 2016-06-29 1
8 2016-06-30 1
... 0
Seems like there should be an easy way to handle an array like this in R, but I haven't found it yet.
Thanks in advance!
Upvotes: 1
Views: 150
Reputation: 49448
library(data.table)
dt = as.data.table(your_df) # or setDT to convert in-place
# convert dates to Dates (assuming they aren't)
dt[, Start := as.Date(Start, '%Y-%m-%d')]
dt[, End := as.Date(End, '%Y-%m-%d')]
# expand the dates, then aggregate
dt[, .(Date = seq(Start, End, by = 1), Area), by = ID][, .N, by = .(Date, Area)]
# Date Area N
#1: 2016-06-30 9 1
#2: 2016-07-01 9 2
#3: 2016-07-02 9 2
#4: 2016-07-03 9 2
#...
Upvotes: 2