stat_student

Reputation: 827

How to group time series data by arbitrary dates in R?

I have a data.frame like the following:

df <- data.frame(
  DateTime = seq(ISOdate(2015, 1, 1, 0), by = 15 * 60, length.out = 35040),
  kWh = abs(rnorm(35040, mean = 550, sd = 50))
)

and a vector such as:

dates <- as.Date(c("2015-01-15", "2015-02-17", "2015-03-14", "2015-04-16", 
                   "2015-05-16", "2015-06-18", "2015-07-15", "2015-08-15",
                   "2015-09-16", "2015-10-13", "2015-11-17", "2015-12-17"))

What I want to do is add a column to df that indicates which accounting period each entry belongs to. For example, every entry from the beginning of the data through the last entry on 2015-01-14 would be given a value of 201501, because those entries belong to the January 2015 accounting period. Likewise, every entry from 2015-01-15 through the last entry on 2015-02-16 would be given a value of 201502.

I was hoping for a solution using lubridate, as I'd rather not convert to an xts- or zoo-based object. Performance is also somewhat important, as I will have to do this for a couple hundred such data sets.

Upvotes: 0

Views: 190

Answers (1)

stat_student

Reputation: 827

I figured out the answer. I hadn't realized that cut also works with POSIXct objects.

df$Period <- cut(df$DateTime, breaks = as.POSIXct(dates), 
                 labels = 201502:201512)

It's important to convert the dates to POSIXct, because otherwise cut throws an error saying that the breaks are not specified correctly.
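One caveat with the call above: cut only labels values that fall between the first and last break, so the rows before 2015-01-15 (the 201501 period from the question) and the rows from 2015-12-17 onward come back as NA. Below is a minimal sketch of one way to cover them, by padding the breaks with the first timestamp and a point just past the last one. The 201601 label for the trailing rows is my assumption, extrapolated from the numbering scheme in the question, and set.seed is only there to make the example reproducible.

```r
# Rebuild the example data from the question (seeded for reproducibility).
set.seed(1)
df <- data.frame(
  DateTime = seq(ISOdate(2015, 1, 1, 0), by = 15 * 60, length.out = 35040),
  kWh = abs(rnorm(35040, mean = 550, sd = 50))
)
dates <- as.Date(c("2015-01-15", "2015-02-17", "2015-03-14", "2015-04-16",
                   "2015-05-16", "2015-06-18", "2015-07-15", "2015-08-15",
                   "2015-09-16", "2015-10-13", "2015-11-17", "2015-12-17"))

# Pad the breaks so the first and last periods are bounded as well.
# cut() on POSIXct defaults to right = FALSE, so each bin is [start, end);
# adding one second to the last timestamp keeps that timestamp inside a bin.
breaks <- c(min(df$DateTime), as.POSIXct(dates), max(df$DateTime) + 1)

# 14 breaks give 13 bins, so 13 labels are needed.
# Assumption: rows on or after the last cut-off date belong to 201601.
labels <- c(201501:201512, 201601)

df$Period <- cut(df$DateTime, breaks = breaks, labels = labels)

# Every row should now have a period.
stopifnot(!anyNA(df$Period))
```

The result is a factor. If a numeric column is needed, as.integer(as.character(df$Period)) converts the labels back to numbers.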

Upvotes: 0
