Reputation: 3275
I have some data that are timestamped every minute that look like this:
        date  time_greece gmt_offset_greece price_greece           time_and_date
1 2009-12-01 08:30:04.548                +2      2275.32 2009-12-01 08:30:04.548
  gmt_offset_greece_test time_and_date_correct time_and_date_difference ID
1                      2   2009-12-01 06:30:04                        0  1
I want to run different analyses over 5-minute intervals, 30-minute intervals, and so on. At the moment I create an ID from the row number modulo 30 (I would do something similar for 5-minute intervals etc.):
statadata$ID <- seq.int(nrow(statadata))
statadata$ID <- seq.int(nrow(statadata)) %% 30
My question: is there a more efficient way to implement this than the one I am currently using, one that I haven't thought of or don't know about?
Reputation: 5766
The package lubridate has rounding functions for dates and date-times that can round to an arbitrary time unit, e.g. 5 minutes or 30 minutes: round_date, plus floor_date and ceiling_date for rounding down and up. With these you should be able to define your intervals as simply as lubridate::round_date(date_time_greece, '5 minutes').
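To make that concrete, here is a minimal sketch. It assumes the timestamps are stored as POSIXct in a column called time_and_date (as in the question's printout); the data frame below is a made-up stand-in, and the bin_5min / bin_30min column names are only illustrative. It uses floor_date rather than round_date so that each row is labelled with the start of the interval it falls into.

```r
library(lubridate)

# Hypothetical stand-in for the question's data: one price per minute.
statadata <- data.frame(
  time_and_date = as.POSIXct("2009-12-01 08:30:04", tz = "UTC") + 60 * (0:89),
  price_greece  = 2275.32 + (0:89) / 100
)

# Label each row with the start of the interval it falls into.
statadata$bin_5min  <- floor_date(statadata$time_and_date, "5 minutes")
statadata$bin_30min <- floor_date(statadata$time_and_date, "30 minutes")

# Example analysis: mean price per 30-minute interval.
mean_30min <- tapply(statadata$price_greece, statadata$bin_30min, mean)
```

One difference from the row-number approach: the row number modulo 30 restarts at 0 every 30 rows, so it marks a row's position inside a block rather than which block it belongs to. Flooring the timestamp gives every row in the same interval the same label, and it keeps working if a minute is missing from the data.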
As with all binning operations for data analysis, pay attention to your groups. For example, check whether your grouping/binning creates many groups with just a single data point.
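A quick way to check this is to tabulate the bin labels and look at the group sizes. The sketch below uses made-up, irregular timestamps, and it floors to 5-minute bins with plain base R arithmetic (an assumption made only so the snippet runs without extra packages; lubridate::floor_date should give the same labels here).

```r
# Hypothetical irregular timestamps: a dense burst followed by sparse points.
times <- as.POSIXct("2009-12-01 08:30:00", tz = "UTC") +
  c(0, 60, 120, 180, 240, 900, 2400, 4000)

# Floor to 5-minute bins using plain arithmetic on seconds since the epoch.
bin_5min <- as.POSIXct(floor(as.numeric(times) / 300) * 300,
                       origin = "1970-01-01", tz = "UTC")

# Number of observations per bin, and the share of bins holding one point.
bin_sizes <- table(bin_5min)
share_singletons <- mean(bin_sizes == 1)
```

If share_singletons comes out high, a wider interval (say 30 minutes) may be the more sensible unit for that stretch of data.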