Reputation: 15369
I am trying to identify runs of consecutive recordings in a time series and aggregate the data for each run.
Example Data
Here is an example of the data, which is recorded at most once per second:
timestamp Value
06:07:23 0.439
06:07:24 0.556
06:07:25 0.430
06:07:26 0.418
06:07:27 0.407
06:07:47 0.439
06:07:48 0.420
06:07:49 0.405
09:55:21 0.507
09:55:22 0.439
10:03:24 0.439
10:03:25 0.439
10:03:36 1.708
10:03:37 0.608
10:03:38 0.439
10:03:46 0.484
10:03:47 0.380
10:03:48 0.607
10:03:49 0.439
10:03:50 0.439
10:03:51 0.439
10:03:52 0.430
10:03:53 0.439
10:03:54 4.924
10:03:55 1.012
10:03:56 0.887
10:03:57 0.439
10:03:58 0.439
10:04:18 0.447
10:04:19 0.447
As can be seen, there are periods in which a value is recorded every second. I am trying to find a way to aggregate each run of observations that has no gap between them, to end up with something like the following:
timestamp max duration
06:07:23 0.556 5
06:07:47 0.439 3
09:55:21 0.507 2
10:03:24 0.439 2
10:03:36 1.708 3
10:03:46 4.924 13
10:04:18 0.447 2
I am struggling to find a way of grouping the data into runs of consecutive observations. The closest answer I have been able to find is this one; however, the answers were provided over three and a half years ago, and I have been struggling to get the data.table method working.
Any ideas much appreciated!
Upvotes: 1
Views: 80
Reputation: 93813
Here is an attempt in data.table:
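library(data.table)
# Assumes dat is a data.table with a character timestamp ("%H:%M:%S") and a numeric Value
dat[,
  .(timestamp = timestamp[1], max = max(Value), duration = .N),
  # start a new group whenever the gap to the previous row exceeds 1 second
  by = cumsum(c(FALSE, diff(as.numeric(as.POSIXct(timestamp, format = "%H:%M:%S", tz = "UTC"))) > 1))
]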
# cumsum timestamp max duration
#1: 0 06:07:23 0.556 5
#2: 1 06:07:47 0.439 3
#3: 2 09:55:21 0.507 2
#4: 3 10:03:24 0.439 2
#5: 4 10:03:36 1.708 3
#6: 5 10:03:46 4.924 13
#7: 6 10:04:18 0.447 2
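To see why the cumsum(...) grouping works, it may help to break it into steps. The following is a minimal base R sketch on a hypothetical six-row subset of the question's data. The names ts, value, secs, new_run, run_id and result are illustrative only, and it assumes the rows are sorted and all fall on the same day.

```r
# Hypothetical subset of the question's data (illustrative names, not from the original post).
ts    <- c("06:07:23", "06:07:24", "06:07:25", "06:07:47", "06:07:48", "09:55:21")
value <- c(0.439, 0.556, 0.430, 0.439, 0.420, 0.507)

# Convert to numeric seconds. Today's date is filled in, but only the differences matter.
secs <- as.numeric(as.POSIXct(ts, format = "%H:%M:%S", tz = "UTC"))

# TRUE wherever the gap to the previous row exceeds 1 second.
# The first row has no predecessor, so it is padded with FALSE.
new_run <- c(FALSE, diff(secs) > 1)

# The running total of those flags gives every row a run id: 0 0 0 1 1 2
run_id <- cumsum(new_run)

# Aggregate per run: first timestamp, maximum value, number of rows.
result <- data.frame(
  timestamp = as.vector(tapply(ts, run_id, function(x) x[1])),
  max       = as.vector(tapply(value, run_id, max)),
  duration  = as.vector(tapply(value, run_id, length))
)
print(result)
```

Converting to numeric seconds before calling diff() is a deliberate choice here: diff() on POSIXct values returns a difftime whose units are chosen automatically, so comparing it directly against 1 relies on the units happening to be seconds.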
Upvotes: 3