Reputation: 109
I want to compute statistics on records within overlapping time windows, where each window starts a fixed offset after the previous one. My data looks something like this:
lon lat stat ... speed course head
ts ...
2016-09-30 22:00:33.272 5.41463 53.173161 15 ... 0.0 0.0 511
2016-09-30 22:01:42.879 5.41459 53.173180 15 ... 0.0 0.0 511
2016-09-30 22:02:42.879 5.41461 53.173161 15 ... 0.0 0.0 511
2016-09-30 22:03:44.051 5.41464 53.173168 15 ... 0.0 0.0 511
2016-09-30 22:04:53.013 5.41462 53.173141 15 ... 0.0 0.0 511
[5 rows x 7 columns]
I need the records within time windows of 600 seconds, with steps of 300 seconds. For example, these windows:
start end
2016-09-30 22:00:00.000 2016-09-30 22:10:00.000
2016-09-30 22:05:00.000 2016-09-30 22:15:00.000
2016-09-30 22:10:00.000 2016-09-30 22:20:00.000
I have looked at pandas rolling for this, but it does not seem to have an option for the offset described above. Am I overlooking something, or should I write a custom function for this?
Upvotes: 0
Views: 1914
Reputation: 434
What you want to achieve should be possible by combining DataFrame.resample with DataFrame.shift.
import pandas as pd

# Nine one-minute samples with values 0..8 ('min' replaces the deprecated 'T' alias)
index = pd.date_range('1/1/2000', periods=9, freq='min')
series = pd.Series(range(9), index=index)
df = pd.DataFrame(series)
This gives you a simple time series (example adapted from the DataFrame.resample API docs):
2000-01-01 00:00:00 0
2000-01-01 00:01:00 1
2000-01-01 00:02:00 2
2000-01-01 00:03:00 3
2000-01-01 00:04:00 4
2000-01-01 00:05:00 5
2000-01-01 00:06:00 6
2000-01-01 00:07:00 7
2000-01-01 00:08:00 8
Now resample by your step size, 90 seconds in this example (see DataFrame.resample).
sampled = df.resample('90s').sum()
This will give you non-overlapping windows of the step size.
2000-01-01 00:00:00 1
2000-01-01 00:01:30 2
2000-01-01 00:03:00 7
2000-01-01 00:04:30 5
2000-01-01 00:06:00 13
2000-01-01 00:07:30 8
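Finally, shift the resampled frame by one step and add it to the unshifted one. This works because the window size is exactly twice the step size: each row then holds the sum of its own bin and the previous bin, so the row labelled t covers the window from t - 90s to t + 90s. Note that this only works for statistics that can be combined by addition, such as sums and counts.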
sampled.shift(1, fill_value=0) + sampled
This will yield:
2000-01-01 00:00:00 1
2000-01-01 00:01:30 3
2000-01-01 00:03:00 9
2000-01-01 00:04:30 12
2000-01-01 00:06:00 18
2000-01-01 00:07:30 21
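Applied to the 600-second windows with 300-second steps from the question, the shift-and-add step can be generalised with a rolling sum over the resampled bins. The following is only a sketch under some assumptions: the toy data and the column name speed are made up for illustration, k is a hypothetical name for the number of steps per window, and the mean is rebuilt from sum and count because only additive statistics can be combined this way.

```python
import pandas as pd

# Toy data (made up for illustration): one reading per minute for 30 minutes.
index = pd.date_range("2016-09-30 22:00:00", periods=30, freq="min")
df = pd.DataFrame({"speed": range(30)}, index=index)

step = "300s"  # distance between window starts
k = 2          # window length = k * step = 600 s

# Non-overlapping 300 s bins, labelled by the start of each bin.
bins = df["speed"].resample(step).agg(["sum", "count"])

# rolling(k) combines each bin with the k - 1 bins before it; shifting the
# result back by k - 1 rows makes the row labelled t cover [t, t + 600 s).
# The trailing incomplete windows become NaN and are dropped.
windows = (
    bins.rolling(k).sum()
    .shift(-(k - 1))
    .dropna()
    .assign(mean=lambda w: w["sum"] / w["count"])
)

print(windows)
```

Here the first row is labelled 22:00:00 and covers 22:00:00 up to (but not including) 22:10:00, the second is labelled 22:05:00, and so on, which matches the start/end table in the question.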
There may be a more elegant solution, but I hope this helps.
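If the statistic cannot be built from sums and counts (a median, for example), a small custom loop over the window starts may be the simplest route. This is a rough sketch, not a tuned implementation: the sample data, the column name speed, and the result column names n and median_speed are all assumptions for illustration.

```python
import pandas as pd

# Toy data (made up for illustration): 20 irregularly aligned readings.
index = pd.date_range("2016-09-30 22:00:33", periods=20, freq="61s")
df = pd.DataFrame({"speed": range(20)}, index=index)

window = pd.Timedelta(seconds=600)
step = pd.Timedelta(seconds=300)

# Align the first window to a multiple of the step, then generate one
# window start per step up to the last timestamp in the data.
first = df.index.min().floor("300s")
starts = pd.date_range(first, df.index.max(), freq=step)

rows = []
for start in starts:
    end = start + window
    # Half-open interval [start, end), so a record never counts twice
    # within the same window boundary.
    chunk = df[(df.index >= start) & (df.index < end)]
    rows.append({
        "start": start,
        "end": end,
        "n": len(chunk),
        "median_speed": chunk["speed"].median(),
    })

result = pd.DataFrame(rows).set_index("start")
print(result)
```

The loop is slower than resampling on large data, but any function can be applied to chunk, and each record appears in two consecutive windows as the question requires.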
Upvotes: 1