Reputation: 504
+--------------------------------------------------------------+
| 2014-08-12T10:30:14.6938893+10:00 Reading received START |
| 2014-08-12T10:30:14.6938893+10:00 Reading received ADD |
| 2014-08-12T10:30:14.7094893+10:00 Reading received UPDATE |
| 2014-08-12T10:30:14.7094893+10:00 Reading received COMMIT |
| 2014-08-12T10:30:14.7094893+10:00 Commit start |
| 2014-08-12T10:30:14.7406893+10:00 Commit end |
| 2014-08-12T10:30:14.7406893+10:00 Reading received FINISH |
| 2014-08-12T10:30:23.3206893+10:00 Reading received START |
| 2014-08-12T10:30:23.3206893+10:00 Reading received ADD |
| 2014-08-12T10:30:23.3362893+10:00 Reading received UPDATE |
| 2014-08-12T10:30:23.3362893+10:00 Reading received COMMIT |
| 2014-08-12T10:30:23.3362893+10:00 Commit start |
| 2014-08-12T10:30:23.3674893+10:00 Commit end |
| 2014-08-12T10:30:23.3674893+10:00 Reading received FINISH |
+--------------------------------------------------------------+
Given a time series where the value describes an event, how can I calculate delta times between recurring events, e.g. the average difference between Reading received START and the subsequent Reading received FINISH?
Is there a better way than, e.g.:
left = df[df.Event == 'Reading received START']
right = df[df.Event == 'Reading received FINISH']
left.index = range(len(left))
right.index = range(len(right))
delta = (right.Time - left.Time)
Upvotes: 1
Views: 1566
Reputation: 12019
To be explicit, I'm assuming that you are showing the index and one column (called 'Event') from a larger dataframe. Is that correct? How about the following:
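# Keep only START and FINISH rows; the timestamps are assumed to be the index.
relevant_df = df[df.Event.isin(['Reading received START', 'Reading received FINISH'])]
relevant_ts_as_series = pd.Series(relevant_df.index)
diff = relevant_ts_as_series - relevant_ts_as_series.shift()
# Drop the FINISH -> next START gaps so only START -> FINISH durations remain.
diff = diff[(relevant_df.Event == 'Reading received FINISH').values]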
Then you can take diff.mean() if you like.
I bet there's a more elegant way than turning the index into a Series, but this should work for you.
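If the START and FINISH rows are not guaranteed to alternate strictly, one possible alternative is to number each cycle with a cumulative count of START rows and pair events within a cycle. This is only a minimal sketch: it assumes the timestamps live in a 'Time' column and the events in an 'Event' column (as in the question's code), and the four sample rows are the START/FINISH lines copied from the question's table.

```python
import pandas as pd

# START/FINISH rows copied from the question; column names are assumed.
rows = [
    ("2014-08-12T10:30:14.6938893+10:00", "Reading received START"),
    ("2014-08-12T10:30:14.7406893+10:00", "Reading received FINISH"),
    ("2014-08-12T10:30:23.3206893+10:00", "Reading received START"),
    ("2014-08-12T10:30:23.3674893+10:00", "Reading received FINISH"),
]
df = pd.DataFrame(rows, columns=["Time", "Event"])
df["Time"] = pd.to_datetime(df["Time"], utc=True)

is_start = df["Event"] == "Reading received START"
is_finish = df["Event"] == "Reading received FINISH"

# Every START opens a new cycle, so the running count of STARTs
# labels each row with the cycle it belongs to.
cycle = is_start.cumsum()

# First START and first FINISH timestamp of each cycle.
start_time = df.loc[is_start, "Time"].groupby(cycle[is_start]).first()
finish_time = df.loc[is_finish, "Time"].groupby(cycle[is_finish]).first()

# Subtraction aligns on the cycle number; cycles lacking a FINISH
# (or a FINISH seen before any START) yield NaT and are dropped.
durations = (finish_time - start_time).dropna()

print(durations.mean())
```

Because the pairing is done per cycle rather than by position, an unterminated final cycle is simply left out of the mean instead of shifting every later pair.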
Upvotes: 2