How to calculate the difference of timestamps based on a multi level index?

Question

I've got the following pandas.DataFrame and would like to calculate a new column containing the timedelta between consecutive timestamps in the multi-index level Timestamp:

import pandas as pd
import numpy as np
data = {'Timestamp': [12, 12, 12, 22, 22, 22, 44, 44, 66, 102],
        'Customer': ['bmw', 'vw', 'vw', 'bmw', 'vw', 'vw', 'vw', 'vw', 'bmw', 'bmw'],
        'Series': ['series1', 'series1', 'series2', 'series1', 'series1', 'series2', 'series1', 'series2', 'series2', 'series1'],
        'time_delta': [np.nan, np.nan, np.nan, 10, 10, 10, 22, 22, 22, 36]
        }
df = pd.DataFrame(data).set_index(['Timestamp', 'Customer', 'Series'])

The column time_delta is the desired output I would like to achieve. I somewhat struggle since I can not use the pandas.Series.diff() function as the periods are not consistent. I want to do the timestamp delta calculation on the Timestamp level of the dataframe, but pass the result to all rows of this level. So for the first Timestamp level value 12 there is no preceeding timestamp value, thus all rows for this timestamp are filled with np.nan. For the next timestamp 22, I can take the delta to 12 (which is 10) and fill it for all rows of timestamp 22.

Quang Hoang · Accepted Answer

Let's try extracting the level values and calculate the difference from there:

df['time_delta'] = df.index.get_level_values('Timestamp')

s = df['time_delta'].diff()
df['time_delta'] = s.where(s>0).ffill()

Output:

                            time_delta
Timestamp Customer Series             
12        bmw      series1         NaN
          vw       series1         NaN
                   series2         NaN
22        bmw      series1        10.0
          vw       series1        10.0
                   series2        10.0
44        vw       series1        22.0
                   series2        22.0
66        bmw      series2        22.0
102       bmw      series1        36.0

How to calculate the difference of timestamps based on a multi level index?

Answers (1)

Related Questions