Kvothe
Kvothe

Reputation: 1393

pandas time difference between rows based on conditions

So, I have a dataframe like this:

d = {'id': ['a','a','b','b'], 'map': ['aa','ab','ba','bb'], 'timestamp':['2019-01-01 00:00:00+00:00', 
                                                                     '2019-01-01 06:00:00+00:00',
                                                                     '2019-05-01 12:00:00+00:00',
                                                                     '2019-06-01 18:00:00+00:00']}
df = pd.DataFrame(data=d)

    id  map timestamp
0   a   aa  2019-01-01 00:00:00+00:00
1   a   ab  2019-01-01 06:00:00+00:00
2   b   ba  2019-05-01 12:00:00+00:00
3   b   bb  2019-06-01 18:00:00+00:00

For each value in id, I'd like to calculate the time difference (i.e the difference between min and max timestamp) for each value in map. Eg. id = a and map = aa = 6 hours

Appreciate any help

Upvotes: 1

Views: 474

Answers (1)

jezrael
jezrael

Reputation: 863611

Use:

df['timestamp'] = pd.to_datetime(df['timestamp'])

df1 = df.groupby('id')['timestamp'].agg(['max','min'])

s = df1['max'].sub(df1['min']).dt.total_seconds().div(3600)
print (s)
id
a      6.0
b    750.0
dtype: float64

Upvotes: 1

Related Questions