Reputation: 727
I want to make six columns ('ID', 'Earliest', 'Latest', 'Difference_hours', 'Difference_minutes', 'Serial') out of this partial dataframe:
ID Timestamp Serial
A 2019-01-01 00:01 F
A 2019-01-01 00:04 F
A 2019-01-01 00:06 F
B 2019-01-03 12:35 G
B 2019-01-01 13:35 G
.
.
.
The partial desired output is:
ID Earliest Latest Difference_hours Difference_minutes Serial
A 2019-01-01 00:01 2019-01-01 00:06 0.083 5 F
A 2019-01-03 12:35 2019-01-04 13:35 1 60 G
.
.
.
Column Description:
'Earliest':The earliest timestamp from the same 'ID',
'Latest': The latest from the same 'ID' ,
'Difference_hours': Timestamp difference between 'Earliest' and 'Latest' in hours,
'Difference_minutes': Timestamp difference between 'Earliest' and 'Latest' in minutes .
Note: 'Serial' is constant for every unique 'ID'
I know I need to use groupby
and maybe with transform
, but don't know how to do it exactly.
Upvotes: 0
Views: 143
Reputation: 17164
This should work:
df = pd.DataFrame({'ID': ['A', 'A', 'A', 'B', 'B'],
'Timestamp': ['2019-01-01 00:01', '2019-01-01 00:04',
'2019-01-01 00:06', '2019-01-03 12:35', '2019-01-04 13:35'],
'Serial': ['F', 'F', 'F', 'G', 'G']})
df['Timestamp'] = pd.to_datetime(df['Timestamp'])
print(df)
ID Timestamp Serial
0 A 2019-01-01 00:01:00 F
1 A 2019-01-01 00:04:00 F
2 A 2019-01-01 00:06:00 F
3 B 2019-01-03 12:35:00 G
4 B 2019-01-04 13:35:00 G
ans = df.groupby(['ID','Serial'])['Timestamp'].agg(['min','max']).reset_index()
ans['hr_diff'] = ans['max'].dt.hour - ans['min'].dt.hour
ans['min_diff'] = ans['max'].dt.minute - ans['min'].dt.minute
print(ans)
ID Serial min max hr_diff min_diff
0 A F 2019-01-01 00:01:00 2019-01-01 00:06:00 0 5
1 B G 2019-01-03 12:35:00 2019-01-04 13:35:00 1 0
ans = df.groupby(['ID','Serial'])['Timestamp'].agg(min='min',
max='max',
diff_hr=lambda x: x.max().hour - x.min().hour,
diff_min=lambda x: x.max().minute - x.min().minute
)
ans.reset_index()
ID Serial min max diff_hr diff_min
0 A F 2019-01-01 00:01:00 2019-01-01 00:06:00 0 5
1 B G 2019-01-03 12:35:00 2019-01-04 13:35:00 1 0
Upvotes: 1