Reputation: 555
Let's say I have a pandas DataFrame like this:
2010-01-01 04:10:00:025 69
2010-01-01 04:10:01:669 1
2010-01-01 04:10:03:027 3
2010-01-01 04:10:04:003 8
2010-01-01 04:10:05:987 10
2010-01-01 04:10:06:330 99
2010-01-01 04:10:08:369 55
2010-01-01 04:10:09:987 5000
2010-01-01 04:10:11:148 13
And I need to convert it to the following format:
2010-01-01 04:10:00:000 69 69
2010-01-01 04:10:05:000 5000 10
2010-01-01 04:10:10:000 13 13
The first column is the start of each 5-second interval, beginning at 2010-01-01 04:10:00:000.
The second column is the max of all the grouped rows.
The third column is the first of all the grouped rows.
How can I get that?
Upvotes: 1
Views: 60
Reputation: 42946
Assuming you mean 5 seconds, we can use pd.Grouper with agg and max, first:
# use this line if your first column is not datetime type yet.
# df['col1'] = pd.to_datetime(df['col1'], format='%Y-%m-%d %H:%M:%S:%f')
df.groupby(pd.Grouper(key='col1', freq='5s'))['col2'].agg(['max', 'first']).reset_index()
Output
col1 max first
0 2010-01-01 04:10:00 69 69
1 2010-01-01 04:10:05 5000 10
2 2010-01-01 04:10:10 13 13
Note: since you didn't provide column names, I called them col1 and col2.
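For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same pd.Grouper call. The column names col1 and col2 are the placeholders assumed above, and the variable names raw and result are only for illustration.

```python
import pandas as pd

# Sample rows from the question; col1/col2 are assumed placeholder names.
raw = [
    ("2010-01-01 04:10:00:025", 69),
    ("2010-01-01 04:10:01:669", 1),
    ("2010-01-01 04:10:03:027", 3),
    ("2010-01-01 04:10:04:003", 8),
    ("2010-01-01 04:10:05:987", 10),
    ("2010-01-01 04:10:06:330", 99),
    ("2010-01-01 04:10:08:369", 55),
    ("2010-01-01 04:10:09:987", 5000),
    ("2010-01-01 04:10:11:148", 13),
]
df = pd.DataFrame(raw, columns=["col1", "col2"])

# The timestamps use ':' before the milliseconds, so an explicit format is needed.
df["col1"] = pd.to_datetime(df["col1"], format="%Y-%m-%d %H:%M:%S:%f")

# Bucket rows into 5-second bins, then take the max and the first value per bin.
result = (
    df.groupby(pd.Grouper(key="col1", freq="5s"))["col2"]
    .agg(["max", "first"])
    .reset_index()
)
print(result)
```

The bins are left-closed, so the row at 04:10:05:987 falls into the 04:10:05 bin rather than the 04:10:00 one.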
Upvotes: 3
Reputation: 10893
I'll assume your first column is a datetime column named 'date_time' and your second column is named 'value'. To reach your objective you can use resample:
df.set_index('date_time').resample("5s").agg(['max','first'])
Note: I used 5 seconds because your data doesn't cover 5-minute intervals.
Result:
value
max first
date_time
2010-01-01 04:10:00 69 69
2010-01-01 04:10:05 5000 10
2010-01-01 04:10:10 13 13
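A self-contained sketch of the resample route on the question's sample data, using 'max' and 'first', the two aggregates the question asks for. The names date_time and value are the assumed column names from this answer; stamps and out are illustrative only. Selecting the value column before resampling is a small design choice that keeps the result's column labels flat instead of nested under 'value'.

```python
import pandas as pd

# Timestamps and values copied from the question.
stamps = [
    "2010-01-01 04:10:00:025", "2010-01-01 04:10:01:669",
    "2010-01-01 04:10:03:027", "2010-01-01 04:10:04:003",
    "2010-01-01 04:10:05:987", "2010-01-01 04:10:06:330",
    "2010-01-01 04:10:08:369", "2010-01-01 04:10:09:987",
    "2010-01-01 04:10:11:148",
]
df = pd.DataFrame({
    "date_time": pd.to_datetime(stamps, format="%Y-%m-%d %H:%M:%S:%f"),
    "value": [69, 1, 3, 8, 10, 99, 55, 5000, 13],
})

# resample needs a datetime index; picking the 'value' Series first
# yields plain 'max' and 'first' columns rather than a MultiIndex.
out = df.set_index("date_time")["value"].resample("5s").agg(["max", "first"])
print(out)
```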
Upvotes: 1