Reputation: 949
I have a large pandas DataFrame with the columns timestamp, name, and value:
index timestamp name value
0 1999-12-31 23:59:59.000107 A 16
1 1999-12-31 23:59:59.000385 B 12
2 1999-12-31 23:59:59.000404 C 25
3 1999-12-31 23:59:59.000704 B 15
4 1999-12-31 23:59:59.001281 A 300
5 1999-12-31 23:59:59.002211 C 20
6 1999-12-31 23:59:59.002367 C 3
I want to group by time bucket (say 20 ms or 20 minutes) and by name, then calculate the average value for each group.
What is the most efficient way to do this?
Upvotes: 9
Views: 9511
Reputation: 11034
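You can use pd.Grouper. By default it groups on the index, so the timestamps either need to be on the index or be named through its key argument, and in both cases they must have a datetime dtype. So you could try something like: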
df.set_index('timestamp').groupby([pd.Grouper(freq='20Min'), 'name']).mean()
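For a version you can run end to end, here is a minimal sketch built on the sample rows from the question. It assumes the timestamp column is (or can be converted to) a datetime dtype, and it uses the key argument of pd.Grouper instead of set_index. The variable names by_20min and by_20ms are only illustrative.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "1999-12-31 23:59:59.000107",
        "1999-12-31 23:59:59.000385",
        "1999-12-31 23:59:59.000404",
        "1999-12-31 23:59:59.000704",
        "1999-12-31 23:59:59.001281",
        "1999-12-31 23:59:59.002211",
        "1999-12-31 23:59:59.002367",
    ]),
    "name": ["A", "B", "C", "B", "A", "C", "C"],
    "value": [16, 12, 25, 15, 300, 20, 3],
})

# key= makes pd.Grouper bucket on the timestamp column directly,
# so the index does not have to be changed first.
by_20min = df.groupby(
    [pd.Grouper(key="timestamp", freq="20min"), "name"]
)["value"].mean()

# Same pattern with 20 ms buckets; only the freq string changes.
by_20ms = df.groupby(
    [pd.Grouper(key="timestamp", freq="20ms"), "name"]
)["value"].mean()

print(by_20min)
print(by_20ms)
```

Selecting ["value"] before calling mean() keeps the aggregation to the one numeric column. Because the sample spans less than 3 ms, both bucket sizes put every row in a single time bucket here, so each result has one row per name.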
Upvotes: 19