Reputation: 936
I have a dateframe that contains datetime and price.
Here is a sample chosen at random
In [2]: df
Out[2]:
price datetime
239035 5.05 2016-04-14 14:13:27
1771224 5.67 2016-08-30 14:19:47
2859140 4.00 2016-12-05 20:57:01
1311384 7.71 2016-07-08 18:16:22
141709 4.19 2016-04-07 13:30:00
2802527 3.94 2016-11-30 15:36:11
1411955 7.27 2016-07-20 13:55:20
2215987 4.87 2016-10-07 19:56:13
The datetime is accurate to the second.
I want to calculate the average price every 15 minutes starting at 9:00am and ending at 4:30pm, and store the new data into a new dataframe.
I could do it the old fashion way, make a list of all the 15 minute time intervals within 9am-4:30pm for each date, and iterate through each row of the CSV file, check its time and dump it into the appropriate bucket. Then find the average value for each bucket in each day.
But I was wondering if there is a nicer way to do this in panda. if not I'll just brute force my way through it...
Upvotes: 2
Views: 5497
Reputation: 11895
You can use DataFrame.resample:
df2 = df.resample(rule='15Min', on='datetime').mean()
Then you filter out the times you don't want using boolean indexing. It's better to work with a DateTimeIndex
:
df2 = df2.set_index('datetime', drop=False)
df2.between_time('9:00','16:30')
Upvotes: 6