Reputation: 1770
I have a pandas dataframe that looks like this
The data set spans several years of minute-level data.
What I'd like to do is: for each day, apply a function that takes the sum of all logvol between 14:40:00 and 15:00:00.
I have a feeling it has to do with the resample function but I'm not sure exactly how to use it.
I thought, perhaps:
def fn(x):
    # not sure how to pass a time slice into the function
    pass

data['logvol'].resample('D', how=fn)
Or:
data['logvol'].resample('D', how=lambda x: np.cumsum(x.between_time('14:40:00','15:00:00')))
Basically, I'm not sure what object is passed into fn(). Is it a single row (i.e., one minute in this case)? Or is it the set of all the minutes in the resampled day "D"?
Any hints in the right direction would be greatly appreciated.
Thanks!
Upvotes: 0
Views: 233
Reputation: 1770
I figured it out - I used:
data['logvol'].between_time('14:40:00','15:00:00').resample('D', how='sum')
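Note for anyone on a recent pandas: the `how=` argument to `resample` was deprecated and later removed, so the line above is now written as a method call on the resampler. Here is a minimal sketch of that form. The sample data (the date range and the all-ones `logvol` values) is made up for illustration and is not the original data set.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: two days of minute-level data (not the original data).
idx = pd.date_range("2013-01-02 09:30", "2013-01-03 16:00", freq="min")
data = pd.DataFrame({"logvol": np.ones(len(idx))}, index=idx)

# Filter to the time-of-day window first, then sum per calendar day.
# between_time includes both endpoints by default, so each day
# contributes 21 one-minute rows here.
window = data["logvol"].between_time("14:40:00", "15:00:00")
daily = window.resample("D").sum()

# Equivalent: resample first and slice inside the function.
# The function is called once per day with that day's rows as a Series.
daily_alt = data["logvol"].resample("D").apply(
    lambda day: day.between_time("14:40:00", "15:00:00").sum()
)

print(daily)
print(daily_alt)
```

The second form also answers the original question: the function receives a Series holding all the rows of one resampled day, not a single minute. Filtering first is usually simpler, and it avoids calling a Python function per day.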
Upvotes: 2