Reputation: 185
I am working with an hourly time series (Date, Time (hr), P) and trying to calculate the proportion of the daily total 'Amount' for each hour. I know I can use Pandas' resample('D', how='sum') to calculate the daily sum of P (DailyP), but in the same step I would like to use that daily sum to calculate each hour's proportion of it (so, P/DailyP) and end up with an hourly time series (i.e., the same frequency as the original). I am not sure if this can even be called 'resampling' in Pandas terms. This is probably apparent from my use of terminology, but I am an absolute newbie at Python, or programming for that matter. If anyone can suggest a way to do this, I would really appreciate it. Thanks!
Upvotes: 4
Views: 4485
Reputation: 139242
A possible approach is to reindex the daily sums back to the original hourly index (reindex) and fill the values forward (fillna/ffill), so that every hour gets the sum for its day:
df.resample('D').sum().reindex(df.index).ffill()
You can then divide your original dataframe by this result.
An example:
>>> import pandas as pd
>>> import numpy as np
>>>
>>> df = pd.DataFrame({'P' : np.random.rand(72)}, index=pd.date_range('2013-05-05', periods=72, freq='h'))
>>> df.resample('D').sum().reindex(df.index).ffill()
P
2013-05-05 00:00:00 14.049649
2013-05-05 01:00:00 14.049649
...
2013-05-05 22:00:00 14.049649
2013-05-05 23:00:00 14.049649
2013-05-06 00:00:00 13.483974
2013-05-06 01:00:00 13.483974
...
2013-05-06 23:00:00 13.483974
2013-05-07 00:00:00 12.693711
2013-05-07 01:00:00 12.693711
...
2013-05-07 22:00:00 12.693711
2013-05-07 23:00:00 12.693711
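To finish the calculation asked for in the question (P/DailyP), the division step could look roughly like the sketch below. It assumes a recent pandas version, a DatetimeIndex, and a single column named P; the variable names (daily_total, proportion, daily_check, alt) and the seeded random data are only illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three days of hourly values in a column named P.
rng = np.random.default_rng(0)
index = pd.date_range("2013-05-05", periods=72, freq="h")
df = pd.DataFrame({"P": rng.random(72)}, index=index)

# Daily totals, placed back on the hourly index and forward-filled
# so that every hour carries the sum of its own day.
daily_total = df.resample("D").sum().reindex(df.index).ffill()

# Each hour's share of its day's total (same hourly frequency as df).
proportion = df / daily_total

# Sanity check: within each day the shares should add up to 1.
daily_check = proportion.resample("D").sum()

# Alternative that does not go through reindex/ffill:
# group the rows by calendar day and broadcast the group sum.
alt = df["P"] / df.groupby(df.index.normalize())["P"].transform("sum")
```

Note that the reindex/ffill route relies on every day having a row at 00:00, because that is the timestamp the daily sums are labelled with. If some days start later, the groupby/transform variant is probably the safer choice, since it does not depend on that timestamp being present.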
Upvotes: 5