Reputation: 817
I have a situation where I want to calculate daily spend based on a contract start date, end date, and total spend. The next step is to sum up the daily spend amounts by month. This is quite easy, which I'll show I accomplish later. However, there's another date (inv date
); if the start
is before inv date
then all daily amounts prior to inv date
should be summed and included in the same period as the inv date
.
First, the easy part.
df = pd.DataFrame({'start': ['1/1/2018'],
'end': ['3/15/2018'],
'inv date': ['2/1/2018'],
'spend': [400]})
start end inv date spend
0 1/1/2018 3/15/2018 2/1/2018 400
Create the range of dates for the contract
prd = pd.period_range(df.loc[0, 'start'], df.loc[0, 'end'], freq='D')
prd = pd.Series(1, prd) # empty series to get the number of days in the monthly period
prd = prd.resample('M').size() * (df.loc[0, 'spend'] / prd.resample('M').size().sum())
Which gives me the following series: total spend allocated, by month, based on the number of days in the month...so far, so good.
prd
2018-01 167.567568
2018-02 151.351351
2018-03 81.081081
Freq: M, dtype: float64
The Hard Part (for me)
As I mentioned above, based on the inv date
, the series should actually look like this:
2018-01 0
2018-02 318.918919
2018-03 81.081081
Freq: M, dtype: float64
Because, the inv date
was after the start
date, so I don't want to add it until after inv date
. FYI: 318 = 167 + 151 (i.e., for the total in 2/2018 above).
Is this possible with resample
? If not, what's the most efficient way to accomplish this?
Upvotes: 1
Views: 677
Reputation: 11105
Starting with df
and prd
from the end of your "easy part" section:
# Get the inv date as a pandas Timestamp
invdate = pd.to_datetime(df['inv date'])[0]
oneday = pd.offsets.Day(1)
# Pandas slicing includes BOTH endpoints, so we need this one-day
# offset to get all values strictly before the inv date
prd.loc[invdate] = prd[invdate] + prd[:invdate - oneday].sum()
prd.loc[:invdate - oneday] = 0
# output
2018-01 0.000000
2018-02 318.918919
2018-03 81.081081
Freq: M, dtype: float64
Upvotes: 1