Reputation: 568
I have a multi-index dataframe below:
vessels_df.head(10)
eta_date
vessel load_dates
AM OCEAN SILVER 2020-06-05 2020-06-04
2020-06-06
2020-06-07
2020-06-08
2020-06-09
2020-06-10
2020-06-11
APJ ANGAD
2020-06-09 2020-06-08
2020-06-10
2020-06-11
AQUATONKA
2020-06-03 2020-06-02
2020-06-04
2020-06-05
2020-06-06
2020-06-07
2020-06-08
2020-06-09
2020-06-10
2020-06-11
and a dictionary with a list of daily charges that are incurred for every day beyond the FIRST day of the load_dates
demurrage_charges_dict = {
'AM OCEAN SILVER': 11076,
'APJ ANGAD': 21771,
'AQUATONKA': 14312
}
desired output
I would like to create a column that is the cumulative sum of the daily charge over the period in the index, for example:
eta_date demurrage_charges
vessel load_dates
AM OCEAN SILVER 2020-06-05 2020-06-04 0
2020-06-06 11,076
2020-06-07 22,152
2020-06-08 33,228
2020-06-09 44,304
2020-06-10 55,380
2020-06-11 66,456
I believe I could reset the index of the 'vessels_df'
, convert the demurrage_charges_dict
to df and merge the two then use pd.cumsum()
, but wondered if there is a more elegant way to perform this?
Much appreciated.
Upvotes: 0
Views: 200
Reputation: 59549
cumcount
the vessel index level and multiply that by the mapping of the vessel with the dict
idx = df.index.get_level_values('vessel')
df['demurrage_charges'] = (df.groupby(idx).cumcount()
* idx.map(demurrage_charges_dict))
eta_date demurrage_charges
vessel load_dates
AM OCEAN SILVER 2020-06-05 2020-06-04 0
2020-06-06 None 11076
2020-06-07 None 22152
2020-06-08 None 33228
2020-06-09 None 44304
2020-06-10 None 55380
2020-06-11 None 66456
APJ ANGAD 2020-06-09 2020-06-08 0
2020-06-10 None 21771
2020-06-11 None 43542
AQUATONKA 2020-06-03 2020-06-02 0
2020-06-04 None 14312
2020-06-05 None 28624
2020-06-06 None 42936
2020-06-07 None 57248
2020-06-08 None 71560
2020-06-09 None 85872
2020-06-10 None 100184
2020-06-11 None 114496
Upvotes: 3