Celius Stingher
Celius Stingher

Reputation: 18367

How to calculate inverse cumsum in pandas

I am trying to find a way to calculate an inverse cumsum for pandas. This means applying cumsum but from bottom to top. The problem I'm facing is, I'm trying to find the number of workable day for each month for Spain both from top to bottom (1st workable day = 1, 2nd = 2, 3rd = 3, etc...) and bottom to top (last workable day = 1, day before last = 2, etc...). So far I managed to get the top to bottom order to work but can't get the inverse order to work, I've searched a lot and couldn't find a way to perform an inverse cummulative sum:

import pandas as pd
from datetime import date
from workalendar.europe import Spain
import numpy as np
cal = Spain()
#print(cal.holidays(2019))
rng = pd.date_range('2019-01-01', periods=365, freq='D')
df = pd.DataFrame({ 'Date': rng})
df['flag_workable'] = df['Date'].apply(lambda x: cal.is_working_day(x))
df_workable = df[df['flag_workable'] == True]
df_workable['month'] = df_workable['Date'].dt.month
df_workable['workable_day'] = df_workable.groupby('month')['flag_workable'].cumsum()
print(df)
print(df_workable.head(30))

Output for January:

         Date  flag_workable  month  workable_day
1  2019-01-02           True      1           1.0
2  2019-01-03           True      1           2.0
3  2019-01-04           True      1           3.0
6  2019-01-07           True      1           4.0
7  2019-01-08           True      1           5.0

Example for last days of January:

         Date  flag_workable  month  workable_day
24 2019-01-25           True      1          18.0
27 2019-01-28           True      1          19.0
28 2019-01-29           True      1          20.0
29 2019-01-30           True      1          21.0
30 2019-01-31           True      1          22.0

This would be the expected output after applying the inverse cummulative:

         Date  flag_workable  month  workable_day  inv_workable_day
1  2019-01-02           True      1           1.0              22.0
2  2019-01-03           True      1           2.0              21.0
3  2019-01-04           True      1           3.0              20.0
6  2019-01-07           True      1           4.0              19.0
7  2019-01-08           True      1           5.0              18.0

Last days of January:

         Date  flag_workable  month  workable_day  inv_workable_day
24 2019-01-25           True      1          18.0               5.0
27 2019-01-28           True      1          19.0               4.0
28 2019-01-29           True      1          20.0               3.0
29 2019-01-30           True      1          21.0               2.0
30 2019-01-31           True      1          22.0               1.0

Upvotes: 2

Views: 3995

Answers (2)

CypherX
CypherX

Reputation: 7353

Solution

Whichever column you want to apply cumsum to you have two options:

  1. Order descending a copy of that column by index, followed by cumsum and then order ascending by index. Finally assign it back to the data frame column.

  2. Use numpy:

import numpy as np

array = df.column_data.to_numpy()    
array = np.flip(array)  # to flip the order 
array = np.cumsum(array)    
array = np.flip(array)  # to flip back to original order    
df.column_data_cumsum = array   

Upvotes: 3

ALollz
ALollz

Reputation: 59549

Invert the row order of the DataFrame prior to grouping so that the cumsum is calculated in reverse order within each month.

df['inv_workable_day'] = df[::-1].groupby('month')['flag_workable'].cumsum()
df['workable_day'] = df.groupby('month')['flag_workable'].cumsum()

#         Date  flag_workable  month  inv_workable_day  workable_day
#1  2019-01-02           True      1               5.0           1.0
#2  2019-01-03           True      1               4.0           2.0
#3  2019-01-04           True      1               3.0           3.0
#6  2019-01-07           True      1               2.0           4.0
#7  2019-01-08           True      1               1.0           5.0
#8  2019-02-01           True      2               1.0           1.0

Upvotes: 3

Related Questions