Reputation: 1987
I have a dataframe that looks like the following. The rightmost column is my desired column:
Group Value Target_CumSum
1 3 0
1 2 2
1 5 7
1 4 11
2 1 0
2 5 5
2 9 14
2 3 17
How do I perform the cumsum() from the second element of each group, as opposed to the very first one?
df = pd.DataFrame({'Group': [1,1,1,1,2,2,2,2], 'Value': [3,2,5,4,1,5,9,3], 'Target_CumSum': [0,2,7,11,0,5,14,17]})
#df['MyCumSum']= df.groupby(['Group'])['Value'].cumsum()
Upvotes: 1
Views: 177
Reputation: 153460
Just wanted to offer another solution:
df['Value'].where(df['Group'].duplicated(), 0).groupby(df.Group).cumsum()
Output:
0 0
1 2
2 7
3 11
4 0
5 5
6 14
7 17
Name: Value, dtype: int64
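In case the one-liner is hard to read, here is a rough step-by-step sketch of what it seems to be doing. The intermediate names (not_first, masked, result) are mine and not part of the answer; the data is the frame from the question.

```python
import pandas as pd

df = pd.DataFrame({'Group': [1, 1, 1, 1, 2, 2, 2, 2],
                   'Value': [3, 2, 5, 4, 1, 5, 9, 3]})

# duplicated() is False for the first row of each Group value, True afterwards
not_first = df['Group'].duplicated()

# where() keeps Value where the mask is True and substitutes 0 elsewhere,
# so the first element of every group is zeroed out
masked = df['Value'].where(not_first, 0)

# A plain per-group cumulative sum over the masked values
result = masked.groupby(df['Group']).cumsum()
print(result.tolist())  # [0, 2, 7, 11, 0, 5, 14, 17]
```

Since the mask comes from duplicated() and not from row position, this should also work when the rows of a group are not contiguous.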
Upvotes: 1
Reputation: 2647
In [87]: df.groupby(['Group']).apply(lambda x: x['Value'].shift(-1).cumsum().shift().fillna(0))
Out[87]:
Group
1      0     0.0
       1     2.0
       2     7.0
       3    11.0
2      4     0.0
       5     5.0
       6    14.0
       7    17.0
Name: Value, dtype: float64
Upvotes: 0
Reputation: 323306
IIUC, you can take the regular per-group cumsum and subtract each group's first value:
g=df.groupby('Group').Value
g.cumsum()-g.transform('first')
Out[597]:
0 0
1 2
2 7
3 11
4 0
5 5
6 14
7 17
Name: Value, dtype: int64
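A quick way to sanity-check this against the question's Target_CumSum column, as a minimal sketch. The column name MyCumSum is borrowed from the commented-out line in the question, and the variable matches is my own.

```python
import pandas as pd

df = pd.DataFrame({'Group': [1, 1, 1, 1, 2, 2, 2, 2],
                   'Value': [3, 2, 5, 4, 1, 5, 9, 3],
                   'Target_CumSum': [0, 2, 7, 11, 0, 5, 14, 17]})

g = df.groupby('Group')['Value']

# Full cumulative sum minus the group's first value equals
# the cumulative sum that starts from the second element
df['MyCumSum'] = g.cumsum() - g.transform('first')

# Compare element-wise with the expected column
matches = (df['MyCumSum'] == df['Target_CumSum']).all()
print(matches)  # True
```

One thing to watch for: as far as I know, 'first' picks the first non-null value in each group, so if a group can start with NaN the result may differ from what you expect.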
Upvotes: 3
Reputation: 2438
I don't think there's a built-in function for that, so you would have to write a custom function and apply it to each group. Hope it helps.
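import numpy as np

def custom_cumsum(x):
    x = x.copy()  # work on a copy so the original group is not mutated
    x.iloc[1:] = np.cumsum(x.iloc[1:].to_numpy())  # cumsum from the second element on
    x.iloc[0] = 0  # the first element of each group is always 0
    return x

# transform keeps the original index, so the result aligns with df
df['cumsum'] = df.groupby('Group')['Value'].transform(custom_cumsum)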
Upvotes: 1