Reputation: 601
I have this DataFrame in Python(assume that it sorted):
Col1 Col2 Col3
a 1 2
a 2 3
a 4 6
b 3 7
b 5 1
and I want to add a new column named Col4 where each value is the addition of the current Col3 and the previous row Col3.
It should run this logic by grouping Col1, and if it is the first row in the group it uses only the current value Col3.
So it should look like this:
Col1 Col2 Col3 Col4
a 1 2 2 (i.e. 2 - it is the first in first row group)
a 2 3 5 (i.e. 2+3)
a 4 6 9 (i.e. 3+6)
b 3 7 7 (i.e. 7 - it is the first row in second group)
b 5 1 8 (i.e. 7+1)
How can it be done in Python (hopefully using groupBy)?
Thanks
Upvotes: 2
Views: 1317
Reputation: 31171
You can use shift
per group:
df['Col4'] = df['Col3'] + df.groupby('Col1')['Col3'].shift(1).fillna(0)
>>> df
Col1 Col2 Col3 Col4
0 a 1 2 2.0
1 a 2 3 5.0
2 a 4 6 9.0
3 b 3 7 7.0
4 b 5 1 8.0
Upvotes: 2
Reputation: 323226
By using rolling
sum
df['col4']=df.groupby('Col1').Col3.rolling(2,min_periods=1).sum().values.astype(int)
df
Out[105]:
Col1 Col2 Col3 col4
0 a 1 2 2
1 a 2 3 5
2 a 4 6 9
3 b 3 7 7
4 b 5 1 8
Upvotes: 1