Reputation: 1612
I have created the following dataframe:
import pandas as pd

d = {'x': [0,0,1,1,1,1,1,2,2,2], 'y': [67,-5,78,47,88,12,-4,14,232,28]}
df = pd.DataFrame(data=d)
print(df)
which looks like this:
   x    y
0  0   67
1  0   -5
2  1   78
3  1   47
4  1   88
5  1   12
6  1   -4
7  2   14
8  2  232
9  2   28
I want to calculate a column "z" that holds the cumulative sum of column "y", computed separately for each value of column "x". In other words, the running total keeps accumulating as long as "x" stays the same. The resulting dataframe should look like this:
   x    y    z
0  0   67   67
1  0   -5   62
2  1   78   78
3  1   47  125
4  1   88  213
5  1   12  225
6  1   -4  221
7  2   14   14
8  2  232  246
9  2   28  274
So, whenever the value in column "x" changes, the cumulative sum starts over.
How can I do that in Python?
Upvotes: 0
Views: 4405
Reputation: 7903
cumsum, applied per group with groupby, is what you are looking for:
df['z'] = df.groupby('x')['y'].cumsum()
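For completeness, here is a self-contained sketch of that one-liner run against the data from the question. The second part is an assumption about what you might need, not something the question's data requires: groupby('x') puts all rows with the same x into one group even if they are not adjacent, so if a value of x could reappear later and the total should restart there too, you can group on consecutive runs instead. The names run_id and z_runs are made up for this illustration.

```python
import pandas as pd

d = {'x': [0, 0, 1, 1, 1, 1, 1, 2, 2, 2],
     'y': [67, -5, 78, 47, 88, 12, -4, 14, 232, 28]}
df = pd.DataFrame(data=d)

# Running total of y, restarted for each distinct value of x.
df['z'] = df.groupby('x')['y'].cumsum()

# Variant (an assumption, only relevant if x values can repeat
# non-contiguously): number each consecutive run of equal x values,
# then take the running total within each run.
run_id = (df['x'] != df['x'].shift()).cumsum()
df['z_runs'] = df.groupby(run_id)['y'].cumsum()

print(df)
```

With the data from the question, x is already sorted into contiguous blocks, so both columns come out identical; the two approaches only differ when the same x value shows up again after a different one.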
Upvotes: 1