Reputation: 12108
I have a DataFrame df like this:
    fruit  val1  val2
0  orange    15     3
1   apple    10    13
2   mango     5     5
How do I get Pandas to give me a cumulative sum and a cumulative percentage column based only on val1?
Desired output (df_with_cumsum):
    fruit  val1  val2  cum_sum  cum_perc
0  orange    15     3       15     50.00
1   apple    10    13       25     83.33
2   mango     5     5       30    100.00
I tried df.cumsum(), but it's giving me this error:
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
Upvotes: 76
Views: 106145
Reputation: 21
The above answer is good, but out of date. I have updated it so that it works.
df['cum_sum'] = df['val1'].cumsum()
df['cum_perc'] = round((df.cum_sum/df['val1'].sum())*100,2)
Upvotes: 0
Reputation: 251618
df['cum_sum'] = df['val1'].cumsum()
df['cum_perc'] = 100*df['cum_sum']/df['val1'].sum()
This will add the columns to df. If you want a copy, copy df first and then do these operations on the copy.
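A minimal sketch of the copy-first variant, assuming the sample data from the question. The name df_with_cumsum is borrowed from the question's desired output; the DataFrame construction is only there to make the snippet self-contained.

```python
import pandas as pd

# Sample data reproduced from the question
df = pd.DataFrame({
    "fruit": ["orange", "apple", "mango"],
    "val1": [15, 10, 5],
    "val2": [3, 13, 5],
})

# Work on a copy so the original df keeps only its three columns
df_with_cumsum = df.copy()

# Running total of val1, then that total as a percentage of the grand total
df_with_cumsum["cum_sum"] = df_with_cumsum["val1"].cumsum()
df_with_cumsum["cum_perc"] = (
    100 * df_with_cumsum["cum_sum"] / df_with_cumsum["val1"].sum()
)

print(df_with_cumsum)
```

Because cumsum() is called on the single column val1 rather than on the whole DataFrame, the string column fruit is never involved in the calculation.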
Upvotes: 131
Reputation: 91
It's a good answer, but it was written in 2014. I modified it slightly so that it runs without errors and the results look like the example in the question.
df['cum_sum'] = df["val1"].cumsum()
df['cum_perc'] = round(100*df.cum_sum/df["val1"].sum(),2)
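As an alternative sketch, the rounding can also be done with the Series.round method instead of the built-in round(). This assumes the sample data from the question, rebuilt here only so the snippet runs on its own.

```python
import pandas as pd

# Sample data reproduced from the question
df = pd.DataFrame({
    "fruit": ["orange", "apple", "mango"],
    "val1": [15, 10, 5],
    "val2": [3, 13, 5],
})

df["cum_sum"] = df["val1"].cumsum()

# Series.round(2) rounds every element to two decimal places
df["cum_perc"] = (100 * df["cum_sum"] / df["val1"].sum()).round(2)

print(df)
```

Both spellings should give the same values here; the method form just keeps the whole expression in pandas method-chaining style.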
Upvotes: 8