Reputation: 663
I have a dataframe like the following:
| int_col_1 || int_col2 || str_col3 || float_col4 || float_col5 |
|---------------||--------------||--------------||----------------||----------------|
| 123 || 456 || potato || 0.1 || -0.2 |
| 456 || 456 || potato || 0.1 || 5.0 |
| 456 || 456 || potato || 0.1 || -0.2 |
| ... || ... || ... || ... || ... |
I want to sum all float_col_4
and float_col5
rows that are above 0 per different int_col_1
and append the result as a new row.
So the new dataframe would look like this.
| int_col_1 || int_col2 || str_col3 || float_col4 || float_col5 |
|---------------||--------------||--------------||----------------||----------------|
| 123 || 789 || potato || 0.1 || -0.2 |
| Total || NULL || NULL || 0.1 || 0.0 |
| 456 || 734 || potato || 0.1 || 5.0 |
| 456 || 423 || potato || 0.1 || -0.2 |
| Total || NULL || NULL || 0.2 || 5.0 |
| ... || ... || ... || ... || ... |
How can I do this? I need that to be appended to the same dataframe for simplicity so I guess I will also have to cast the first column from int to str, or drop the "Total" to the str_col3
?
Thanks!
Upvotes: 3
Views: 208
Reputation: 862851
Use custom function in GroupBy.apply
for append new rows, for filter values above 0
is used DataFrame.clip
for replace lower values like 0
to 0
, create default index and replace mising values by DataFrame.fillna
with dict:
#if necessary sorting by first column
df = df.sort_values('int_col_1')
def f(x):
x.loc[-1] = x[['float_col4','float_col5']].clip(lower=0).sum()
#you can pass also mask function
#x.loc[-1] = x[['float_col4','float_col5']].mask(lambda x: x < 0, 0).sum()
return x
df = df.groupby('int_col_1').apply(f).reset_index(drop=True).fillna({'int_col_1':'Total'})
print (df)
int_col_1 int_col2 str_col3 float_col4 float_col5
0 123 456.0 potato 0.1 -0.2
1 Total NaN NaN 0.1 0.0
2 456 456.0 potato 0.1 5.0
3 456 456.0 potato 0.1 -0.2
4 Total NaN NaN 0.2 5.0
Upvotes: 2