Reputation: 1716
Two dataframes, I want to group by rows, column a and b. I want to sum all the rows that has 'x' and 'z' in column b and call it 'v'.
import pandas as pd
df = pd.DataFrame({'a': ['day1','day1','day1','day2','day2','day2'], 'b': ['x','z','y','x','z','y'], 'c':[9,1,6,0,5,6], 'd':[6,9,1,3,5,9]})
The result should be:
df2 = pd.DataFrame({'a': ['day1','day1','day2','day2'], 'b': ['v','y','v','y'], 'c':[10,6,5,6], 'd':[15,1,8,9]})
Upvotes: 1
Views: 1438
Reputation: 210832
In [106]: df.replace({'b':{'x':'v','z':'v'}}).groupby(['a','b'], as_index=False).sum()
Out[106]:
a b c d
0 day1 v 10 15
1 day1 y 6 1
2 day2 v 5 8
3 day2 y 6 9
or
In [114]: to_replace=['x','z']
In [115]: replace_with=['v','v']
In [116]: df.assign(b=df.b.replace(to_replace, replace_with)).groupby(['a','b'], as_index=False).sum()
Out[116]:
a b c d
0 day1 v 10 15
1 day1 y 6 1
2 day2 v 5 8
3 day2 y 6 9
Upvotes: 2