Sum rows of grouped data frame based on a specific column

Question

I have a data frame in which I would like to create a new column, df["NEW_Salary"], from the sum of different rows within each group. After grouping df by the columns Year, Month and Day, I want, for each group, to add the Salary of the rows where Combination is True to the Salary of the rows where Combination is False.

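    import pandas as pd

    # Sample data matching the table below (Month and Day are needed for the grouping)
    data = {"Year": [2002,2002,2002,2002,2002,2010,2010,2010,2010,2010,2002,2002,2002],
            "Month": [1,1,1,1,1,3,3,3,3,3,4,4,4],
            "Day": [15,15,15,15,15,20,20,20,20,20,5,5,5],
            "Name": ['Jason','Tom','KimJason','KimTom','Kim','Jason','Tom','KimJason','KimTom','Kim','Mary','MaryTom','Tom'],
            "Combination": [False,False,True,True,False,False,False,True,True,False,False,True,False],
            "Salary": [10,20,25,25,30,20,30,35,35,40,10,20,30]
            }
    df = pd.DataFrame(data)  # the class is DataFrame, not dataframe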

    Year  Month  Day      Name  Combination  Salary
0   2002      1   15     Jason        False      10
1   2002      1   15       Tom        False      20
2   2002      1   15  KimJason         True      25
3   2002      1   15    KimTom         True      25
4   2002      1   15       Kim        False      30
5   2010      3   20     Jason        False      20
6   2010      3   20       Tom        False      30
7   2010      3   20  KimJason         True      35
8   2010      3   20    KimTom         True      35
9   2010      3   20       Kim        False      40
10  2002      4    5      Mary        False      10
11  2002      4    5   MaryTom         True      20
12  2002      4    5       Tom        False      30

df["NEW_Salary"] would be created as follows:

For the row where Name is KimJason, its Salary would be added to the Salary of the rows where Name is Kim and Jason.
For the row where Name is KimTom, its Salary would likewise be added to the Salary of the rows where Name is Kim and Tom.
The KimTom and KimJason rows themselves would have the same value in the new column NEW_Salary as in Salary.

The expected output:

    Year  Month  Day      Name  Combination  Salary  NEW_Salary
0   2002      1   15     Jason        False      10          35
1   2002      1   15       Tom        False      20          45
2   2002      1   15  KimJason         True      25          25
3   2002      1   15    KimTom         True      25          25
4   2002      1   15       Kim        False      30          80
5   2010      3   20     Jason        False      20          55
6   2010      3   20       Tom        False      30          65
7   2010      3   20  KimJason         True      35          35
8   2010      3   20    KimTom         True      35          35
9   2010      3   20       Kim        False      40         110
10  2002      4    5      Mary        False      10          30
11  2002      4    5   MaryTom         True      20          20
12  2002      4    5       Tom        False      30          50

Is there an easy way to achieve this output, no matter how many groups I have?
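
For reference, here is a minimal sketch of one way the rule described above could be implemented. It is not a definitive solution. It assumes that a combined row belongs to a single person whenever that person's Name appears as a substring of the combined Name (so "Tom" matches "KimTom"), which holds for the sample data but would give false matches for names such as "Tom" and "Tommy". The variable names `new_salary`, `combos` and `extra` are made up for illustration.

```python
import pandas as pd

# Sample data reproduced from the table in the question
data = {
    "Year": [2002] * 5 + [2010] * 5 + [2002] * 3,
    "Month": [1] * 5 + [3] * 5 + [4] * 3,
    "Day": [15] * 5 + [20] * 5 + [5] * 3,
    "Name": ["Jason", "Tom", "KimJason", "KimTom", "Kim",
             "Jason", "Tom", "KimJason", "KimTom", "Kim",
             "Mary", "MaryTom", "Tom"],
    "Combination": [False, False, True, True, False,
                    False, False, True, True, False,
                    False, True, False],
    "Salary": [10, 20, 25, 25, 30, 20, 30, 35, 35, 40, 10, 20, 30],
}
df = pd.DataFrame(data)

# Start from the plain Salary; combination rows keep this value unchanged
new_salary = df["Salary"].copy()

for _, group in df.groupby(["Year", "Month", "Day"]):
    combos = group[group["Combination"]]
    # Visit every non-combination row of this group
    for idx, name in group.loc[~group["Combination"], "Name"].items():
        # Assumption: substring match links a single name to a combined name
        extra = sum(
            salary
            for combo_name, salary in zip(combos["Name"], combos["Salary"])
            if name in combo_name
        )
        new_salary.loc[idx] += extra

df["NEW_Salary"] = new_salary
print(df)
```

The loop runs once per group, so the number of groups does not matter. For large frames a vectorised approach would likely be faster, but the explicit loop keeps the matching rule easy to see and to replace.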
