python_noob_5
python_noob_5

Reputation: 51

How to add rows as sums of other rows in DataFrame?

I'm not sure I titled this post correctly but I have a unique situation where I want to append a new set of rows to an existing DataFrame as a sum of rows from existing sets and I'm not sure where to start.

For example, I have the following DataFrame:

import pandas as pd

data = {'Team': ['Atlanta', 'Atlanta', 'Cleveland', 'Cleveland'],
        'Position': ['Defense', 'Kicker', 'Defense', 'Kicker'],
        'Points': [5, 10, 15, 20]}

df = pd.DataFrame(data)

print(df)

        Team Position  Points
0    Atlanta  Defense       5
1    Atlanta   Kicker      10
2  Cleveland  Defense      15
3  Cleveland   Kicker      20

How do I create/append new rows which create a new position for each team and sum the points of the two existing positions for each team? Additionally, the full dataset consists of several more teams so I'm looking for a solution that will work for any number of teams.

edit: I forgot to include that there are other positions in the complete DataFrame; but, I only want this solution to be applied for the positions "Defense" and "Kicker".

My desired output is below.

            Team Position  Points
        Team          Position  Points
0    Atlanta           Defense       5
1    Atlanta            Kicker      10
2  Cleveland           Defense      15
3  Cleveland            Kicker      20
4    Atlanta  Defense + Kicker      15
5  Cleveland  Defense + Kicker      35

Thanks in advance!

Upvotes: 1

Views: 261

Answers (2)

F Humphreys
F Humphreys

Reputation: 41

pandas.DataFrame.append is deprecated since version 1.4.0. Henry Ecker's neat solution just needs a slight tweak to use concat instead.

whitelisted_positions = ['Kicker', 'Defense']
df = pd.concat([df,
                df[df['Position'].isin(whitelisted_positions)]
                .groupby('Team', as_index=False).agg({
                    'Position': ' + '.join,  # Concat Strings together
                    'Points': 'sum'  # Total Points
                })],
               ignore_index=True)

Upvotes: 0

Henry Ecker
Henry Ecker

Reputation: 35646

We can use groupby agg to create the summary rows then append to the DataFrame:

df = df.append(df.groupby('Team', as_index=False).agg({
    'Position': ' + '.join,  # Concat Strings together
    'Points': 'sum'  # Total Points
}), ignore_index=True)

df:

        Team          Position  Points
0    Atlanta           Defense       5
1    Atlanta            Kicker      10
2  Cleveland           Defense      15
3  Cleveland            Kicker      20
4    Atlanta  Defense + Kicker      15
5  Cleveland  Defense + Kicker      35

We can also whitelist certain positions by filtering df before groupby to aggregate only the desired positions:

whitelisted_positions = ['Kicker', 'Defense']
df = df.append(
    df[df['Position'].isin(whitelisted_positions)]
        .groupby('Team', as_index=False).agg({
        'Position': ' + '.join,  # Concat Strings together
        'Points': 'sum'  # Total Points
    }), ignore_index=True
)

Upvotes: 1

Related Questions