G H
G H

Reputation: 116

Python Pandas Average and Sum conflicts

I have a table that looks like this:

table

The average and total rows are being calculated like so:

df1.loc["Average"] = df1.mean()
df1.loc["Total"] = df1.sum()

Now, I realized that the problem here is that the Average is calculating properly, but the sum is including the Average row as well, which is not what I want.

Ideally, I'd like to see something more like a single .loc row that has sum() applied to

`columns` ['Enageable R', 'R Responses', 'R Response Rate', 'Engageable Q',
                                'Q Responses', 'Q Response Rate']

and mean() applied to columns ['R Response Rate', 'Q Response Rate']

So I would love to see something like this:

Brand Engageable R R Responses Response Rate
Brand1 34 34 100.00%
Brand2 34 34 100.00%
Brand3 34 34 100.00%
Total 102 102 100.00%

Upvotes: 2

Views: 1085

Answers (3)

jezrael
jezrael

Reputation: 862611

Use DataFrame.agg for processing both function separately and then add new rows by DataFrame.append:

df = df.append(df.agg(['sum','mean']).rename({'sum':'Total','mean':'Average'}))

If need procesing only some columns:

cols = ['Enageable R', 'R Responses', 'R Response Rate', 'Engageable Q',
                    'Q Responses', 'Q Response Rate']

df = df.append(df[cols].agg(['sum','mean']).rename({'sum':'Total','mean':'Average'}))

Upvotes: 1

G H
G H

Reputation: 116

Here's what I found, with some credit due to @Wilian for the recommendation, although if there's a more efficient way to do this, let me know.

So I have a df1, and I created a blank df2 using:

df2 = pd.DataFrame(columns=['Enageable R', 'R Responses', 'R Response Rate', 'Engageable Q',
                        'Q Responses', 'Q Response Rate'])

Then I filled df2 with:

df2.loc["Average"] = df1.mean()
df2.loc["Total"] = df1.sum()

Then I appended df2 to df1

df1 = df1.append(df2)

Which created the following:

tablerevised

So the sum and average columns are not including each other in the calculations of the other rows.

Upvotes: 0

Wilian
Wilian

Reputation: 1257

Use a copy:

df2 = df1.copy(deep=True)

df2["Average"] = df1.mean()
df2["Total"] = df1.sum()

EDIT

your post is a little confused.

df1["Total"] = df1.sum()
df1["Average"] = df1[['R Response Rate', 'Q Response Rate']].mean(axis=1)

Upvotes: 0

Related Questions