Enigmatic

Reputation: 4148

Calculation for more than one column in data frame

Target

I have only been using Pandas for a few days and am trying to use .loc and .mean() to calculate the average for multiple columns, and represent the value in a new row under the columns.


My Attempt

To find the average of a single column, I use the following code:

df.loc['Average', 'Column1'] = df['Column1'].mean()

The output is as expected.


However, when I try to include a second column, like this:

df.loc['Average', 'Column1', 'Column2'] = df['Column1', 'Column2'].mean()

I get the following error:

KeyError: ('Column1', 'Column2')

I'm assuming there is a very easy solution - I'm just pretty new at this stage.


Expected Output:

# ...... is replaced with numbers

            Column1       Column2
1           .......       .......
2           .......       .......
3           .......       .......
...         .......       .......
Average     #SomeFloat    #AnotherFloat

Upvotes: 1

Views: 523

Answers (2)

Prune

Reputation: 77847

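The KeyError comes from the column selection, not from mean: df['Column1', 'Column2'] passes a single tuple key, which pandas looks up as one column label. The most direct way to get the result you want is to select with a list of columns: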

df[['Column1', 'Column2']].mean()

You could also compute each column's mean individually and assign them one at a time, but that's more typing.
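As a rough illustration of the difference between the two selections, here is a minimal sketch. The sample values are made up; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({"Column1": [1.0, 2.0, 3.0], "Column2": [10.0, 20.0, 30.0]})

# df['Column1', 'Column2'] is equivalent to df[('Column1', 'Column2')]:
# a single tuple key, which does not match any column label here.
try:
    df["Column1", "Column2"]
    tuple_lookup_failed = False
except KeyError:
    tuple_lookup_failed = True

# A list selects several columns; mean() then returns one value per column
# as a Series indexed by column name.
col_means = df[["Column1", "Column2"]].mean()
print(col_means)
```

The result is a Series, so col_means['Column1'] gives the mean of that column alone.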

Upvotes: 1

akuiper

Reputation: 214957

You need to wrap multiple column names in a list:

df.loc['Average', ['Column1', 'Column2']] = df[['Column1', 'Column2']].mean()
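A small end-to-end sketch of this, assuming a made-up frame with the question's column names. The means are computed first and then assigned, which keeps the new 'Average' row out of the calculation.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame(
    {"Column1": [1.0, 2.0, 3.0], "Column2": [10.0, 20.0, 30.0]},
    index=[1, 2, 3],
)

cols = ["Column1", "Column2"]

# Per-column means, computed before the new row exists.
means = df[cols].mean()

# .loc creates the 'Average' row; the Series is aligned to the listed columns by name.
df.loc["Average", cols] = means
print(df)
```

If the frame has other columns that are not listed in cols, they should end up as NaN in the 'Average' row.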

Upvotes: 2
