How to perform groupby in Pandas and compute mean of each row in original dataset

Question

I have a spreadsheet which has data in the following format:

Brand | Model    | Year | Cost  | Tax
--------------------------------------
Apple | iPhone 7 | 2017 | $1000 | $100
Apple | iphone 7 | 2018 | $800  |  $80
Xiomi | Note 5   | 2017 | $300  |  $30
Xiomi | Note 5   | 2018 | $200  |  $20

I want to transform this data set into the following: a Mean column holding the mean of Cost for each ['Brand', 'Model'] group, and a Result column that is the sum of the Mean and Tax values:

Brand | Model    | Year | Cost  | Mean | Tax  | Result
-------------------------------------------------------
Apple | iPhone 7 | 2017 | $1000 | $900 | $100 | $1000
Apple | iphone 7 | 2018 | $800  | $900 | $80  | $980
Xiomi | Note 5   | 2017 | $300  | $250 | $30  | $280
Xiomi | Note 5   | 2018 | $200  | $250 | $20  | $270

I have been trying the groupby function but have not found a way to get the result above.

Looking forward to your responses. Thank you.

jezrael · Accepted Answer

First convert the values to integers with replace, compute the per-group mean with transform, then sum Mean and Tax. Finally, convert back to strings if necessary:

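cols = ['Cost','Tax']
# strip the leading '$' (raw string avoids an invalid-escape warning) and cast to int
df[cols] = df[cols].replace(r'\$','', regex=True).astype(int)
# transform('mean') broadcasts each group's mean back onto its rows
df['Mean'] = df.groupby(['Brand', 'Model'])['Cost'].transform('mean')

df['Result'] = df[['Mean','Tax']].sum(axis=1)
print(df)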
   Brand     Model  Year  Cost  Tax  Mean  Result
0  Apple  iPhone 7  2017  1000  100  1000    1100
1  Apple  iphone 7  2018   800   80   800     880
2  Xiomi    Note 5  2017   300   30   250     280
3  Xiomi    Note 5  2018   200   20   250     270
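
The Apple rows above get a Mean of 1000 and 800 rather than 900 because 'iPhone 7' and 'iphone 7' differ in case, so groupby treats them as two separate groups. If they are meant to be the same model, one possible approach is to group on a lower-cased copy of Model. The sketch below is self-contained and rebuilds the sample data from the question; the name model_key is a hypothetical helper, not part of the original answer:

```python
import pandas as pd

# Sample data copied from the question (note 'iPhone 7' vs 'iphone 7').
df = pd.DataFrame({
    'Brand': ['Apple', 'Apple', 'Xiomi', 'Xiomi'],
    'Model': ['iPhone 7', 'iphone 7', 'Note 5', 'Note 5'],
    'Year': [2017, 2018, 2017, 2018],
    'Cost': ['$1000', '$800', '$300', '$200'],
    'Tax': ['$100', '$80', '$30', '$20'],
})

cols = ['Cost', 'Tax']
# Strip the dollar sign and convert to integers.
df[cols] = df[cols].replace(r'\$', '', regex=True).astype(int)

# Hypothetical helper key: lower-cased model name, used only for grouping,
# so the Model column itself keeps its original spelling.
model_key = df['Model'].str.lower()

# transform returns one value per original row (aligned to df's index),
# unlike .mean(), which would return one row per group.
df['Mean'] = df.groupby([df['Brand'], model_key])['Cost'].transform('mean')
df['Result'] = df['Mean'] + df['Tax']
print(df)
```

With the case normalized, both Apple rows share one group, which matches the Mean the question asks for.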

And then:

cols1 = cols + ['Result', 'Mean']
# prefix every monetary column with '$' again
df[cols1] = '$' + df[cols1].astype(str)
print(df)
   Brand     Model  Year   Cost   Tax   Mean Result
0  Apple  iPhone 7  2017  $1000  $100  $1000  $1100
1  Apple  iphone 7  2018   $800   $80   $800   $880
2  Xiomi    Note 5  2017   $300   $30   $250   $280
3  Xiomi    Note 5  2018   $200   $20   $250   $270
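
One caveat: in recent pandas versions transform('mean') typically returns floats, so astype(str) can produce strings such as '$900.0'. A minimal sketch of an alternative, assuming whole-dollar display is wanted, is to apply a format string and keep the formatted copy separate from the numeric data. The small frame and the names money_cols and formatted below are made up for illustration:

```python
import pandas as pd

# Hypothetical numeric frame standing in for the result of the steps above.
df = pd.DataFrame({
    'Cost': [1000, 800],
    'Tax': [100, 80],
    'Mean': [900.0, 900.0],
    'Result': [1000.0, 980.0],
})

money_cols = ['Cost', 'Tax', 'Mean', 'Result']
# Format each value as whole dollars; '{:.0f}' drops the trailing '.0' of floats.
# The result is a separate frame, so df stays numeric for further calculations.
formatted = df[money_cols].apply(lambda s: s.map('${:.0f}'.format))
print(formatted)
```

Keeping the numeric columns untouched means later sums or means do not need another round of stripping '$'.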
