Malavika Venkatesh

Reputation: 49

Group by ID and perform calculations

I have a dataset that looks like this.

ID  BRAND_NAME  PRICE  COLOR  ACTUAL  PREDICTED  MAPE  ACCURACY
1   mjjk        657    Black  0       1          0     1
1   mjjk        657    Black  1       1          0     1

I want my dataset to be grouped by ID so that it looks like this:

ID  BRAND_NAME  PRICE  COLOR  ACTUAL  PREDICTED  MAPE  ACCURACY
1   mjjk        657    Black  1       2          1     0

Here the ACTUAL and PREDICTED columns hold the sums of the actual and predicted values for each ID, and MAPE and ACCURACY are calculated from those sums: MAPE = (PREDICTED - ACTUAL) / ACTUAL and ACCURACY = 1 - MAPE.

I am stuck on how to group by ID and compute the corresponding values.

Upvotes: 0

Views: 71

Answers (2)

user7864386

Use groupby.agg. Group by ID and take the first of 'BRAND_NAME', 'PRICE', and 'COLOR' and sum of 'ACTUAL' and 'PREDICTED'. Then do computations on the output dataframe to obtain 'MAPE' and 'ACCURACY' columns:

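# Keep the first value of the descriptive columns and sum the counts per ID
out = df.groupby('ID').agg({'BRAND_NAME': 'first', 'PRICE': 'first', 'COLOR': 'first',
                            'ACTUAL': 'sum', 'PREDICTED': 'sum'}).reset_index()
# Derive the metrics from the summed columns
out['MAPE'] = (out['PREDICTED'] - out['ACTUAL']) / out['ACTUAL']
out['ACCURACY'] = 1 - out['MAPE']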

Output:

   ID BRAND_NAME  PRICE  COLOR  ACTUAL  PREDICTED  MAPE  ACCURACY
0   1       mjjk    657  Black       1          2   1.0       0.0
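
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the two sample rows from the question and applies the same groupby.agg idea, written with named aggregation. The DataFrame construction and the helper name `summarize_by_id` are assumptions added for illustration, not part of the original answer.

```python
import pandas as pd

# Sample rows copied from the question (MAPE/ACCURACY are recomputed below)
df = pd.DataFrame({
    "ID": [1, 1],
    "BRAND_NAME": ["mjjk", "mjjk"],
    "PRICE": [657, 657],
    "COLOR": ["Black", "Black"],
    "ACTUAL": [0, 1],
    "PREDICTED": [1, 1],
})


def summarize_by_id(frame):
    """Hypothetical helper: one row per ID with summed counts and derived metrics."""
    out = (
        frame.groupby("ID")
        .agg(
            BRAND_NAME=("BRAND_NAME", "first"),
            PRICE=("PRICE", "first"),
            COLOR=("COLOR", "first"),
            ACTUAL=("ACTUAL", "sum"),
            PREDICTED=("PREDICTED", "sum"),
        )
        .reset_index()
    )
    # Metrics use the per-ID sums, as defined in the question
    out["MAPE"] = (out["PREDICTED"] - out["ACTUAL"]) / out["ACTUAL"]
    out["ACCURACY"] = 1 - out["MAPE"]
    return out


out = summarize_by_id(df)
print(out)
```

Taking 'first' for BRAND_NAME, PRICE, and COLOR assumes those values are constant within each ID, as they are in the sample data.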

Upvotes: 1

Mayank Porwal

Reputation: 34046

Use GroupBy.agg:

In [121]: res = df.groupby(['ID', 'BRAND_NAME', 'COLOR'], as_index=False).agg({'PRICE': 'max', 'ACTUAL': 'sum', 'PREDICTED': 'sum'})

In [123]: res['MAPE'] = (res.PREDICTED - res.ACTUAL).div(res.ACTUAL)
In [125]: res['ACCURACY'] = 1 - res.MAPE

In [126]: res
Out[126]: 
   ID BRAND_NAME  COLOR  PRICE  ACTUAL  PREDICTED  MAPE  ACCURACY
0   1       mjjk  Black    657       1          2   1.0       0.0
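
One caveat with the formula as given: it divides by the summed ACTUAL, so any ID whose ACTUAL values sum to zero produces inf or NaN. The sketch below shows one possible way to handle that, assuming an undefined MAPE should simply be reported as NaN. The helper name `add_mape_accuracy` and the second sample row are hypothetical and only serve to show the zero case.

```python
import pandas as pd


def add_mape_accuracy(res):
    """Hypothetical helper: add MAPE/ACCURACY, leaving NaN where ACTUAL is 0."""
    res = res.copy()
    # Mask zero denominators so the division yields NaN instead of inf
    safe_actual = res["ACTUAL"].where(res["ACTUAL"] != 0)
    res["MAPE"] = (res["PREDICTED"] - res["ACTUAL"]) / safe_actual
    res["ACCURACY"] = 1 - res["MAPE"]
    return res


# Already-aggregated frame; ID 2 is made up to illustrate a zero ACTUAL sum
res = pd.DataFrame({"ID": [1, 2], "ACTUAL": [1, 0], "PREDICTED": [2, 3]})
res = add_mape_accuracy(res)
print(res)
```

Whether NaN, 0, or some other sentinel is the right choice depends on how the metric is consumed downstream.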

Upvotes: 1
