Reputation: 673
I have an excel data in below format:
Original Data Frame
Package FISCAL_YR SCENARIO PERIOD USD_AMT LY_USD_AMT CY_NetSales LY_NetSales
Canada 2021 Plan Per01 1.00 2.00 3.00 4.00
Africa 2021 Actual Per04 1.00 2.00 3.00 4.00
Africa 2021 Actual Per09 1.00 2.00 3.00 4.00
Brazil 2021 Plan Per11 1.00 2.00 3.00 4.00
Brazil 2021 Actual Per05 1.00 2.00 3.00 4.00
Africa 2021 Actual Per07 1.00 2.00 3.00 4.00
Mexico 2021 Plan Per10 1.00 2.00 3.00 4.00
Canada 2021 Actual Per02 1.00 2.00 3.00 4.00
In order to ease out my calculations, i am trying to get the values in scenario column to header with values from the last 4 columns to be appropriately assigned:
Expected dataframe:
Actual Plan
Package Sum of USD_AMT Sum of LY_USD_AMT Sum of CY_NetSales Sum of LY_NetSales Sum of USD_AMT Sum of LY_USD_AMT Sum of CY_NetSales Sum of LY_NetSales
Africa 3 6 9 12
Brazil 2 4 6 8
Canada 1 2 3 4 1 2 3 4
Mexico 1 2 3 4
I am trying the Pivot table option for this in pandas but it is rending the below output:
Failed Solution:
pd_piv=pd.pivot_table(df_dummy,index=['Package', 'FISCAL_YR', 'PERIOD'],
columns=['SCENARIO'],
values=['USD_AMT', 'LY_USD_AMT', 'CY_NetSales', 'LY_NetSales'], aggfunc=np.sum, fill_value=0)
pd_piv.head()
CY_NetSales LY_NetSales LY_USD_AMT USD_AMT
SCENARIO Actual Plan WRKG_FCST Actual Plan WRKG_FCST Actual Plan WRKG_FCST Actual Plan WRKG_FCST
Package_SubCategory FISCAL_YR_NBR FISCAL_PERIOD_NBR
*numbers are not shown since the actual data is quite different
Is there anyway to get the above shown expected dataframe?
Upvotes: 0
Views: 31
Reputation: 2819
Maybe this is what you are looking for: 1- Make your pivot table:
import pandas as pd
import numpy as np
data={"package":["Canada","Africa","Africa","Brazil","Brazil","Africa","Mexico","Canada"],
"scenario":["Plan","Actual","Actual","Plan","Actual","Actual","Plan","Actual"],
"USD_AMT":[1,1,1,1,1,1,1,1,],
"LY_USD_AMT":[1,1,1,1,1,1,1,1,]}
df=pd.DataFrame(data)
pd_piv=pd.pivot_table(df,index=['package'],
columns=['scenario'],
values=['USD_AMT', 'LY_USD_AMT',], aggfunc=np.sum,fill_value=0)
Result:
LY_USD_AMT USD_AMT
scenario Actual Plan Actual Plan
package
Africa 3 0 3 0
Brazil 1 1 1 1
Canada 1 1 1 1
Mexico 0 1 0 1
2- swap the index levels:
pd_piv.columns=pd_piv.columns.swaplevel(0, 1)
pd_piv.sort_index(axis=1, level=0, inplace=True)
Final result:
scenario Actual Plan
LY_USD_AMT USD_AMT LY_USD_AMT USD_AMT
package
Africa 3 3 0 0
Brazil 1 1 1 1
Canada 1 1 1 1
Mexico 0 0 1 1
Upvotes: 1