Reputation: 396
I am having trouble with constructing a Pandas pivot table.
I wish to have two values ['Balance', 'WAP']
under the same column ['Delivery']
.
Here is the DataFrame constructed from a dictionary:
dict_data = {
'Contract' : ['Contract 1', 'Contract 2', 'Contract 3', 'Contract 4'],
'Contract_Date': ['01/01/2019', '02/02/2019', '03/03/2019', '04/03/2019'],
'Delivery' : ['2019-01', '2019-01', '2019-02', '2019-03'],
'Price' : [90, 95, 100, 105],
'Balance': [50, 100, 150, 200]
}
df = pd.DataFrame.from_dict(dict_data)
df
The DataFrame:
Contract Contract_Date Delivery Price Balance
0 Contract 1 01/01/2019 2019-01 90 50
1 Contract 2 02/02/2019 2019-01 95 100
2 Contract 3 03/03/2019 2019-02 100 150
3 Contract 4 04/03/2019 2019-03 105 200
Calculate weighted average price:
# Create WAP - Weighted Average Price
df['Value'] = df['Balance'] * df['Price']
df['WAP'] = df['Value'] / df['Balance']
df
Pivot table construction:
# Use a dictionary to apply more than 1 type of aggregate onto the data
f = {'Balance': ['sum'], 'WAP': ['mean']}
df.pivot_table(
columns='Delivery',
values=['Balance', 'WAP'],
index=['Contract_Date', 'Contract'],
aggfunc=f
).replace(np.nan, '')
I am trying to get the 2 values to show up under the same column, for easier comparison, like the below table (manually constructed):
Delivery 2019-01 2019-02 2019-03
Contract Date Contract Balance WAP Balance WAP Balance WAP
01/01/2019 Contract 1 50 90
02/02/2019 Contract 2 100 95
03/03/2019 Contract 3 150 100
04/03/2019 Contract 4 200 105
Am thinking somewhere along the lines of stack/unstack for this problem? Would greatly appreciate any help as I am still rather new to Pandas..
Upvotes: 1
Views: 2424
Reputation: 862651
First convert from dictionary one elemment lists to strings for avoid 3 level MultiIndex:
f = {'Balance': 'sum', 'WAP': 'mean'}
And then use DataFrame.swaplevel
with DataFrame.sort_index
:
f = {'Balance': 'sum', 'WAP': 'mean'}
df = (df.pivot_table(
columns='Delivery',
values=['Balance', 'WAP'],
index=['Contract_Date', 'Contract'],
aggfunc=f
).replace(np.nan, '')
.swaplevel(1,0, axis=1)
.sort_index(axis=1))
print (df)
Delivery 2019-01 2019-02 2019-03
Balance WAP Balance WAP Balance WAP
Contract_Date Contract
01/01/2019 Contract 1 50 90
02/02/2019 Contract 2 100 95
03/03/2019 Contract 3 150 100
04/03/2019 Contract 4 200 105
Upvotes: 2