Reputation: 587
Given the following df
,
Account contract_date type item_id quantity price tax net_amount
ABC123 2020-06-17 P 1409 1000 0.355 10 400
ABC123 2020-06-17 S 1409 2000 0.053 15 150
ABC123 2020-06-17 C 1409 500 0.25 5 180
ABC123 2020-06-17 S 1370 5000 0.17 30 900
DEF456 2020-06-18 P 7214 3000 0.1793 20 600
I would like to turn df
, grouped by Account, contract_date and item_id. Then split the values of different types into different column. Intended results are as follows. I can do this with for loop/apply, but would like to seek for suggestion for groupby or pivot or any vectorized/pythonic solution to this. Intended results are as follows:
Account contract_date item_id quantity_P quantity_S quantity_C price_P price_S price_C tax_P tax_S tax_C net_amount_P net_amount_S net_amount_C
ABC123 2020-06-17 1409 1000 2000 500 0.355 0.053 0.25 10 15 5 400 150 180
ABC123 2020-06-17 1370 0 5000 0 0 0.17 0 0 30 0 0 900 0
DEF456 2020-06-18 7214 3000 0 0 0.1793 0 0 20 0 0 600 0 0
*Although it looks a bit off for the alignment, you may copy the df
and use df = pd.read_clipboard()
to read the table. Appreciate your help. Thank you.
Edit: The error I am getting using df.pivot(index=['Account', 'contract_date', 'item_id'], columns=['type'])
Upvotes: 1
Views: 257
Reputation: 34086
Use df.pivot
:
In [1660]: df.pivot(index=['Account', 'contract_date', 'item_id'], columns=['type'])
Out[1660]:
quantity price tax net_amount
type C P S C P S C P S C P S
Account contract_date item_id
ABC123 2020-06-17 1370 NaN NaN 5000.0 NaN NaN 0.170 NaN NaN 30.0 NaN NaN 900.0
1409 500.0 1000.0 2000.0 0.25 0.3550 0.053 5.0 10.0 15.0 180.0 400.0 150.0
DEF456 2020-06-18 7214 NaN 3000.0 NaN NaN 0.1793 NaN NaN 20.0 NaN NaN 600.0 NaN
Upvotes: 2