Reputation: 10851
I have the following dataframe:
DECLARANT PARTNER PRODUCT PERIOD VALUE KG
FR XC 1001 200501 10 20
FR XC 1002 200501 11 21
FR XC 1001 200502 12 22
FR XC 1002 200502 13 23
FR XC 1003 200502 14 24
And I want it transformed into something like this:
DECLARANT PARTNER PRODUCT VALUE KG
FR XC 1001 [10,12] [20,22]
FR XC 1002 [11,13] [21,23]
FR XC 1003 [nan,14] [nan,24]
That is, the VALUE
and KG
is pivoted based on PERIOD
, and if it doesn't exist in the PERIOD
it should be replaced with nan
.
I've tried pivot
and groupby
. But don't seem to get them to work together. This kind-of gives me something but not really what I'm looking for:
df.pivot_table(index=['DECLARANT','PARTNER','PRODUCT'],
columns=['PERIOD'],
values=['KG', 'VALUE'])
Upvotes: 1
Views: 389
Reputation: 862601
Use groupby
per columns and aggregate list
s:
df = (df.pivot_table(index=['DECLARANT','PARTNER','PRODUCT'],
columns=['PERIOD'],
values=['KG', 'VALUE'])
.groupby(axis=1, level=0)
.apply(lambda x: x.agg(list, 1)))
print (df)
KG VALUE
DECLARANT PARTNER PRODUCT
FR XC 1001 [20.0, 22.0] [10.0, 12.0]
1002 [21.0, 23.0] [11.0, 13.0]
1003 [nan, 24.0] [nan, 14.0]
Upvotes: 1