Pivot and Groupby with Pandas

Question

I have the following dataframe:

DECLARANT   PARTNER PRODUCT PERIOD  VALUE   KG
FR          XC      1001    200501  10      20
FR          XC      1002    200501  11      21
FR          XC      1001    200502  12      22
FR          XC      1002    200502  13      23
FR          XC      1003    200502  14      24

And I want it transformed into something like this:

DECLARANT   PARTNER     PRODUCT  VALUE      KG
FR          XC          1001     [10,12]    [20,22]
FR          XC          1002     [11,13]    [21,23]
FR          XC          1003     [nan,14]   [nan,24]

That is, the VALUE and KG is pivoted based on PERIOD, and if it doesn't exist in the PERIOD it should be replaced with nan.

I've tried pivot and groupby. But don't seem to get them to work together. This kind-of gives me something but not really what I'm looking for:

df.pivot_table(index=['DECLARANT','PARTNER','PRODUCT'],
                                      columns=['PERIOD'],
                                      values=['KG', 'VALUE'])

jezrael · Accepted Answer

Use groupby per columns and aggregate lists:

df = (df.pivot_table(index=['DECLARANT','PARTNER','PRODUCT'],
                                      columns=['PERIOD'],
                                      values=['KG', 'VALUE'])
         .groupby(axis=1, level=0)
         .apply(lambda x: x.agg(list, 1)))
print (df)
                                     KG         VALUE
DECLARANT PARTNER PRODUCT                            
FR        XC      1001     [20.0, 22.0]  [10.0, 12.0]
                  1002     [21.0, 23.0]  [11.0, 13.0]
                  1003      [nan, 24.0]   [nan, 14.0]

Pivot and Groupby with Pandas

Answers (1)

Related Questions