How to pivot a dataframe to collapse multiple rows into one

Question

I have a dataframe that has 4 columns: UID, Date, Type, and Value as such:

UID Date        Type    Value
50  2020-12-01  3       15
50  2020-12-01  2       13
50  2020-12-01  1       50
135 2020-12-02  2       0
135 2020-12-02  1       12
50  2020-12-02  4       100
50  2020-12-02  2       25
50  2020-12-02  3       15  
50  2020-12-02  1       40

For a given user on a given date, each type that appears is unique (i.e there will never be two entries for Type X for UID Y on Day Z). Type is an integer between 1 and 4, inclusive.

I would like to transform this into a dataframe that has a column for each type, the value in the corresponding column, and reduces all rows to having a unique UID/Date pair as such, with missing type/value pairs as nan or 0:

UID Date        Type_1  Type_2  Type_3  Type_4
50  2020-12-01  50      13      15      nan
135 2020-12-02  12      0       nan     nan
50  2020-12-02  40      25      15      100

I've been tinkering with pivot but can't quite get it, any assistance would be much appreciated!

piterbarg · Accepted Answer

It is simply

df.pivot(index = ['UID','Date'], values = 'Value', columns = 'Type').add_prefix('Type_')

output

    Type        Type_1  Type_2  Type_3  Type_4
UID Date                
50  2020-12-01  50.0    13.0    15.0    NaN
    2020-12-02  40.0    25.0    15.0    100.0
135 2020-12-02  12.0    0.0     NaN     NaN

you can stick reset_index() at the end of the expression if you do not like those columns in the index

How to pivot a dataframe to collapse multiple rows into one

Answers (1)

Related Questions