James Eaves

Reputation: 1647

How to convert pandas columns into an index and a header when the index column has duplicates

I’d like to convert a dataframe, df, similar to this one:

PIDM            | COURSE          | GRADE
1               | MAT1            | B
1               | PHY2            | C
2               | MAT1            | A
2               | MAT2            | B
2               | PHE2            | A

to the following format:

PIDM     |  MAT1      | PHY2    |  MAT2  | PHE2
1        |    B       |    C    |  NaN   |   NaN
2        |    A       |    NaN  |  B     |   A

I was assuming I could do something like:

df2 = df.pivot(index='PIDM', columns='COURSE', values='GRADE')

but I receive an error stating that I have duplicate indices. Thank you for your help.

Upvotes: 0

Views: 244

Answers (1)

jezrael

Reputation: 862661

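You can use pivot_table with ', '.join as the aggregate function, so that duplicate (PIDM, COURSE) pairs are combined into one comma-separated string instead of raising an error: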

df2 = df.pivot_table(index='PIDM', columns='COURSE', values='GRADE', aggfunc=', '.join)
print(df2)
COURSE MAT1  MAT2  PHE2  PHY2
PIDM                         
1         B  None  None     C
2         A     B     A  None
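
The sample in the question has no repeated (PIDM, COURSE) pair, so here is a minimal self-contained sketch that adds one hypothetical duplicate row (student 1 taking MAT1 a second time with grade D, which is my assumption and not part of the original data). It shows pivot failing on the duplicate and pivot_table joining the two grades:

```python
import pandas as pd

# Data from the question, plus one hypothetical repeated (PIDM, COURSE) pair
# in the last row (assumption: student 1 took MAT1 twice).
df = pd.DataFrame({
    "PIDM":   [1, 1, 2, 2, 2, 1],
    "COURSE": ["MAT1", "PHY2", "MAT1", "MAT2", "PHE2", "MAT1"],
    "GRADE":  ["B", "C", "A", "B", "A", "D"],
})

# pivot cannot reshape when an (index, columns) pair occurs more than once.
try:
    df.pivot(index="PIDM", columns="COURSE", values="GRADE")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table aggregates the duplicates; ', '.join concatenates their grades.
df2 = df.pivot_table(index="PIDM", columns="COURSE", values="GRADE",
                     aggfunc=", ".join)
print(df2)
```

If you would rather keep a single grade per cell, aggfunc='first' or aggfunc='last' should work in place of the join, depending on which attempt you want to keep.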

Upvotes: 1
