Reputation: 1647
I’d like to convert a dataframe, df, similar to this one:
PIDM | COURSE | GRADE
1 | MAT1 | B
1 | PHY2 | C
2 | MAT1 | A
2 | MAT2 | B
2 | PHE2 | A
to the following format:
PIDM | MAT1 | PHY2 | MAT2 | PHE2
1 | B | C | NaN | NaN
2 | A | NaN | B | A
I was assuming I could do something like:
df2 = df.pivot(index='PIDM', columns='COURSE', values='GRADE')
but I receive an error stating that I have duplicate indices. Thank you for your help.
Upvotes: 0
Views: 244
Reputation: 862661
You can use pivot_table with the aggregate function join:
df2 = df.pivot_table(index='PIDM', columns='COURSE', values = 'GRADE', aggfunc=', '.join)
print (df2)
COURSE MAT1  MAT2  PHE2  PHY2
PIDM
1         B  None  None     C
2         A     B     A  None
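For reference, here is a minimal self-contained sketch of why pivot fails and pivot_table does not. The sample in the question has no repeated (PIDM, COURSE) pair, so I am assuming the real data does; the extra row below (PIDM 1 taking MAT1 a second time with grade D) is hypothetical and only there to reproduce the error.

```python
import pandas as pd

# Data from the question, plus one hypothetical duplicate row:
# PIDM 1 appears twice for MAT1 (assumption, to trigger the error).
df = pd.DataFrame({
    "PIDM":   [1, 1, 2, 2, 2, 1],
    "COURSE": ["MAT1", "PHY2", "MAT1", "MAT2", "PHE2", "MAT1"],
    "GRADE":  ["B", "C", "A", "B", "A", "D"],
})

# pivot cannot decide which grade to keep for (1, MAT1), so it raises.
try:
    df.pivot(index="PIDM", columns="COURSE", values="GRADE")
    pivot_failed = False
except ValueError as err:
    pivot_failed = True
    print("pivot failed:", err)

# pivot_table aggregates the duplicates instead; here the grades
# for the same (PIDM, COURSE) pair are joined into one string.
df2 = df.pivot_table(index="PIDM", columns="COURSE",
                     values="GRADE", aggfunc=", ".join)
print(df2)
```

If you would rather keep a single grade per course than a joined string, aggfunc='first' or aggfunc='last' should work in the same place.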
Upvotes: 1