Pivoting a Pandas dataframe while deduplicating additional columns

Question

Similar to the documentation example, I want to pivot the following dataframe:

  foo extra bar  baz
0 one     x   A    1
1 one     x   B    2
2 one     x   C    3
3 two     y   A    4
4 two     y   B    5
5 two     y   C    6

The result should be

     extra A  B  C

one      x 1  2  3
two      y 4  5  6

Can this be done in a shorter way than the following?

1. splitting the extra column off before pivoting
2. deduplicating it separately
3. merging it back to the pivoted data

(I expected the pivot command to be able to do this, but my attempts failed.)

Here's the code to build the dataframe, if you want to experiment with it:

import pandas as pd

df = pd.DataFrame({'foo': ['one','one','one','two','two','two'],
                   'extra': ['x','x','x','y','y','y'],
                   'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'baz': [1, 2, 3, 4, 5, 6]})
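
For reference, the longer three-step route described above could look roughly like the sketch below. This is only one possible reading of those steps; the variable names pivoted, extra and result are made up for illustration, and it assumes each foo value maps to exactly one extra value.

```python
import pandas as pd

df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two', 'two'],
                   'extra': ['x', 'x', 'x', 'y', 'y', 'y'],
                   'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'baz': [1, 2, 3, 4, 5, 6]})

# Step 1: pivot without the extra column.
pivoted = df.pivot(index='foo', columns='bar', values='baz')

# Step 2: deduplicate the extra column separately (one row per foo).
# Assumption: every foo value has a single extra value.
extra = df[['foo', 'extra']].drop_duplicates().set_index('foo')

# Step 3: merge it back onto the pivoted data via the shared foo index.
result = extra.join(pivoted)
print(result)
```

It works, but it takes three statements and an intermediate frame, which is what the question is trying to avoid.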

Zero · Accepted Answer

Use set_index and unstack, then reset_index to turn foo and extra back into regular columns:

In [2087]: df.set_index(['foo', 'extra', 'bar'])['baz'].unstack().reset_index()
Out[2087]:
bar  foo extra  A  B  C
0    one     x  1  2  3
1    two     y  4  5  6
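
To see why the one-liner works, it may help to run it step by step. The sketch below splits the chain into intermediate variables (indexed, wide, result and alt are names chosen here for illustration, not part of the answer). It also shows a pivot_table variant, which is a different technique from the answer: as far as I know it accepts a list for index, and with aggfunc='first' it tolerates duplicate (foo, extra, bar) combinations that would make unstack raise an error.

```python
import pandas as pd

df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two', 'two'],
                   'extra': ['x', 'x', 'x', 'y', 'y', 'y'],
                   'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'baz': [1, 2, 3, 4, 5, 6]})

# Step 1: move the identifying columns into a MultiIndex and keep only baz.
indexed = df.set_index(['foo', 'extra', 'bar'])['baz']

# Step 2: unstack the innermost index level (bar) into columns.
wide = indexed.unstack()

# Step 3: turn foo and extra back into ordinary columns.
result = wide.reset_index()

# Optional: drop the leftover "bar" label on the column axis.
result.columns.name = None
print(result)

# Alternative sketch (not the answer's method): pivot_table with a list index.
# aggfunc='first' keeps the first value if a combination occurs more than once.
alt = df.pivot_table(index=['foo', 'extra'], columns='bar',
                     values='baz', aggfunc='first').reset_index()
alt.columns.name = None
print(alt)
```

Because extra is constant within each foo group, adding it to the index does not create extra rows; it simply rides along with foo.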
