Reputation: 532

Cannot pivot/unstack a column due to duplicated values that I need them to be there

I am facing a problem that I probably cannot resolve with unstack() or pivot().

I have a df looking like this:

     A      B     C
 0  dj   rock   4.5
 1  dj   rock     6
 2  cd   rock   3.2
 3  cd  opera   4.8
 4  cd  opera     7
 5  dj  opera     9

And I want to format it like this:

     A    rock   opera
 0  dj     4.5     nan
 1  dj       6     nan
 2  cd     3.2     nan
 3  cd     nan     4.8
 4  cd     nan       7
 5  dj     nan       9

If I try to pivot() I get the error of duplicate values.

Do you have any ideas to share?

Upvotes: 2

Answers (2)

not_speshal

Reputation: 23156

You could join the original df to the pivoted df:

>>> df.join(df.pivot(None,"B","C")).drop(["B", "C"], axis=1)
    A  opera  rock
0  dj    NaN   4.5
1  dj    NaN   6.0
2  cd    NaN   3.2
3  cd    4.8   NaN
4  cd    7.0   NaN
5  dj    9.0   NaN

Alternatively, use the cumulative count of column "A" as an additional key to create a unique index

>>> df.assign(D=df.groupby("A").cumcount()) \
      .pivot(["A", "D"], "B", "C") \
      .droplevel(1) \
      .reset_index() \
      .rename_axis(columns=None)

    A  opera  rock
0  cd    NaN   3.2
1  cd    4.8   NaN
2  cd    7.0   NaN
3  dj    NaN   4.5
4  dj    NaN   6.0
5  dj    9.0   NaN

Upvotes: 0

Vishnudev Krishnadas

Reputation: 10960

Use set_index with append=True and then unstack

df.set_index(['A', 'B'], append=True).unstack(2)

Output

         C     
B    opera rock
  A            
0 dj   NaN  4.5
1 dj   NaN  6.0
2 cd   NaN  3.2
3 cd   4.8  NaN
4 cd   7.0  NaN
5 dj   9.0  NaN

Upvotes: 1

Cannot pivot/unstack a column due to duplicated values that I need them to be there

Answers (2)

Related Questions