The Great
The Great

Reputation: 7693

transpose pandas columns with duplicate values

I have a data frame like as shown below

df1 = pd.DataFrame({'Gender':['Male','Male','Male','Male','Female','Female','Female','Female','Male','Male','Male','Male','Female','Female','Female','Female'],
                'Year' :[2008,2008,2009,2009,2008,2008,2009,2009,2008,2008,2009,2009,2008,2008,2009,2009],
           'rate':[2.3,3.2,4.5,6.7,5.6,3.2,3.5,2.6,2.3,3.2,4.5,6.7,5.6,3.2,3.5,2.6],
           'Heading':['TNMAB123','TNMAB123','TNMAB123','TNMAB123','TNMAB123','TNMAB123','TNMAB123','TNMAB123',
                     'TNMAB456','TNMAB456','TNMAB456','TNMAB456','TNMAB456','TNMAB456','TNMAB456','TNMAB456'],
           'target':[31.2,33.4,33.4,35.2,35.2,36.4,36.4,37.2,31.2,33.4,33.4,35.2,35.2,36.4,36.4,37.2],
            'day_type':['wk','wkend','wk','wkend','wk','wkend','wk','wkend','wk','wkend','wk','wkend','wk','wkend','wk','wkend']})

As you can see there are duplicate values in all columns.

I would like to transpose/pivot them to get an output as shown below. Though I tried the below it didn't work.

df1.pivot(index='Year', columns='Heading', values='rate')

I expect my output to be like as shown below where each year is made as a row and all the corresponding entries for that year are made as columns.

Please note I haven't filled in the values as table column structure is more important.

enter image description here

Can you help me, please?

Upvotes: 0

Views: 505

Answers (1)

Ch3steR
Ch3steR

Reputation: 20669

You can try this. You can use df.unstack() here and convert the Multi-Index to single level index using join.

df1 = df1.pivot_table(index=['Year','Gender'],columns='Heading',values='rate').unstack()

df1.columns = ['_'.join(i) for i in df1.columns.tolist()]

df1 
      TDAS3_Female  TDAS3_Male  TNMAB123_Female  TNMAB123_Male  TSAD4_Female  TSAD4_Male  TWQE2_Female  TWQE2_Male
Year
2008           NaN         NaN              6.3            2.3           NaN         NaN           NaN         NaN
2009           NaN         NaN              7.1            3.2           NaN         NaN           2.1         4.5
2010           5.3         5.6              NaN            NaN           NaN         NaN           4.2         6.7
2011           3.6         3.2              NaN            NaN           2.9         3.5           NaN         NaN
2012           NaN         NaN              NaN            NaN           6.2         2.6           NaN         NaN

There are couple of ways to convert Multi-Index to single level. Using df.colums or df.columns.tolist or pd.MultiIndex.to_flat_index

  • ['_'.join(i) for i in df1.columns.tolist()]
  • ['_'.join(i) for i in df1.columns]
  • ['_'.join(i) for i in df1.columns.to_flat_index()]

Upvotes: 1

Related Questions