Reputation: 7693
I have a data frame like as shown below
df1 = pd.DataFrame({'Gender':['Male','Male','Male','Male','Female','Female','Female','Female','Male','Male','Male','Male','Female','Female','Female','Female'],
'Year' :[2008,2008,2009,2009,2008,2008,2009,2009,2008,2008,2009,2009,2008,2008,2009,2009],
'rate':[2.3,3.2,4.5,6.7,5.6,3.2,3.5,2.6,2.3,3.2,4.5,6.7,5.6,3.2,3.5,2.6],
'Heading':['TNMAB123','TNMAB123','TNMAB123','TNMAB123','TNMAB123','TNMAB123','TNMAB123','TNMAB123',
'TNMAB456','TNMAB456','TNMAB456','TNMAB456','TNMAB456','TNMAB456','TNMAB456','TNMAB456'],
'target':[31.2,33.4,33.4,35.2,35.2,36.4,36.4,37.2,31.2,33.4,33.4,35.2,35.2,36.4,36.4,37.2],
'day_type':['wk','wkend','wk','wkend','wk','wkend','wk','wkend','wk','wkend','wk','wkend','wk','wkend','wk','wkend']})
As you can see there are duplicate values in all columns.
I would like to transpose/pivot them to get an output as shown below. Though I tried the below it didn't work.
df1.pivot(index='Year', columns='Heading', values='rate')
I expect my output to be like as shown below where each year is made as a row and all the corresponding entries for that year are made as columns.
Please note I haven't filled in the values as table column structure is more important.
Can you help me, please?
Upvotes: 0
Views: 505
Reputation: 20669
You can try this. You can use df.unstack()
here and convert the Multi-Index to single level index using join
.
df1 = df1.pivot_table(index=['Year','Gender'],columns='Heading',values='rate').unstack()
df1.columns = ['_'.join(i) for i in df1.columns.tolist()]
df1
TDAS3_Female TDAS3_Male TNMAB123_Female TNMAB123_Male TSAD4_Female TSAD4_Male TWQE2_Female TWQE2_Male
Year
2008 NaN NaN 6.3 2.3 NaN NaN NaN NaN
2009 NaN NaN 7.1 3.2 NaN NaN 2.1 4.5
2010 5.3 5.6 NaN NaN NaN NaN 4.2 6.7
2011 3.6 3.2 NaN NaN 2.9 3.5 NaN NaN
2012 NaN NaN NaN NaN 6.2 2.6 NaN NaN
There are couple of ways to convert Multi-Index to single level. Using df.colums
or df.columns.tolist
or pd.MultiIndex.to_flat_index
['_'.join(i) for i in df1.columns.tolist()]
['_'.join(i) for i in df1.columns]
['_'.join(i) for i in df1.columns.to_flat_index()]
Upvotes: 1