I have a DataFrame and I want to transpose it.
import pandas as pd
df = pd.DataFrame({'ID':[111,111,222,222,333,333],'Month':['Jan','Feb','Jan','Feb','Jan','Feb'],
'Employees':[2,3,1,5,7,1],'Subsidy':[20,30,10,15,40,5]})
print(df)
    ID Month  Employees  Subsidy
0  111   Jan          2       20
1  111   Feb          3       30
2  222   Jan          1       10
3  222   Feb          5       15
4  333   Jan          7       40
5  333   Feb          1        5
Desired output:
    ID        Var  Jan  Feb
0  111  Employees    2    3
1  111    Subsidy   20   30
0  222  Employees    1    5
1  222    Subsidy   10   15
0  333  Employees    7    1
1  333    Subsidy   40    5
My attempt: I tried using pivot_table(), but both Employees and Subsidy naturally appear in the same row, whereas I want them on separate rows.
df.pivot_table(index=['ID'],columns='Month',values=['Employees','Subsidy'])
      Employees     Subsidy
Month       Feb Jan     Feb Jan
ID
111           3   2      30  20
222           5   1      15  10
333           1   7       5  40
I also tried transpose(), but it transposes the entire DataFrame; there seems to be no way to transpose while keeping one column (here ID) fixed. Any suggestions?
You were on the right track with your pivot_table approach. You only missed stack and reset_index:
df.pivot_table(index=['ID'],columns='Month',values=['Employees','Subsidy']).stack(0).reset_index()
Out[42]:
Month   ID    level_1  Feb  Jan
0      111  Employees    3    2
1      111    Subsidy   30   20
2      222  Employees    5    1
3      222    Subsidy   15   10
4      333  Employees    1    7
5      333    Subsidy    5   40
You can rename the level_1 column to Var afterwards if needed.
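A minimal sketch of that renaming step, assuming the stacked level comes out as level_1 as in the output above. The variable name `out` is only illustrative, and the rename_axis call is an optional extra that drops the leftover Month label from the column axis:

```python
import pandas as pd

df = pd.DataFrame({'ID': [111, 111, 222, 222, 333, 333],
                   'Month': ['Jan', 'Feb', 'Jan', 'Feb', 'Jan', 'Feb'],
                   'Employees': [2, 3, 1, 5, 7, 1],
                   'Subsidy': [20, 30, 10, 15, 40, 5]})

out = (df.pivot_table(index=['ID'], columns='Month',
                      values=['Employees', 'Subsidy'])
         .stack(0)                            # move Employees/Subsidy into the rows
         .reset_index()                       # ID and the unnamed level become columns
         .rename(columns={'level_1': 'Var'})  # assumes the default name level_1
         .rename_axis(None, axis=1))          # optional: drop the 'Month' axis label
print(out)
```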
You can add DataFrame.rename_axis to name the first column level Var after pivoting, and pass None for the second level so that Month does not appear as a column-axis name in the final DataFrame. DataFrame.stack then moves the first column level into the rows, and DataFrame.reset_index converts the resulting MultiIndex into columns:
df2 = (df.pivot_table(index='ID',
                      columns='Month',
                      values=['Employees','Subsidy'])
         .rename_axis(['Var',None], axis=1)
         .stack(level=0)
         .reset_index()
       )
print(df2)
    ID        Var  Feb  Jan
0  111  Employees    3    2
1  111    Subsidy   30   20
2  222  Employees    5    1
3  222    Subsidy   15   10
4  333  Employees    1    7
5  333    Subsidy    5   40
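The month columns come out as Feb, Jan because the pivoted column labels are sorted alphabetically, while the desired output in the question lists Jan first. One way to get that order, sketched here as an extra step that was not part of the answer above, is to select the columns explicitly at the end (`result` is an illustrative name):

```python
import pandas as pd

df = pd.DataFrame({'ID': [111, 111, 222, 222, 333, 333],
                   'Month': ['Jan', 'Feb', 'Jan', 'Feb', 'Jan', 'Feb'],
                   'Employees': [2, 3, 1, 5, 7, 1],
                   'Subsidy': [20, 30, 10, 15, 40, 5]})

df2 = (df.pivot_table(index='ID',
                      columns='Month',
                      values=['Employees', 'Subsidy'])
         .rename_axis(['Var', None], axis=1)
         .stack(level=0)
         .reset_index())

# Put the month columns in calendar order instead of alphabetical order.
result = df2[['ID', 'Var', 'Jan', 'Feb']]
print(result)
```

With more months, the same selection could be built from a list of month names in calendar order.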