Reputation: 327
I want to pivot a dataframe like:
dim1 Value_V Value_y instance
0 A_1 50.000000 0 instance200
1 A_2 6500.000000 1 instance200
2 A_3 50.000000 0 instance200
3 A_4 4305.922313 1 instance200
Into a dataframe with hierarchical columns like that:
A_1 A_2 A_3 .....
Value_V Value_y Value_V Value_y Value_V Value_y
instance200 50 0 6500 1 50 0
I tried df = df.pivot(index = "instance", columns = "dim1")
, but it will only give me a frame like that:
Value_V Value_y
A_1 A_2 A_3 .... A_1 A_2 A_3 ....
instance200 50 6500 50 0 1 0
How can i change the hierarchy of the columns?
Upvotes: 4
Views: 1783
Reputation: 364
I'd like to add that the prior answers are outdated.
df = df.swaplevel(0, 1, axis=1) # Swaps level as desired
df.columns = df.columns.sortlevel(0)[0] # Orders level 0, slices to the cols
You can find more documentation on this method here: https://pandas.pydata.org/docs/reference/api/pandas.MultiIndex.sortlevel.html. It allows a user to directly sort individual levels of a pandas.MultiIndex
.
Upvotes: 0
Reputation: 11
I have been battling this problem for a long time. My job requires me to handle large pivot_tables, where there are several dozen indexes and a bit more values. The last, most convenient solution in terms of versatility is this:
def pivot_fix(df):
df = (df.reset_index().T.reset_index(level=0).T.reset_index(drop=True).
reset_index(drop=True).reset_index(drop=True).T.reset_index().T)
df.iloc[0, :df.iloc[0, :].isna().sum()] = df.iloc[1, :df.iloc[0, :].isna().sum()]
df.columns = df.iloc[0]
df.drop(df.index[0:2], inplace=True)
return(df)
using it like that: df = (df.pivot_table(index=['location_id', 'place_name', 'address'], columns='day', values='sum')
Upvotes: 0
Reputation: 20553
What you need is reorder_levels and then sort the columns, like this:
import pandas as pd
df = pd.read_clipboard()
df
Out[8]:
dim1 Value_V Value_y instance
0 A_1 50.000000 0 instance200
1 A_2 6500.000000 1 instance200
2 A_3 50.000000 0 instance200
3 A_4 4305.922313 1 instance200
In [9]:
df.pivot('instance', 'dim1').reorder_levels([1, 0], axis=1).sort(axis=1)
Out[9]:
dim1 A_1 A_2 A_3 A_4
Value_V Value_y Value_V Value_y Value_V Value_y Value_V Value_y
instance
instance200 50 0 6500 1 50 0 4305.922313 1
Upvotes: 3
Reputation: 327
I figured it out by myself:
df = df.swaplevel(0,1,axis = 1).sort(axis = 1)
will do
Upvotes: 5