oldselflearner1959
oldselflearner1959

Reputation: 653

How do I pivot my dataframe multiple times in pandas while creating a new column merging multiple columns?

I find this a rather complex challenge, as I need to merge multiple columns in a dataframe together and then pivot the table multiple times (I think).

So the provided input is this dataframe which I have filtered out:

                       name      X      Y
0      Mathematics House AB  0.123  0.111
2111   Physical Science Hut  0.124  0.112
9412   Primary Education LO  0.125  0.113
1234   Tertiary Universitas  0.126  0.114
12411  Principle of Physics  0.127  0.115
12373  Chemical Industry A1  0.128  0.116

and the output should be something that looks like this:

                  label  Z      
   Mathematics House AB  X  0.123
                         Y  0.111
   Physical Science Hut  X  0.124
                         Y  0.112
   Primary Education LO  X  0.125
                         Y  0.113
   Tertiary Universitas  X  0.126
                         Y  0.114
   Principle of Physics  X  0.127
                         Y  0.115
   Chemical Industry A1  X  0.128
                         Y  0.116

Where Z stands for a new column that has yet to be created. I'm currently using a very hacky technique that takes some columns as numpy arrays and try to reconstruct it. And the results aren't pretty and what is expected. Is there a way to manipulate the dataframe directly without using numpy? It seems like a tool for pivoting multiple times. The current method I have is df.pivot(index='name', columns='Z').T.unstack().T, where I previously let df['Z'] = '' -- very ugly and hacky, indeed, and it doesn't get what I want to present in.

Upvotes: 1

Views: 45

Answers (1)

BENY
BENY

Reputation: 323316

This is stack not pivot

df.set_index('name').stack()
Out[186]: 
name                  
MathematicsHouseAB   X    0.123
                     Y    0.111
PhysicalScienceHut   X    0.124
                     Y    0.112
PrimaryEducationLO   X    0.125
                     Y    0.113
TertiaryUniversitas  X    0.126
                     Y    0.114
PrincipleofPhysics   X    0.127
                     Y    0.115
ChemicalIndustryA1   X    0.128
                     Y    0.116
dtype: float64

EDIT:

df=df.set_index('name').stack()
df.index.names=['name', 'Z']
df
Out[263]: 
                           0
name                Z       
MathematicsHouseAB  X  0.123
                    Y  0.111
PhysicalScienceHut  X  0.124
                    Y  0.112
PrimaryEducationLO  X  0.125
                    Y  0.113
TertiaryUniversitas X  0.126
                    Y  0.114
PrincipleofPhysics  X  0.127
                    Y  0.115
ChemicalIndustryA1  X  0.128
                    Y  0.116

Upvotes: 1

Related Questions