sriman narayana
sriman narayana

Reputation: 359

Pandas reshaping the dataframe

I want to reshape the data frame

df = 
        cell_name   para1     para1_val   para2      para2_val
0      Cell_1        tch_block  0.45       sdch_block   0.78
1      cell_1        dcr        2.3                           
2      cell_2        tch_block  0.8        sdcch_block    0.9
3      cell_2        dcr        1.9   

And I want output like

           cell_name   parameter   parameter_value
0          cell_1       tch_block     0.45
1          cell_1       sdcch_block   0.78
2          cell_1       dcr            2.3
3          cell_2       tch_block     0.8
4          cell_2       sdcch_block   0.9
5          cell_2       dcr           1.9

Upvotes: 2

Views: 82

Answers (1)

Nader Hisham
Nader Hisham

Reputation: 5414

check the documentation for pivot_table , and melt

the idea here is that we will create a dataframe for para1 and para2 separately then concatenate the result at the end

In [11]:
df.cell_name = df.cell_name.str.lower()

calculate para1 dataframe for para1 column

In [16]:
para1_pivot = df.pivot_table(index='cell_name' , columns='para1' , values='para1_val').reset_index()
para1_pivot
Out[16]:
para1   cell_name   dcr tch_block
0       cell_1      2.3 0.45
1       cell_2      1.9 0.80

In [20]:
para1 = pd.melt(para1_pivot , id_vars= 'cell_name' ,  var_name='parameter' , value_name='parameter_value')
para1
Out[20]:
    cell_name   parameter   parameter_value
0   cell_1          dcr         2.30
1   cell_2          dcr         1.90
2   cell_1          tch_block   0.45
3   cell_2          tch_block   0.80

then do the same exactly with para2

In [19]:
para2_pivot = df.pivot_table(index='cell_name' , columns='para2' , values='para2_val').reset_index()
para2_pivot
Out[19]:
para2   cell_name   sdcch_block sdch_block
0       cell_1          NaN         0.78
1       cell_2          0.9         NaN

In [25]:
para2 = pd.melt(para2_pivot , id_vars= 'cell_name' ,  var_name='parameter' , value_name='parameter_value')
para2
Out[25]:
  cell_name  parameter  parameter_value
0   cell_1  sdcch_block     NaN
1   cell_2  sdcch_block     0.90
2   cell_1  sdch_block      0.78
3   cell_2  sdch_block      NaN

then concatenate the results

In [30]:
pd.concat([para1 , para2[para2.parameter_value.notnull()]])
Out[30]:
cell_name   parameter   parameter_value
0   cell_1  dcr         2.30
1   cell_2  dcr         1.90
2   cell_1  tch_block   0.45
3   cell_2  tch_block   0.80
1   cell_2  sdcch_block 0.90
2   cell_1  sdch_block  0.78

if you want to sort your final frame check sort_values

Upvotes: 2

Related Questions