deepbrook

Reputation: 2656

Reshaping a table - entries in column to new columns

Given the following data:

        Symbol        Date      Type         Value
518         ZW  2008-01-02        cm  1.204330e+09
519         ZW  2008-01-02   cm_next  1.209600e+09
520         ZW  2008-01-02       p&l  0.000000e+00
521         ZW  2008-01-02  position  0.000000e+00
522         ZW  2008-01-02  rolldate  1.203466e+09
523         ZW  2008-01-02     value  3.114788e+04
524         ZW  2008-01-02      vola  6.256606e+02
1046        ZW  2008-01-03        cm  1.204330e+09
1047        ZW  2008-01-03   cm_next  1.209600e+09
1048        ZW  2008-01-03       p&l  0.000000e+00
1049        ZW  2008-01-03  position  0.000000e+00
1050        ZW  2008-01-03  rolldate  1.203466e+09
1051        ZW  2008-01-03     value  3.202738e+04
1052        ZW  2008-01-03      vola  6.338274e+02
1574        ZW  2008-01-04        cm  1.204330e+09
1575        ZW  2008-01-04   cm_next  1.209600e+09
1576        ZW  2008-01-04       p&l  0.000000e+00
1577        ZW  2008-01-04  position  0.000000e+00
1578        ZW  2008-01-04  rolldate  1.203466e+09
1579        ZW  2008-01-04     value  3.162559e+04
1580        ZW  2008-01-04      vola  6.357563e+02
2102        ZW  2008-01-07        cm  1.204330e+09
2103        ZW  2008-01-07   cm_next  1.209600e+09
2104        ZW  2008-01-07       p&l  0.000000e+00
2105        ZW  2008-01-07  position  0.000000e+00
2106        ZW  2008-01-07  rolldate  1.203466e+09
2107        ZW  2008-01-07     value  3.066630e+04
2108        ZW  2008-01-07      vola  6.381839e+02

I want to reshape this table to the following format:

Symbol | Date | cm | cm_next | rolldate | p&l | position | [etc..]

That is, each type should become its own column, holding the respective value for each date.

I've tried df.pivot() and df.unstack(), but from what I understand, neither does quite what I'm looking for.

I could extract the data for each type in the Type column and glue it back together, but this seems like a rather primitive approach. Is there a better, more idiomatic pandas way to achieve this?

Upvotes: 1

Views: 22

Answers (1)

jezrael

Reputation: 862671

I think you need pivot_table, followed by rename_axis (new in pandas 0.18.0) and reset_index. Note that pivot_table aggregates duplicate Symbol/Date/Type entries, by default with the mean (aggfunc=np.mean):

print(df.pivot_table(index=['Symbol','Date'], columns='Type', values='Value')
        .rename_axis(None, axis=1)
        .reset_index())

  Symbol        Date            cm       cm_next  p&l  position      rolldate  \
0     ZW  2008-01-02  1.204330e+09  1.209600e+09  0.0       0.0  1.203466e+09   
1     ZW  2008-01-03  1.204330e+09  1.209600e+09  0.0       0.0  1.203466e+09   
2     ZW  2008-01-04  1.204330e+09  1.209600e+09  0.0       0.0  1.203466e+09   
3     ZW  2008-01-07  1.204330e+09  1.209600e+09  0.0       0.0  1.203466e+09   

      value      vola  
0  31147.88  625.6606  
1  32027.38  633.8274  
2  31625.59  635.7563  
3  30666.30  638.1839  
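
For anyone who wants to try this without the original file, here is a minimal, self-contained sketch. The small DataFrame is a stand-in I built from a few rows of the question's data (two dates, three types), and the names wide and wide_unstacked are only illustrative. The second variant uses set_index plus unstack instead of pivot_table; it is an alternative I'm suggesting, not part of the approach above, and it assumes each Symbol/Date/Type combination occurs only once.

```python
import pandas as pd

# Stand-in for the question's data: two dates, three types (assumed subset).
df = pd.DataFrame({
    "Symbol": ["ZW"] * 6,
    "Date": ["2008-01-02"] * 3 + ["2008-01-03"] * 3,
    "Type": ["cm", "value", "vola"] * 2,
    "Value": [1.204330e+09, 3.114788e+04, 6.256606e+02,
              1.204330e+09, 3.202738e+04, 6.338274e+02],
})

# Long to wide: one column per Type, one row per (Symbol, Date).
# pivot_table averages duplicates by default; there are none here.
wide = (
    df.pivot_table(index=["Symbol", "Date"], columns="Type", values="Value")
      .rename_axis(None, axis=1)   # drop the "Type" label on the columns axis
      .reset_index()               # turn Symbol and Date back into columns
)

# Alternative without aggregation: unstack raises an error on duplicate
# (Symbol, Date, Type) combinations instead of silently averaging them.
wide_unstacked = (
    df.set_index(["Symbol", "Date", "Type"])["Value"]
      .unstack("Type")
      .rename_axis(None, axis=1)
      .reset_index()
)

print(wide)
```

If your data can contain duplicate Symbol/Date/Type rows, the pivot_table version will average them quietly, so the unstack version may be the safer check when you expect exactly one value per combination.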

Upvotes: 1
