Reputation: 1493
I have a multiindex dataframe with two dates in the index. Each date combination has values for the columns A, B, C and D.
tradedate            deliverydate         A      B       C       D
2017-09-15 00:00:00  2017-09-11 00:00:00  31.84  27.61   32.3    46.57
                     2017-09-18 00:00:00  39     41.33   42.13   51.655
                     2017-09-25 00:00:00  39.75  40.5    42.89   56.135
                     2017-10-02 00:00:00  41.25  37.85   43.375  54.725
                     2017-10-09 00:00:00  46     40.72   47.875  54.475
2017-09-18 00:00:00  2017-09-11 00:00:00  32.04  28.94   34.18   49.295
                     2017-09-18 00:00:00  40.2   41.615  42.945  50.71
                     2017-09-25 00:00:00  40     39.55   41.815  54.125
                     2017-10-02 00:00:00  41.75  37.265  43.99   52.975
                     2017-10-09 00:00:00  44.75  40.615  48.5    54.285
                     2017-10-16 00:00:00  51.12  42.875  52.625  54.475
I would like to flatten the multiindex in two steps: first replace the deliverydate (the second index level) with its position within each tradedate, then create one column per combination of column name and position.
The positions would look like this:
tradedate            position  A      B       C       D
2017-09-15 00:00:00  0         31.84  27.61   32.3    46.57
                     1         39     41.33   42.13   51.655
                     2         39.75  40.5    42.89   56.135
                     3         41.25  37.85   43.375  54.725
                     4         46     40.72   47.875  54.475
2017-09-18 00:00:00  0         32.04  28.94   34.18   49.295
                     1         40.2   41.615  42.945  50.71
                     2         40     39.55   41.815  54.125
                     3         41.75  37.265  43.99   52.975
                     4         44.75  40.615  48.5    54.285
                     5         51.12  42.875  52.625  54.475
The final dataframe should have no multiindex and look like this:
tradedate A_0 A_1 A_2 A_3 A_4 A_5 B_0 … D_4 D_5
2017-09-15 00:00:00 31.84 39 39.75 41.25 46 - 27.61 … 54.475
2017-09-18 00:00:00 32.04 40.2 40 41.75 44.75 51.12 28.94 … 54.285 54.475
Can someone help me with these transformations?
Upvotes: 1
Views: 88
Reputation: 150765
This would do:
new_df = (df.reset_index(level=1, drop=True)
            .set_index(df.groupby(level=0).cumcount(), append=True)  # this is your step 1
            .unstack(level=1)
         )
# rename columns
new_df.columns = [f'{x}_{y}' for x, y in new_df.columns]
# reset_index
new_df = new_df.reset_index()
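To see what "step 1" produces on its own, here is a minimal sketch. The small frame is a hypothetical miniature of the question's data (string dates instead of timestamps, only column A); `groupby(...).cumcount()` numbers the rows within each tradedate starting at 0, which is the position the question asks for.

```python
import pandas as pd

# Hypothetical miniature of the question's frame: two trade dates,
# with two and three delivery dates respectively (dates kept as strings).
idx = pd.MultiIndex.from_tuples(
    [
        ("2017-09-15", "2017-09-11"),
        ("2017-09-15", "2017-09-18"),
        ("2017-09-18", "2017-09-11"),
        ("2017-09-18", "2017-09-18"),
        ("2017-09-18", "2017-09-25"),
    ],
    names=["tradedate", "deliverydate"],
)
df = pd.DataFrame({"A": [31.84, 39.0, 32.04, 40.2, 40.0]}, index=idx)

# cumcount numbers the rows within each tradedate group, starting at 0.
position = df.groupby(level="tradedate").cumcount()
print(position.tolist())  # [0, 1, 0, 1, 2]
```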
Sample data:
import pandas as pd

df = (pd.DataFrame({'a': ['x']*4 + ['y']*3,
                    'b': [8, 8, 8, 9, 7, 7, 7],
                    'A': [1, 2, 3, 4, 5, 6, 7],
                    'B': [7, 6, 5, 4, 3, 2, 1]})
        .set_index(['a', 'b'])
     )
Output:
   a  A_0  A_1  A_2  A_3  B_0  B_1  B_2  B_3
0  x  1.0  2.0  3.0  4.0  7.0  6.0  5.0  4.0
1  y  5.0  6.0  7.0  NaN  3.0  2.0  1.0  NaN
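For completeness, the same idea can be sketched against data shaped like the question's, using the level names instead of level numbers. This is only an illustration: the frame holds a few of the question's A and B values, the dates are plain strings, and the `position` column name is my own choice. The position is added as an ordinary column with `assign` before it is moved into the index, which is a stylistic variant of the chain above rather than a different technique.

```python
import pandas as pd

# A few rows taken from the question (columns A and B only, dates as strings).
idx = pd.MultiIndex.from_tuples(
    [
        ("2017-09-15", "2017-09-11"),
        ("2017-09-15", "2017-09-18"),
        ("2017-09-18", "2017-09-11"),
        ("2017-09-18", "2017-09-18"),
        ("2017-09-18", "2017-09-25"),
    ],
    names=["tradedate", "deliverydate"],
)
df = pd.DataFrame(
    {
        "A": [31.84, 39.0, 32.04, 40.2, 40.0],
        "B": [27.61, 41.33, 28.94, 41.615, 39.55],
    },
    index=idx,
)

flat = (
    # Step 1: position of each row within its tradedate (0, 1, 2, ...).
    df.assign(position=df.groupby(level="tradedate").cumcount())
      # Swap deliverydate for position in the index.
      .reset_index(level="deliverydate", drop=True)
      .set_index("position", append=True)
      # Step 2: move position into the columns.
      .unstack("position")
)

# Flatten the (column, position) pairs into names such as A_0, A_1, ...
flat.columns = [f"{col}_{pos}" for col, pos in flat.columns]
flat = flat.reset_index()
print(flat)
```

Trade dates with fewer delivery dates end up with NaN in the trailing columns (here A_2 and B_2 for 2017-09-15), matching the gap shown in the question's desired output.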
Upvotes: 1