Reputation: 2983
I have a DataFrame with outer keys, generated by the pandas concat function, which looks like this:
              ID    ratio    log_q
L-D  0    A5A614  2.51803  2.09644
     1    P00370  3.76811  5.92205
     2    P00393  1.74254  3.74875
     3  P00452-2  3.37144  6.13225
     4    P00547  3.06521  5.55512
     5    P00561  3.02943  5.58718
              ID    ratio    log_q
M-D  0    A5A614  2.51803  2.09644
     1    P00370  3.76811  5.92205
     2    P00393  1.74254  3.74875
     3  P00452-2  3.37144  6.13225
     4    P00547  3.06521  5.55512
     5    P00561  3.02943  5.58718
              ID    ratio    log_q
M3-D 0    A5A614  2.51803  2.09644
     1    P00370  3.76811  5.92205
     2    P00393  1.74254  3.74875
     3  P00452-2  3.37144  6.13225
     4    P00547  3.06521  5.55512
     5    P00561  3.02943  5.58718
I would like to use concat again to generate a new DataFrame that takes the ratio column for each key ('L-D', 'M-D', 'M3-D') and uses these keys as the names of the new columns.
In addition, the new DataFrame should be aligned on matching 'ID's, like this:
            L-D      M-D     M3-D
A5A614  2.51803  1.13223  2.64402
P00393  3.76811  1.97461  3.34965
P00547  1.74254  2.70024   2.3655
...
When I use
pd.concat([df.ix['L-D']['ratio'], df.ix['M-D']['ratio'], df.ix['M3-D']['ratio']],
axis=1, levels=("L-D","M-D","M3-D"))
or
pd.concat([df.ix['L-D']['ratio'], df.ix['M-D']['ratio'], df.ix['M3-D']['ratio']],
axis=1, names=("L-D","M-D","M3-D"))
I can create a data frame but the result looks like this:
     ratio    ratio    ratio
0  2.51803  1.13223  2.64402
1  3.76811  1.97461  3.34965
2  1.74254  2.70024   2.3655
Apparently the names/levels are not used, and the result keeps the numerical index instead of the 'ID'.
Upvotes: 1
Views: 57
Reputation: 862471
I think you need to pass the keys parameter to concat, not levels:
# drop the inner integer level (level=1) and append the ID column to the index
df = df.reset_index(level=1, drop=True).set_index('ID', append=True)
# .loc is used here because the .ix indexer was removed in pandas 1.0
print(pd.concat([df.loc['L-D']['ratio'], df.loc['M-D']['ratio'], df.loc['M3-D']['ratio']],
                axis=1,
                keys=["L-D", "M-D", "M3-D"]))
              L-D      M-D     M3-D
ID
A5A614    2.51803  2.51803  2.51803
P00370    3.76811  3.76811  3.76811
P00393    1.74254  1.74254  1.74254
P00452-2  3.37144  3.37144  3.37144
P00547    3.06521  3.06521  3.06521
P00561    3.02943  3.02943  3.02943
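For anyone who wants to try the keys approach on a recent pandas release, here is a minimal self-contained sketch. The sample frame is a hypothetical reconstruction of the first three rows from the question, the name by_id is mine, and xs stands in for the .ix lookups (which no longer exist in current pandas), so treat it as an illustration rather than the exact code that produced the output above.

```python
import pandas as pd

# Hypothetical reconstruction of the question's frame: three identical
# sub-frames stacked with concat, so the outer index level holds the keys.
part = pd.DataFrame({
    "ID": ["A5A614", "P00370", "P00393"],
    "ratio": [2.51803, 3.76811, 1.74254],
    "log_q": [2.09644, 5.92205, 3.74875],
})
keys = ["L-D", "M-D", "M3-D"]
df = pd.concat([part, part, part], keys=keys)

# Drop the inner integer level and put ID into the index instead,
# so that concat along axis=1 aligns rows on ID.
by_id = df.reset_index(level=1, drop=True).set_index("ID", append=True)

# Select each key's ratio Series (indexed by ID) and let keys= name the columns.
wide = pd.concat([by_id.xs(k)["ratio"] for k in keys], axis=1, keys=keys)
print(wide)
```

Because each selected Series is indexed by ID, concat aligns on ID, and IDs missing from one of the groups would show up as NaN in that column.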
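But I think it is better to use pd.pivot with get_level_values:
# run this on the original df, where ID is still a column;
# the array-based call form below is the one older pandas releases accepted
print(pd.pivot(index=df.ID, columns=df.index.get_level_values(0), values=df.ratio))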
              L-D      M-D     M3-D
ID
A5A614    2.51803  2.51803  2.51803
P00370    3.76811  3.76811  3.76811
P00393    1.74254  1.74254  1.74254
P00452-2  3.37144  3.37144  3.37144
P00547    3.06521  3.06521  3.06521
P00561    3.02943  3.02943  3.02943
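On current pandas, pivot expects column names rather than arrays, so the same idea can be sketched roughly as below. The sample frame is again a hypothetical three-row reconstruction, and the level names "key" and "row" are names I made up so the outer level can be turned into an ordinary column.

```python
import pandas as pd

# Hypothetical reconstruction of the question's frame (first three rows only).
part = pd.DataFrame({
    "ID": ["A5A614", "P00370", "P00393"],
    "ratio": [2.51803, 3.76811, 1.74254],
    "log_q": [2.09644, 5.92205, 3.74875],
})
keys = ["L-D", "M-D", "M3-D"]
df = pd.concat([part, part, part], keys=keys)

# Name the two index levels ("key" and "row" are illustrative names),
# then move the outer level into a regular column.
flat = df.rename_axis(["key", "row"]).reset_index(level="key")

# Reshape: one row per ID, one column per key, values taken from ratio.
wide = flat.pivot(index="ID", columns="key", values="ratio")
print(wide)
```

Note that pivot raises an error if the same ID appears more than once under one key; in that case pivot_table with an aggregation function would be the usual fallback.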
Upvotes: 1