An optimal way of getting a correct reshape of that dataframe

Question

I've the following dataframe:

df = 
    c   f   V   E
0   M   5  32  22
1   M   7  45  40
2   R   7  42  36
3   R   9  41  38
4   R   3  28  24

And I want a result like this, in which the values of column 'f' are my new columns, and my new indexes are a combination of column 'c' and the rest of columns in the dataframe (the order of rows doesn't matter):

df_result = 
        3   5    7    9
 V(M) NaN  32   45  NaN
 E(M) NaN  22   40  NaN
 V(R) 28  NaN   42   41
 E(R) 24  NaN   36   38

Currently, my code is:

df_result = pd.concat([df.pivot('c','f',col).rename(index = {e: col + '(' + e + ')' for e in df.pivot('c','f',col).index}) for col in [e for e in df.columns if e not in ['c','f']]])

With that code I'm getting:

df_result = 
f      3   5   7   9
c
E(M)  NaN  22  40  NaN
E(R)   24 NaN  36   38
V(M)  NaN  32  45  NaN
V(R)   28 NaN  42   41

I think it's a valid result, however, I don't know if there is a way to get exactly my desire result or, at least, a better way to get what I am already getting.

Thanks you very much in advance.

ALollz · Accepted Answer

To get the table, this is .melt + .pivot_table

df_result = df.melt(['f', 'c']).pivot_table(index=['variable', 'c'], columns='f')

Then we can clean up the naming:

df_result = df_result.rename_axis([None, None], 1)
df_result.columns = [y for _,y in df_result.columns]

df_result.index = [f'{x}({y})' for x,y in df_result.index]
# Python 2.:      ['{0}({1})'.format(*x) for x in df_result.index]

Output:

         3     5     7     9
E(M)   NaN  22.0  40.0   NaN
E(R)  24.0   NaN  36.0  38.0
V(M)   NaN  32.0  45.0   NaN
V(R)  28.0   NaN  42.0  41.0

You might consider keeping the MultiIndex instead of flattening to new strings, as it can be simpler for certain aggregations.

An optimal way of getting a correct reshape of that dataframe

Answers (2)

Output:

Related Questions