Á. Garzón
Á. Garzón

Reputation: 365

An optimal way of getting a correct reshape of that dataframe

I've the following dataframe:

df = 
    c   f   V   E
0   M   5  32  22
1   M   7  45  40
2   R   7  42  36
3   R   9  41  38
4   R   3  28  24

And I want a result like this, in which the values ​​of column 'f' are my new columns, and my new indexes are a combination of column 'c' and the rest of columns in the dataframe (the order of rows doesn't matter):

df_result = 
        3   5    7    9
 V(M) NaN  32   45  NaN
 E(M) NaN  22   40  NaN
 V(R) 28  NaN   42   41
 E(R) 24  NaN   36   38

Currently, my code is:

df_result = pd.concat([df.pivot('c','f',col).rename(index = {e: col + '(' + e + ')' for e in df.pivot('c','f',col).index}) for col in [e for e in df.columns if e not in ['c','f']]])

With that code I'm getting:

df_result = 
f      3   5   7   9
c
E(M)  NaN  22  40  NaN
E(R)   24 NaN  36   38
V(M)  NaN  32  45  NaN
V(R)   28 NaN  42   41

I think it's a valid result, however, I don't know if there is a way to get exactly my desire result or, at least, a better way to get what I am already getting.

Thanks you very much in advance.

Upvotes: 1

Views: 36

Answers (2)

BENY
BENY

Reputation: 323326

Check with pivot_table

s=pd.pivot_table(df,index='c',columns='f',values=['V','E']).stack(level=0).sort_index(level=1)
s.index=s.index.map('{0[1]}({0[0]})'.format)
s
Out[95]: 
f        3     5     7     9
E(M)   NaN  22.0  40.0   NaN
E(R)  24.0   NaN  36.0  38.0
V(M)   NaN  32.0  45.0   NaN
V(R)  28.0   NaN  42.0  41.0

Upvotes: 2

ALollz
ALollz

Reputation: 59579

To get the table, this is .melt + .pivot_table

df_result = df.melt(['f', 'c']).pivot_table(index=['variable', 'c'], columns='f')

Then we can clean up the naming:

df_result = df_result.rename_axis([None, None], 1)
df_result.columns = [y for _,y in df_result.columns]

df_result.index = [f'{x}({y})' for x,y in df_result.index]
# Python 2.:      ['{0}({1})'.format(*x) for x in df_result.index]

Output:

         3     5     7     9
E(M)   NaN  22.0  40.0   NaN
E(R)  24.0   NaN  36.0  38.0
V(M)   NaN  32.0  45.0   NaN
V(R)  28.0   NaN  42.0  41.0

You might consider keeping the MultiIndex instead of flattening to new strings, as it can be simpler for certain aggregations.

Upvotes: 3

Related Questions