Reputation: 494
I have dataset in the following format:
df = pd.DataFrame({'x':[1,2,3], 'y':[10,20,30], 'v1':[3,2,3] , 'v2':[13,25,31] })
>> v1 v2 x y
3 13 1 10
2 25 2 20
3 31 3 30
Setting the index column with x, I want to flatten the data combining v1 and v2 (V), The expected output is like:
>> x y V
1 10 3
1 10 13
2 20 2
2 20 25
3 30 3
3 30 31
And again bringing to the original format of df. I tried reshaping using stack and unstack, but I couldn't get it the way, which I was expecting.
Many Thanks!
Upvotes: 6
Views: 4753
Reputation: 879411
pd.lreshape
can reformat wide data to long format:
In [55]: pd.lreshape(df, {'V':['v1', 'v2']})
Out[57]:
x y V
0 1 10 3
1 2 20 2
2 3 30 3
3 1 10 13
4 2 20 25
5 3 30 31
lreshape
is an undocumented "experimental" feature. To learn more about lreshape
see help(pd.lreshape)
.
If you need reversible operations, use jezrael's pd.melt
solution to go from wide to long format, and use pivot_table
to go from long to wide format:
In [72]: melted = pd.melt(df, id_vars=['x', 'y'], value_name='V'); melted
Out[72]:
x y variable V
0 1 10 v1 3
1 2 20 v1 2
2 3 30 v1 3
3 1 10 v2 13
4 2 20 v2 25
5 3 30 v2 31
In [74]: df2 = melted.pivot_table(index=['x','y'], columns=['variable'], values='V').reset_index(); df2
Out[74]:
variable x y v1 v2
0 1 10 3 13
1 2 20 2 25
2 3 30 3 31
Notice that you must hang on to the variable
column if you wish to return to df2
. Also keep in mind that it is more efficient to simply retain a reference to df
than to recompute it using melted
and pivot_table
.
Upvotes: 5
Reputation: 862591
You can use stack
with set_index
. Last drop
column level_2
:
print (df.set_index(['x','y']).stack().reset_index(name='V').drop('level_2', axis=1))
x y V
0 1 10 3
1 1 10 13
2 2 20 2
3 2 20 25
4 3 30 3
5 3 30 31
Another solution with melt
and sort_values
:
print (pd.melt(df, id_vars=['x','y'], value_name='V')
.drop('variable', axis=1)
.sort_values('x'))
x y V
0 1 10 3
3 1 10 13
1 2 20 2
4 2 20 25
2 3 30 3
5 3 30 31
Upvotes: 3