AugBar
AugBar

Reputation: 457

DataFrame flattening to columns

Given a pandas dataframe with 2 columns and 5 entries:

    A       B
1  0.1     0.01 
2  0.2     0.02
3  0.3     0.03
4  0.4     0.04
5  0.5     0.05

Is there a way to 'project' the index dimension over the columns ? By that I mean an operation which would result in a serie with labels A_1, A_2, ... A_5, B_1, ... B_5.

         vals
A_1       0.1
A_2       0.2
A_3       0.3
A_4       0.4
A_5       0.5
B_1      0.01
B_2      0.02
B_3      0.03
B_4      0.04
B_5      0.05

I have found a way to do that iteratively but I am pretty sure pandas has a function for it.

Upvotes: 0

Views: 45

Answers (1)

bunji
bunji

Reputation: 5213

df.stack() will get you most of the way there but you'll need to do some index manipulation if you want to get those exact indices ('A_1', 'A_2', etc.)

s = df.stack()

this gives the series:

0  A    0.998446
   B    0.847224
1  A    0.776747
   B    0.863142
2  A    0.343800
   B    0.056423
3  A    0.194158
   B    0.178408
4  A    0.488775
   B    0.019145

You can then get the values from the multiindex (as tuples) and get them into the format you want as follows:

new_index = ['_'.join((i[1], str(i[0]+1))) for i in  s.index.get_values()]

This flips the first and second values in each tuple, adds 1 to the integer (since indices normally start at 0) and joins the tuple with an underscore.

Then you can assign this list as the new index and sort the series by its index to get the ordering you specified.

s.index = new_index  
s = s.sort_index()

Which results in the series s:

A_1    0.998446
A_2    0.776747
A_3    0.343800
A_4    0.194158
A_5    0.488775
B_1    0.847224
B_2    0.863142
B_3    0.056423
B_4    0.178408
B_5    0.019145

Upvotes: 2

Related Questions