Reputation: 161
I've been struggling to find a way to reshape my dataframe to my liking. I'm fairly new to Python and not familiar with all the methods of the dataframe. Particularly pivoting. I've read through the docs multiple times and still haven't found a solution.
(data below is random)
My original data pulled into a dataframe looks like this:
ShellSurface S1 S2 ElementHID sx sy sz sxy
0 1 88.340153 -88.340153 144 0 0 0 15.225413
1 1 66.370153 -66.370153 144 0 0 0 21.447455
2 1 74.422513 -74.422513 144 0 0 0 88.114254
3 1 22.324573 -22.324573 144 0 0 0 74.370153
4 2 14.322413 -14.322413 144 0 0 0 11.114425
There are 3 surfaces per element, and the elements are quadrilaterals, so they have 4 separate entries that need to be averaged in the file. I used frame.groupby(['ElementHID','ShellSurface']).mean() to accomplish this.
The new frame:
S1 S2 sx sy sz sxy
ElementHID ShellSurface
144 1 22.310153 -22.310153 0 0 0 21.445778
2 17.114552 -17.114552 0 0 0 11.114452
3 79.370153 -79.370153 0 0 0 19.311443
The problem is I need columns for the data in each surface (appending the surface number to the headers) with the elements as the row. So I need the column format below:
ElementHID S11 S12 S13 sx1 sx2 sx3 sy1 sy2 sy3 sxy1 sxy2 sxy3
The way I've previously done this is making an element class and giving it the attributes of my data columns for each surface, but it does not seem like the most efficient way to do it. I'm hoping there's a way to achieve this in Pandas.
I've also stacked the data, which gives me an easy way to loop through the data for each surface, but I'm still not sure how to use it to reshape the data.
Upvotes: 0
Views: 51
Reputation: 52276
Starting with your grouped data, with the index reset to columns.
df = frame.groupby(['ElementHID','ShellSurface']).mean().reset_index()
You can use pivot_table
to reshape the data. Index defines what variable defines the 'rows' of the table, and columns defines which variable(s) should be pivoted into columns.
In [233]: pivoted = df.pivot_table(index=['ElementHID'], columns=['ShellSurface'])
In [234]: pivoted
Out[234]:
S1 S2 sx sy sz ...
ShellSurface 1 2 1 2 1 2 1 2 1 2
ElementHID
144 62.864348 14.322413 -62.864348 -14.322413 0 0 0 0 0 0
The pivoted table will have a MultiIndex column, you can access groups relatively intuitively, for example:
In [235]: pivoted['S1'][1]
Out[235]:
ElementHID
144 62.864348
Name: 1, dtype: float64
Or, if you want to collapse down to a list of joined column names, you could do something like this.
In [239]: pivoted.columns = [''.join((lvl1, str(lvl2))) for lvl1,lvl2 in pivoted.columns]
In [240]: pivoted
Out[240]:
S11 S12 S21 S22 sx1 sx2 sy1 sy2 ...
ElementHID
144 62.864348 14.322413 -62.864348 -14.322413 0 0 0 0
Upvotes: 1