Regenschein
Regenschein

Reputation: 1502

(possibly grouped) Row Values to Columns

Let's say, after some groupby operation I have a dataframe like this:

data = pd.DataFrame(columns=['Key', 'Subkey', 'Value'])
data.loc[0] = ['foo1', 'bar1', 20]
data.loc[1] = ['foo1', 'bar2', 10]
data.loc[2] = ['foo1', 'bar3', 5]
data.loc[3] = ['foo2', 'bar1', 50]
data.loc[4] = ['foo2', 'bar2', 100]
data.loc[5] = ['foo2', 'bar3', 50]

What I then have is a dataframe that looks like this:

|Key |Subkey | Value |
+----+-------+-------+
|foo1|bar1   |20     |
|foo1|bar2   |10     |
|foo1|bar3   |5      |
|foo2|bar1   |50     |
|foo2|bar2   |100    |
|foo2|bar3   |50     |

What I would like to have is a new dataframe where the subkey is a new column, containing the same value as in the grouped frame above, like:

|Key |bar1 |bar2  |bar3  |
+----+-----+------+------+
|foo1| 20  |  10  | 5    |
|foo2| 50  | 100  | 50   |

Is there a one-line solution to this, or do I need to transform the dataframe programmatically?

Upvotes: 1

Views: 48

Answers (1)

jezrael
jezrael

Reputation: 862511

You can use pivot:

print (data.pivot(index='Key', columns='Subkey', values='Value'))
Subkey  bar1   bar2  bar3
Key                      
foo1    20.0   10.0   5.0
foo2    50.0  100.0  50.0

Then you can cast float values to int, reset_index and remove column names Subkey:

print (data.pivot(index='Key', columns='Subkey', values='Value')
           .astype(int)
           .reset_index()
           .rename_axis(None, axis=1))

    Key  bar1  bar2  bar3
0  foo1    20    10     5
1  foo2    50   100    50

Upvotes: 2

Related Questions