J_Heads
J_Heads

Reputation: 489

Reshape a column into multiple columns

Im working with a df that looks like this:

Year Age Income
1945  1   234
1945  1   65
1945  2   234
...
1946  1   34
...
2015  5   34

I'd like to transform it to something like this, where income values fill the df and age is the index.

Age 1955 1956 1957 ... 2015
 0   
 1
 2 
 3
...
99

Upvotes: 1

Views: 1154

Answers (2)

piRSquared
piRSquared

Reputation: 294546

If your ['Year', 'Age'] combinations were unique, you'd just need set_index and unstack

df.set_index(['Year', 'Age']).Income.unstack('Year')

But they are not!
So, you have to decide how to aggregate the groups of combinations.
Below is an example of using sum

df.groupby(['Year', 'Age']).Income.sum().unstack('Year', fill_value=0)

Year  1945  1946  2015
Age                   
1      299    34     0
2      234     0     0
5        0     0    34

other aggregations

df.groupby(['Year', 'Age']).Income.first().unstack('Year', fill_value=0)
df.groupby(['Year', 'Age']).Income.last().unstack('Year', fill_value=0)
df.groupby(['Year', 'Age']).Income.mean().unstack('Year', fill_value=0)

pivot_table
pivot_table <-> pivot is analogous to set_index <-> groupby
Meaning, pivot_table handles aggregation.

df.pivot_table('Income', 'Age', 'Year', 'sum', fill_value=0)
df.pivot_table('Income', 'Age', 'Year', 'first', fill_value=0)
df.pivot_table('Income', 'Age', 'Year', 'last', fill_value=0)
df.pivot_table('Income', 'Age', 'Year', 'mean', fill_value=0)

Upvotes: 3

Metropolis
Metropolis

Reputation: 2128

This looks like you'd like to pivot your table. Pandas documentation is quite good.

import pandas as pd
df = pd.DataFrame([[1945, 1, 234],[1946, 1, 65],[1945, 2, 234]])
df.columns = ['year', 'age', 'income']
df.pivot(index='age', columns='year', values='income')

Which looks like:

year  1945 1946
age
1     234  65
2     234  None

Upvotes: 0

Related Questions