Reputation: 71
I have a dataframe called Pop that looks like the following (Sorry, this doesn't format very well, but it's a single Index DataFrame (State is the index) with columns for each year. I only put a small sample of the data):
2007 2008 2009
State
Alabama 4637904 4677464 4708708
Is there a way to easily convert this to a MultiIndex DataFrame where the Year becomes a secondary index. The only item is a new field 'Population' that takes the data from the original dataFrame. It would ideally look something like:
Population
State Year
Alabama 2007 4637904
Alabama 2008 4677464
Alabama 2009 4708708
etc...
I've been looking for some kind of code since this seems like something that could be handled in Python. Thank you!
Upvotes: 1
Views: 361
Reputation: 2553
You can use melt to convert columns into rows :
In [10]: df
Out[10]:
state 2007 2008 2009
0 Alabama 4637904 4677464 4708708
1 Maine 1234567 1432765 1432567
2 Florida 19432765 19123456 19765654
In [13]: out = pd.melt(df, id_vars=['state'], var_name=['year'], value_name='population')
In [14]: out
Out[14]:
state year population
0 Alabama 2007 4637904
1 Maine 2007 1234567
2 Florida 2007 19432765
3 Alabama 2008 4677464
4 Maine 2008 1432765
5 Florida 2008 19123456
6 Alabama 2009 4708708
7 Maine 2009 1432567
8 Florida 2009 19765654
And finally setting the index :
In [15]: out.set_index(['state', 'year'])
Out[15]:
population
state year
Alabama 2007 4637904
Maine 2007 1234567
Florida 2007 19432765
Alabama 2008 4677464
Maine 2008 1432765
Florida 2008 19123456
Alabama 2009 4708708
Maine 2009 1432567
Florida 2009 19765654
PS : I used random values for the states population
Upvotes: 1