dawson645
dawson645

Reputation: 43

Melt and pivot a dataframe in Python?

I'm working with a publicly available election data set that I've imported into Pandas as a df:

    fips_code   county              total_2008  dem_2008    gop_2008     oth_2008   total_2012  dem_2012    gop_2012    oth_2012    total_2016  dem_2016    gop_2016    oth_2016
0   26041       Delta County        19064       9974        8763        327         18043       8330        9533        180         18467       6431        11112       924
1   48295       Lipscomb County     1256        155         1093        8           1168        119         1044        5           1322        135         1159        28
2   1127        Walker County       28652       7420        20722       510         28497       6551        21633       313         29243       4486        24208       549

I would like to get to something like this:

fips_code   county          total   dem     gop     oth year
26041       Delta County    19064   9974    8763    327 2008
48295       Lipscomb County 1256    155     1093    8   2008
1127        Walker County   28652   7420    20722   510 2008

I've searched around and found something similar, Stack and Pivot Dataframe in Python, but I can't figure out how to apply that to my question.

I managed to do a melt of the df:

In [86]:
df_melt = pd.melt(df, id_vars=['fips_code', 'county'], value_name='num_votes')
df_melt.head()

Out [86]:
fips_code       county              variable    num_votes
0   26041       Delta County        total_2008  19064
1   48295       Lipscomb County     total_2008  1256
2   1127        Walker County       total_2008  28652
3   48389       Reeves County       total_2008  3077
4   56017       Hot Springs County  total_2008  2546

This is where I get stuck because I can't figure out if it's a multi-step process that starts with using the melt, or if there's a straightforward way to get there from the initial df I have. I'm not even sure what functions I should be using, but it looks like it involves pivot/stack/unstack? Any help is much appreciated.

Upvotes: 4

Views: 547

Answers (2)

sammywemmy
sammywemmy

Reputation: 28644

One option to abstract the reshaping process is with
pivot_longer from pyjanitor:

# pip install pyjanitor
import pandas as pd
import janitor

df.pivot_longer(
    index = ['fips_code', 'county'], 
    names_to = ('.value', 'year'), 
    names_sep = '_')

   fips_code           county  year  total   dem    gop  oth
0      26041     Delta County  2008  19064  9974   8763  327
1      48295  Lipscomb County  2008   1256   155   1093    8
2       1127    Walker County  2008  28652  7420  20722  510
3      26041     Delta County  2012  18043  8330   9533  180
4      48295  Lipscomb County  2012   1168   119   1044    5
5       1127    Walker County  2012  28497  6551  21633  313
6      26041     Delta County  2016  18467  6431  11112  924
7      48295  Lipscomb County  2016   1322   135   1159   28
8       1127    Walker County  2016  29243  4486  24208  549

No need to explicitly name the prefixes; the .value identifies the prefixes, pulls them out, and sets them as headers, while the rest are transposed into the year column. The dtype of the year column can also be changed to numeric efficiently, via the names_transform parameter:

df.pivot_longer(
    index = ['fips_code', 'county'], 
    names_to = ('.value', 'year'), 
    names_transform = {'year' : int},
    names_sep = '_')

Upvotes: 0

BENY
BENY

Reputation: 323226

This is wide to long question again .wide_to_long

pd.wide_to_long(df,['total','dem','gop','oth'],i=['fips_code','county'],j='Year',sep='_').reset_index()
Out[28]: 
   fips_code          county  Year  total   dem    gop  oth
0      26041     DeltaCounty  2008  19064  9974   8763  327
1      26041     DeltaCounty  2012  18043  8330   9533  180
2      26041     DeltaCounty  2016  18467  6431  11112  924
3      48295  LipscombCounty  2008   1256   155   1093    8
4      48295  LipscombCounty  2012   1168   119   1044    5
5      48295  LipscombCounty  2016   1322   135   1159   28
6       1127    WalkerCounty  2008  28652  7420  20722  510
7       1127    WalkerCounty  2012  28497  6551  21633  313
8       1127    WalkerCounty  2016  29243  4486  24208  549

Upvotes: 4

Related Questions