chitown88
chitown88

Reputation: 28565

python - melt / reshape using multiple columns

I've used melt to do this before, but only one column. How do you go about reshaping or melting on multiple columns? I'm thinking it's not necessarily reshape or melt, as I'm just duplicating a row, then switching values in h and v columns. My thought is to use df.iterrows() to just do this, but with a big dataset, thinking there's a better way to do it, just not sure how. I'm trying to go from:

EDIT: Theres MULTIPLE columns in between the ...

gid   h     seas    ...    v
1     ATL   2000    ...    SF
2     CLE   2000    ...    JAC
3     DAL   2000    ...    PHI
4     GB    2000    ...    NYJ
...   ...   ...     ...    ...
7000  GB    2018    ...    CHI
...   ...   ...     ...    ...

into this:

gid team    seas    ...    opp_team ...   home_away
1   ATL     2000    ...    SF       ...   Home
1   SF      2000    ...    ATL      ...   Away
2   CLE     2000    ...    JAC      ...   Home
2   JAC     2000    ...    CLE      ...   Away
3   DAL     2000    ...    PHI      ...   Home
3   PHI     2000    ...    DAL      ...   Away
4   GB      2000    ...    NYJ      ...   Home
4   NYJ     2000    ...    GB       ...   Away
... ...     ...     ...    ...      ...   ...
... ...     ...     ...    ...

Upvotes: 1

Views: 228

Answers (1)

jezrael
jezrael

Reputation: 862591

Use difference for all columns without h and v to parameter id_vars of melt:

cols = df.columns.difference(['h','v'])
df = df.melt(id_vars=cols, 
             value_vars=['h','v'], 
             var_name='home_away',
             value_name='team')

Then change ordering by sort_values and swap values between teams with selecting by swaping index values:

df = df.sort_values('gid').reset_index(drop=True)
idx = [i for y, x in zip(df.index[::2], df.index[1::2]) for i in (x, y)]
#alternative non loop solution
#idx = np.vstack([df.index[1::2], df.index[::2]]).T.ravel()

df['opp_team'] = df.loc[idx, 'team'].values
print (df)
   gid  seas home_away team opp_team
0    1  2000         h  ATL       SF
1    1  2000         v   SF      ATL
2    2  2000         h  CLE      JAC
3    2  2000         v  JAC      CLE
4    3  2000         h  DAL      PHI
5    3  2000         v  PHI      DAL
6    4  2000         h   GB      NYJ
7    4  2000         v  NYJ       GB

Upvotes: 2

Related Questions