Reputation: 28565
I've used melt to do this before, but only one column. How do you go about reshaping or melting on multiple columns? I'm thinking it's not necessarily reshape or melt, as I'm just duplicating a row, then switching values in h
and v
columns. My thought is to use df.iterrows()
to just do this, but with a big dataset, thinking there's a better way to do it, just not sure how. I'm trying to go from:
EDIT: Theres MULTIPLE columns in between the ...
gid h seas ... v
1 ATL 2000 ... SF
2 CLE 2000 ... JAC
3 DAL 2000 ... PHI
4 GB 2000 ... NYJ
... ... ... ... ...
7000 GB 2018 ... CHI
... ... ... ... ...
into this:
gid team seas ... opp_team ... home_away
1 ATL 2000 ... SF ... Home
1 SF 2000 ... ATL ... Away
2 CLE 2000 ... JAC ... Home
2 JAC 2000 ... CLE ... Away
3 DAL 2000 ... PHI ... Home
3 PHI 2000 ... DAL ... Away
4 GB 2000 ... NYJ ... Home
4 NYJ 2000 ... GB ... Away
... ... ... ... ... ... ...
... ... ... ... ...
Upvotes: 1
Views: 228
Reputation: 862591
Use difference
for all columns without h
and v
to parameter id_vars
of melt
:
cols = df.columns.difference(['h','v'])
df = df.melt(id_vars=cols,
value_vars=['h','v'],
var_name='home_away',
value_name='team')
Then change ordering by sort_values
and swap values between teams with selecting by swaping index values:
df = df.sort_values('gid').reset_index(drop=True)
idx = [i for y, x in zip(df.index[::2], df.index[1::2]) for i in (x, y)]
#alternative non loop solution
#idx = np.vstack([df.index[1::2], df.index[::2]]).T.ravel()
df['opp_team'] = df.loc[idx, 'team'].values
print (df)
gid seas home_away team opp_team
0 1 2000 h ATL SF
1 1 2000 v SF ATL
2 2 2000 h CLE JAC
3 2 2000 v JAC CLE
4 3 2000 h DAL PHI
5 3 2000 v PHI DAL
6 4 2000 h GB NYJ
7 4 2000 v NYJ GB
Upvotes: 2