Reputation: 2328
I have a dataframe like this:
data = {'teamid': [1, 2, 3, 4], 'gameid': [1, 1, 2, 2], 'rebounds': [20, 35, 43, 15]}
game_df = pd.DataFrame(data=data)
print(game_df)
teamid gameid rebounds
0 1 1 20
1 2 1 35
2 3 2 43
3 4 2 15
I would like to join it to it self to produce a dataframe like this:
wanted_data = {'teamid': [1, 2, 3, 4], 'gameid': [1, 1, 2, 2], 'rebounds': [20, 35, 43, 15],
'teamid_opponent': [2, 1, 4, 3], 'rebound_opponent': [35, 20, 15, 43]}
wanted_df = pd.DataFrame(data=wanted_data)
print(wanted_df)
teamid gameid rebounds teamid_opponent rebound_opponent
0 1 1 20 2 35
1 2 1 35 1 20
2 3 2 43 4 15
3 4 2 15 3 43
In SQL I would just do something like this:
SELECT * from game_df df1 join game_df df2 on df1.gameid = df2.gameid and df1.teamid != df2.teamid
But i haven't been able to find anything in the pandas docs or on here for a way to replicate this in pandas itself. I looked on here and found this link but it isn't quite the same as what I'm trying to do. I've only found examples of trying to join where keys are equal.
Upvotes: 3
Views: 1265
Reputation: 2328
I will also add seeing the answer made me think of a different way to do this as well so I will post that for posterity
Yourdf=game_df.merge(game_df,on='gameid',suffixes =['','_opponent'])
Yourdf[Yourdf.teamid != Yourdf.teamid_opponent]
teamid gameid rebounds teamid_opponent rebounds_opponent
1 1 1 20 2 35
2 2 1 35 1 20
5 3 2 43 4 15
6 4 2 15 3 43
Upvotes: 1
Reputation: 323326
Here is one way use merge
Yourdf=game_df.merge(game_df,on='gameid',suffixes =['','_opponent']).query('teamid!=teamid_opponent')
Out[42]:
teamid gameid rebounds teamid_opponent rebounds_opponent
1 1 1 20 2 35
2 2 1 35 1 20
5 3 2 43 4 15
6 4 2 15 3 43
Upvotes: 5