Reputation: 1597
I have 2 pandas data frames.
dictionary1 = {'match_up' : ['1985_1116_1234' , '1985_1116_1475', '1985_1234_1172', '1985_1475_2132', '1985_1242_1325'], \
'result': [1, 1, 0, 0, 1], 'year':[1985,1985,1985,1985,1985] }
dictionary2 = {'team' : [1234 , 1475, 2132, 1172, 1242, 1116 , 1325], 'win_A_B': [0.667, 0.636, 0.621, 0.629, 0.615,0.943, 0.763], \
'year':[1985,1985,1985,1985,1985,1985,1985] }
df1 = pd.DataFrame(dictionary1)
df2 = pd.DataFrame(dictionary2)
df1:
match_up result year
0 1985_1116_1234 1 1985
1 1985_1116_1475 1 1985
2 1985_1234_1172 0 1985
3 1985_1475_2132 0 1985
4 1985_1242_1325 1 1985
df2:
team win_A_B year
1234 0.667 1985
1475 0.636 1985
2132 0.621 1985
1172 0.629 1985
1242 0.615 1985
1116 0.943 1985
1325 0.763 1985
The column value in data frame df1
is a matchup of the column team
in the data frame df2
. The column team
in df2
are all unique values.
I need to combine the above 2 data frames in the following manner:
match_up result year team_A team_B win_A win_B
0 1985_1116_1234 1 1985 1116 1234 0.943 0.667
1 1985_1116_1475 1 1985 1116 1475 0.943 0.636
2 1985_1234_1172 0 1985 1234 1172 0.667 0.629
3 1985_1475_2132 0 1985 1475 2132 0.636 0.621
4 1985_1242_1325 1 1985 1242 1325 0.615 0.763
I know I have already asked similar questions in pandas. I am new to pandas, so please bear with me if I ask some questions like this.
Upvotes: 4
Views: 94
Reputation: 9890
The following will work:
d_teams=pd.DataFrame( [[int(y) for y in x.split('_')[1:]] \
for x in df1.match_up], columns=('team_A', 'team_B') )
merged=pd.concat((df1,d_teams),axis=1)
df2i=df2.set_index('team')
merged['win_A']=df2i.ix[merged.team_A].reset_index().win_A_B
merged['win_B']=df2i.ix[merged.team_B].reset_index().win_A_B
First, we create d_teams
, which is a DataFrame that is composed of the match_up column, split by '_', and turned into ints. We throw away the year, because it's already included in df1, and just keep team_A and team_B. Then we create a merged dataframe by concatenating this with df1.
Next, we create df2i
, which is df2 indexed by team. Then we can just index with merged.team_A or merged.team_B to get the win values. However, we don't want the results to be indexed by those teams, so we reset the index first.
Upvotes: 2