Reputation: 1180
Here is the problem I am having.
I want to map team codes (contained in main df) with team names contained in dfTeam
.
Here is a sample of main df:
Game Code Play Number Period Number Clock Offense Team Code
0 690002820050901 1 1 900.0 690
1 690002820050901 2 1 NaN 28
2 690002820050901 3 1 NaN 28
3 690002820050901 4 1 NaN 28
4 690002820050901 5 1 NaN 28
5 690002820050901 6 1 NaN 28
6 690002820050901 7 1 826.0 690
7 690002820050901 8 1 NaN 690
8 690002820050901 9 1 NaN 690
I want to change the columns like Offense Team Code
with the team names using this dataframe, named dfTeam
.
Team Code Name Conference Code
0 5 Akron 875
1 8 Alabama 911
2 9 UAB 24312
3 28 Arizona State 905
4 29 Arizona 905
Here is what I currently do, to merge the dataframes, change the column name and delete the columns I don't need:
teamDict = {'Home Team Code':'homeTeamName','Visit Team Code':'visitTeamName','Offense Team Code':'offenseTeamName','Defense Team Code':'defenseTeamName'}
for oldName,newName in teamDict.items():
dfFULL = pd.merge(dfFULL,dfTeam,how='left',left_on=oldName,right_on='Team Code')
dfFULL.rename(columns={'Name':newName},inplace=True)
dfFULL.drop(['Conference Code','Team Code',oldName],axis=1,inplace=True)
This code works, but it is quite slow. My main dataframe has 130k or so rows. Is there a more efficient way to do it ?
Upvotes: 3
Views: 85
Reputation: 59549
You basically want to use dfTeam
as a dictionary to map Team Codes
to Names
. For each column, you can just perform the mapping.
d = dfTeam.set_index('Team Code').Name
for oldName,newName in teamDict.items():
dfFULL[newName] = dfFULL[oldName].map(d)
# Then get rid of all old names
dfFull = dfFull.drop(columns=list(teamDict.keys()))
Upvotes: 2