Setting Values from one Dataframe to another

Question

I'm working with this pandas dataframe and would like to essentially set what is in the city column of the match dataframe to the ball_by_ball dataframe if the IDs of the match in the match dataframe is equal to the ID in the ball_by_ball dataframe.

Match:

Ball by ball for all matches:

Data can be accessed https://www.kaggle.com/datasets/patrickb1912/ipl-complete-dataset-20082020

Some information on the dataframe: the match dataframe contains the final results of any particular match ID, while ball by ball contains the information on said match ID.

I tried the following which does "work":

First creating an empty column by

city_ball["city"]=None #city_ball is the same as ball_by_ball

Followed by:

for id in range(len(match["id"])):
  for ID in range(len(city_ball["id"])):
    if match["id"][id] == city_ball["id"][ID]:
        city_ball["city"][ID] = match["city"][id]

But i understand this is a terrible way of implementing it, so I'm looking for suggestions to implement it efficiently.

Peter Leimbigler · Accepted Answer

This looks like a textbook use case for a join or merge, but the first thing I think of is to map each ID to its city using Series.map:

# Build a Series which maps each id to a corresponding city.
# Drop duplicates to ensure each distinct id only appears once
id2city = match.drop_duplicates(subset='id').set_index('id')['city']

# Apply the mapping to each id in the target DataFrame
city_ball['city'] = city_ball['id'].map(id2city)

Setting Values from one Dataframe to another

Answers (1)

Related Questions