Hassan Dbouk
Hassan Dbouk

Reputation: 99

Python: adding values of one Dataframe to some rows of another Dataframe

I have a dataframe df1 that holds rows for thousands of trades between a certain country and the United Kingdom. I also have another dataframe df2 that holds a few rows for distances between certain countries and the United Kingdom.

Example:

x1 = [{'partner': "Afghanistan", 'trade_value':100}, 
      {'partner':"Zambia",'trade_value':110}, 
      {'partner': "Germany",'trade_value':120},
      {'partner': "Afghanistan",'trade_value':150},
      {'partner': "USA",'trade_value':1120}]

df1 = pd.DataFrame(x1)

x2 = [{'country': "Afghanistan", 'distance':1000}, 
      {'country': "USA",'distance':1120}];

df2 = pd.DataFrame(x2)

I want to add a new column to df1 called 'distance' and assign each 'partner' in df1 with its appropriate 'distance' from df2.

Note: sometimes a 'partner' country in df1 is repeated due to multiple trades. Also not all distances are available in df2 so I don't mind leaving a cell in df1 empty.

so far I am trying this to no avail:

#Add new column
df1['distance'] = 0;

for index, row in df1.iterrows():
    for index, row2 in df2.iterrows():
        if row['partner'] == row2['country']:
            df1['distance'].replace(row['distance'], row2['distance'],inplace=True);

I am getting all the distance column filled with 1000, which is the made up distance between Afghanistan and the UK, but all other countries are getting the same value (when some countries should not even be getting any value at all)

Upvotes: 3

Views: 4051

Answers (1)

Abhi
Abhi

Reputation: 4233

IIUC you can use .map

x1 = [{'partner':"Afghanistan", 'trade_value':100}, {'partner':"Zambia",'trade_value':110},
      {'partner': "Germany",'trade_value':120},
      {'partner': "Afghanistan",'trade_value':150},{'partner': "USA",'trade_value':1120}]

df1 = pd.DataFrame(x1)

x2 = [{'country': "Afghanistan", 'distance':1000}, {'country': "USA",'distance':1120}]

df2 = pd.DataFrame(x2)

df1['distance'] = df1['partner'].map(df2.set_index('country')['distance'])

print (df1)

       partner  trade_value  distance
0  Afghanistan          100    1000.0
1       Zambia          110       NaN
2      Germany          120       NaN
3  Afghanistan          150    1000.0
4          USA         1120    1120.0

Upvotes: 5

Related Questions