jovicbg
jovicbg

Reputation: 1553

Replace values in column with column values from other dataframe in Python

I have two dataframes, one with more columns (but two of them are important) and the second with two columns. df1:

col_a  col_b
101     104
102     201
103     301
505     601

df2:

col_a_a col_b_b
420      637
425      643
201      701
601      702

I need the next thing. Lookup at df1['col_b'] and if exist in df2['col_a_a'] then replace it with df2['col_b_b'].

I have tried two ways:

df1['col_b'] = np.where(df1['col_b'] == df2['col_a_a'], df2['col_b_b'], df1['col_b'])

df1.col_b[df1.col_b == df2.col_a_a] = df2.col_b_b

But both of them return me the same error: ValueError: Can only compare identically-labeled Series objects

Desired output:

col_a  col_b
    101     104
    102     701
    103     301
    505     702

Upvotes: 2

Views: 620

Answers (2)

jpp
jpp

Reputation: 164693

Another way, using pd.Series.map:

df1['col_b'] = df1['col_b'].map(df2.set_index('col_a_a')['col_b_b']).fillna(df1['col_b'])

Depending on your data, this may be more efficient.

Upvotes: 1

jezrael
jezrael

Reputation: 862741

Use replace by Series created by set_index:

df1['col_b'] = df1['col_b'].replace(df2.set_index('col_a_a')['col_b_b'])
print (df1)
   col_a  col_b
0    101    104
1    102    701
2    103    301
3    505    702

Upvotes: 2

Related Questions